Section Translation is an extension of the Content Translation tool. It lets users expand existing Wikipedia articles by translating new sections. Section Translation is also designed to work on mobile devices as well as desktop, which makes translating on mobile possible, something Content Translation did not previously support.
+
The content translation events capture various aspects of user interactions with the Content Translation and Section Translation tools. This analysis is a first iteration of visualizing how users arrive through the various entry points, how they flow through the tool, and how many reach the next stages that are currently instrumented.
+
The 90 days of data preceding 2023-12-31 were reviewed.
+
+
+
+
+
+
+
+
+Overall Summary
+
+
+
+
+
+
+
+
+
+Frequency of Entry Points Usage (by Edit Count Bucket of Users)
+
+
+
+
+
+
+
85% of newcomers opened the translation dashboard from the frequent languages selector, which surfaces the languages an article is missing in so that it can be translated.
+
As users gain editing experience, they increasingly reach the dashboard through the content language selector, where they can search for a missing language to translate an article into.
+
Experienced users are also more likely than newcomers to open the dashboard directly.
+
For users with 1000+ edits, the frequent languages selector accounts for only 40% of navigations to the dashboard.
+
+
+
+
+
+
+
+
+
+
+Frequency of Entry Points Usage (by comparative size of target language Wikipedia)
+
+
+
+
+
+
+
On larger Wikipedias, the frequent languages selector was the most used entry point to the translation dashboard.
+
+
Among the top 20 Wikipedias, it was used 85% of the time to access the dashboard.
+
+
On smaller Wikipedias, the frequent languages selector remains the most used entry point, but the content language selector is used more than on larger Wikipedias.
+
This is related to the observations by user edit bucket: larger Wikipedias tend to have more newcomers than smaller Wikipedias.
+
+
+
+
+
+
+
+
+
+
+Translation Start Screen
+
+
+
+
+
+
+
Only 7% of the dashboard_translation_start events occurred independently of dashboard_open events.
+
+
This indicates that most users start a translation having already selected an article/section to translate from an external entry point.
+
+
Among users who initiated dashboard_translation_start independently:
+
+
The majority of newcomers start a translation by accepting a suggestion provided by the API in the absence of a seed article.
+
The majority of experienced users start a translation by choosing a search result, followed by accepting a translation suggested because it is related to one of their recent edits.
In most cases (77%), those who opened the dashboard transitioned to the translation start screen.
+
+
This is because for users navigating to the dashboard from an external entry point, both events occur consecutively.
+
13% ended the session and 8% refreshed the dashboard or came back to it later before the session expired.
+
+
Among users who proceeded to the start screen, they progressed to the editor and made an edit in only 15% of the cases.
+
+
In 46% of the cases, users went back to the main dashboard, and 30% ended the session.
+
Because most of the events were generated by users with 0 edits (newcomers), these rates are largely driven by newcomer behavior.
+
+
Among users who made at least one edit, in 80% of the cases, they continued to make additional edits, while 9% went back to the main dashboard, and the rest ended the session.
Across all edit count buckets, most of the users (>70%) who opened the dashboard proceeded to the translation start screen.
+
+
The percentage is higher for newcomers than for experienced users. Most newcomers reach the dashboard through external entry points rather than opening it directly, in which case dashboard_open and dashboard_translation_start are triggered consecutively (with no user action in between). Among experienced users, more open the dashboard directly and then click to proceed to the translation start screen.
+
+
Among users who reached the translation start screen
+
+
Newcomers tend to end/abandon the session or return to the main dashboard
+
Newcomers continued to make an edit from this stage in only 12% of the cases, whereas users with 1000+ edits did so in 32% of the cases.
+
+
The more editing experience users have, the more likely they are to continue to make an edit.
+
+
+
Among users who made at least one edit, those with more editing experience are more likely to make additional edits to the machine-translated content, and less likely to end the session or return to the dashboard.
The transition rates between funnel stages for each entry point are highly correlated with how much each user experience level uses that entry point.
+
+
+
Among users who navigated through the frequent languages menu, which was most frequently used by newcomers:
+
+
In 82% of the cases, they proceeded to the translation start screen.
+
From the translation start screen, users made at least one edit in 13% of the cases.
+
+
Among users who navigated through the content language selector, which was frequently used by both newcomers and experienced users:
+
+
In 75% of the cases, they proceeded to the translation start screen.
+
From the translation start screen, users made at least one edit in 23% of the cases.
+
+
Among users who opened the dashboard directly, which was most common among experienced users:
+
+
In 36% of the cases, they proceeded to the translation start screen.
+
From the translation start screen, users made at least one edit in 34% of the cases.
+
+
Among users who navigated from an invitation shown on a non-existent page, which was frequently accessed by both newcomers and experienced users alike:
+
+
In 80% of the cases, they proceeded to the translation start screen.
+
From the translation start screen, users made at least one edit in 14% of the cases.
+
+
Among users who opened the dashboard directly with a link to a specific translation, which was most common among experienced users:
+
+
In 87% of the cases, they proceeded to the translation start screen.
+
From the translation start screen, users made at least one edit in 11% of the cases.
+
+
Among users who navigated from the contributions page, which was frequently used by both newcomers and experienced users:
+
+
In 39% of the cases, they proceeded to the translation start screen.
+
From the translation start screen, users made at least one edit in 7% of the cases.
+
+
Among users who navigated from the notice on recently translated articles to review/expand the translation, which was most common among experienced users:
+
+
In 88% of the cases, they proceeded to the translation start screen.
+
From the translation start screen, users made at least one edit in 31% of the cases.
+
+
+
+
+
+
+
+
+
+
+
+Flow of Users (other events)
+
+
+
+
+
+
+
In cases where users discarded a suggested translation (677 occurrences), in 80% of the cases they also discarded the next translation shown, and in 10% they proceeded to the translation start screen.
+
In cases where users requested that the list of suggestions be regenerated (110 occurrences), in 33% of the cases they refreshed the suggestions again, and 20% proceeded to the translation start screen.
+
In cases where users initiated a search (958 occurrences), in 82% of the cases they proceeded to the translation start screen, and 12% returned to the dashboard.
+
In cases where users selected an in-progress translation (440 occurrences), in 67% of the cases they returned to the dashboard, and 16% made an edit to the translation.
+
In cases where users discarded an in-progress translation (132 occurrences), in 58% of the cases they discarded additional in-progress translations, and 13% initiated a search.
+
+
+
+
+
+
+
+
Data Gathering
+
+
Setup
+
+
+Code
+
import wmfdata as wmf
import pandas as pd
from datetime import datetime, timedelta
import great_tables as gt

import plotly.express as px
import plotly.graph_objects as go
import plotly.subplots as sp
from plotly.offline import download_plotlyjs, init_notebook_mode, iplot

from IPython.display import display_html, display, HTML, clear_output, Markdown

import warnings
+
+
+
+
+Code
+
init_notebook_mode(connected=True)

pd.options.display.max_columns = None
pd.options.display.max_rows = 250

# width for charts
iplot_width = 950
max_width = 1250

# always show options bar
iplot_config = {'displayModeBar': True}

# prints a string at the center of the output, bold if needed
def pr_centered(content, bold=False):
    if bold:
        content = f"<b>{content}</b>"

    centered_html = f"<div style='text-align:center'>{content}</div>"

    display(HTML(centered_html))
+
+
+
+
+Code
+
edit_buckets_across_all_sessions = (
    all_events[['content_translation_session_id', 'user_global_edit_count_bucket']]
    .user_global_edit_count_bucket
    .value_counts(normalize=True)
    .reset_index()
    .rename({
        'user_global_edit_count_bucket': 'Edit Bucket',
        'proportion': 'Percentage of events'
    }, axis=1)
    .sort_values('Percentage of events', ascending=False, ignore_index=True)
)

# format the proportions as percentages for display
edit_buckets_across_all_sessions['Percentage of events'] = (
    edit_buckets_across_all_sessions['Percentage of events'].apply(lambda x: f"{x:.2%}")
)
pr_centered('Distribution of edit buckets across all sessions', True)
edit_buckets_across_all_sessions
+
+
+
Distribution of edit buckets across all sessions

Edit Bucket      Percentage of events
0 edits          53.93%
1000+ edits      19.20%
5-99 edits       10.76%
100-999 edits    9.87%
1-4 edits        6.24%
+
+
+
+
+
+
+
+
+
+
Data Cleaning
+
During the analysis, several issues with the recorded events were identified. The most significant concerns content_translation_session_position: multiple events, of both the same and different event types, share the same session position even though they occurred at different times. It is currently unclear whether the session position was recorded incorrectly, in which case it could be reconstructed from the timestamps, or whether these are duplicate events. More information, and the task to investigate these issues, is at T353882. For this analysis, all sessions with potentially erroneous events are excluded.
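If the positions turn out to be recorded incorrectly rather than duplicated, rebuilding them from the timestamps could look roughly like the plain-Python sketch below. This is only an illustration and not the method used in this analysis. It assumes each event is a dict carrying the `content_translation_session_id`, `ts`, and `content_translation_session_position` fields and that `ts` values sort chronologically; the helper name `reconstruct_positions` and the toy data are hypothetical.

```python
from collections import defaultdict


def reconstruct_positions(events):
    """Rebuild each event's session position from timestamp order (hypothetical helper)."""
    by_session = defaultdict(list)
    for event in events:
        by_session[event["content_translation_session_id"]].append(event)

    rebuilt = []
    for session_events in by_session.values():
        # sorted() is stable, so events with equal timestamps keep their input order
        ordered = sorted(session_events, key=lambda event: event["ts"])
        for position, event in enumerate(ordered):
            # copy the event so the input records are left untouched
            rebuilt.append({**event, "content_translation_session_position": position})
    return rebuilt


# toy session in which two events were recorded with the same position (1)
sample = [
    {"content_translation_session_id": "a", "ts": "2023-12-01T10:00:03",
     "event_type": "editor_segment_add", "content_translation_session_position": 1},
    {"content_translation_session_id": "a", "ts": "2023-12-01T10:00:01",
     "event_type": "dashboard_open", "content_translation_session_position": 0},
    {"content_translation_session_id": "a", "ts": "2023-12-01T10:00:02",
     "event_type": "dashboard_translation_start", "content_translation_session_position": 1},
]

rebuilt = reconstruct_positions(sample)
for event in rebuilt:
    print(event["content_translation_session_position"], event["event_type"])
```

Events that share an identical timestamp cannot be ordered this way, so true duplicates would still need to be handled separately.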
+
+
+Code
+
temporal_columns = ['ts', 'dt', 'hour']

# sessions with duplicate events, except for the temporal columns
sessions_with_duplicate_events = (
    all_events[[col for col in all_events.columns.tolist() if col not in temporal_columns]]
    .value_counts()
    .reset_index()
    .rename({0: 'count'}, axis=1)
    .query("""count > 1""")
    .content_translation_session_id
    .unique()
    .tolist()
)

# different event types within a session sharing the same session position,
# although the events occurred at different times
session_event_counts = (
    all_events.groupby(['content_translation_session_id', 'content_translation_session_position'])
    .agg(distinct_events=('event_type', pd.Series.nunique))
)

sessions_with_same_position_events = (
    session_event_counts.query("""distinct_events > 1""")
    .reset_index()
    .content_translation_session_id
    .unique()
    .tolist()
)

# sessions where multiple global edit count buckets were recorded
sessions_with_multiple_edit_counts = (
    all_events.groupby('content_translation_session_id')['user_global_edit_count_bucket']
    .nunique()
    .reset_index()
    .query("""user_global_edit_count_bucket > 1""")
    .content_translation_session_id
    .unique()
    .tolist()
)

# sessions with no dashboard open at start
sessions_with_no_dopen_start = (
    all_events.query("""(content_translation_session_position == 0) & (event_type != 'dashboard_open')""")
    .content_translation_session_id
    .unique()
    .tolist()
)

# sessions with at least one dashboard open
sessions_with_dopen = (
    all_events
    .query("""event_type == 'dashboard_open'""")['content_translation_session_id']
    .unique()
    .tolist()
)

# sessions without dashboard open
# (in DataFrame.query, comparing a column to a list with != acts as "not in")
sessions_without_dopen = (
    all_events
    .query("""content_translation_session_id != @sessions_with_dopen""")['content_translation_session_id']
    .unique()
    .tolist()
)

# sessions with multiple events of the same type having the same session position
duplicate_events_with_same_position = (
    all_events[['content_translation_session_id', 'content_translation_session_position', 'event_type']]
    .value_counts()
    .reset_index()
    .rename({0: 'count'}, axis=1)
    .query("""count > 1""")
    .content_translation_session_id
    .unique()
    .tolist()
)
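
The step that builds the cleaned `events` data from these lists is not shown here. As a rough illustration only, the exclusion could look like the plain-Python sketch below, which drops every event that belongs to a flagged session. Which of the lists above were actually used for the exclusion is an assumption, and `drop_flagged_sessions` and the toy data are hypothetical.

```python
def drop_flagged_sessions(all_events, *flagged_session_lists):
    """Keep only events whose session appears in none of the flagged lists (hypothetical helper)."""
    # merge all flagged lists into one set for fast membership checks
    flagged = set().union(*flagged_session_lists)
    return [
        event for event in all_events
        if event["content_translation_session_id"] not in flagged
    ]


toy_events = [
    {"content_translation_session_id": "s1", "event_type": "dashboard_open"},
    {"content_translation_session_id": "s1", "event_type": "dashboard_translation_start"},
    {"content_translation_session_id": "s2", "event_type": "dashboard_open"},
    {"content_translation_session_id": "s3", "event_type": "editor_segment_add"},
]

# a session may be flagged by more than one check; the set removes the overlap
valid_events = drop_flagged_sessions(toy_events, ["s2"], ["s3", "s2"])
print(len(valid_events), "events kept")
```

With pandas, the same filter would typically be a boolean mask built with `Series.isin` on the session ID column and negated with `~`.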
n_all_sessions = all_events.content_translation_session_id.nunique()
n_all_events = all_events.shape[0]

# `events` holds the events from valid sessions only (its construction is not shown in this section)
n_valid_sessions = events.content_translation_session_id.nunique()
n_events_from_valid_sessions = events.shape[0]

pct_invalid_sessions = 100 - round(n_valid_sessions / n_all_sessions * 100, 2)

print(f'- all sessions: {n_all_sessions}; all events: {n_all_events}')
print(f'- valid sessions: {n_valid_sessions}; events from valid sessions: {n_events_from_valid_sessions}')
print(f'- percentage of sessions with potentially erroneous events: {pct_invalid_sessions}%')
+
+
+
- all sessions: 29365; all events: 277212
+- valid sessions: 15143; events from valid sessions: 87874
+- percentage of sessions with potentially erroneous events: 48.43%
+
+
+
+
+
+
Analysis: Entry Points & Sources
+
+
Dashboard Open
+
As the goal is to understand how users reach the translation dashboard, this part of the analysis only includes events where users navigate to the main dashboard from an external source. For example, after adding a segment, a user can return to the dashboard for another translation; these events are currently recorded as direct access (T353799), so such dashboard_open events are excluded from this part.
# only display annotations if the entry point accounts for more than 5%
entry_by_edit_bucket['percent_annot'] = (
    entry_by_edit_bucket['percent']
    .apply(lambda x: f"{x:.0%}" if x > 0.05 else None)
)

# bar graph
fig = px.bar(
    entry_by_edit_bucket,
    x='percent',
    y='edit_bucket',
    color='source',
    labels={
        'percent': '% of Total Events',
        'edit_bucket': 'Edit Bucket',
        'source': 'Entry Points'
    },
    color_discrete_sequence=px.colors.qualitative.T10,
    title='Usage of Entry Points by User Global Edit Bucket',
    text='percent_annot',
    # display in increasing edit bucket order
    category_orders={
        'edit_bucket': edit_buckets,
        'source': entry_points_freq.entry_point.values.tolist()
    }
)

# 'relative' stacks the bars
fig.update_layout(barmode='relative', height=550, width=max_width)
fig.update_xaxes(tickformat='.0%')
fig = fig.update_traces(
    textfont_color='white',
    hovertemplate="<br>".join([
        "Edit Bucket: %{y}",
        "Percent of Total Events: %{x:.0%}"
    ])
)

iplot(fig, config=iplot_config)
+
+
+
+
+
+
+
+
+
+
+
+
+Summary
+
+
+
+
+
85% of newcomers opened the translation dashboard from the frequent languages selector, which surfaces the languages an article is missing in so that it can be translated.
+
As users gain editing experience, they increasingly reach the dashboard through the content language selector, where they can search for a missing language to translate an article into.
+
Experienced users are also more likely than newcomers to open the dashboard directly.
+
For users with 1000+ edits, the frequent languages selector accounts for only 40% of navigations to the dashboard.
# only display annotations if the entry point accounts for more than 5%
entry_by_target_wp_size['percent_annot'] = (
    entry_by_target_wp_size['percent']
    .apply(lambda x: f"{x:.0%}" if x > 0.05 else None)
)

# bar graph
fig = px.bar(
    entry_by_target_wp_size.query("""target_wp_rank != '1-5'"""),
    x='percent',
    y='target_wp_rank',
    color='source',
    labels={
        'percent': '% of Total Events',
        'target_wp_rank': 'Target Language WP Size',
        'source': 'Entry Points'
    },
    color_discrete_sequence=px.colors.qualitative.T10,
    title='Usage of Entry Points by Comparative Wikipedia Size (of the Target Language)',
    text='percent_annot',
    category_orders={
        'target_wp_rank': [i for i in rank_bin_labels if i != '1-5'],
        'source': entry_points_freq.entry_point.values.tolist()
    }
)

# stack the bars
fig.update_layout(barmode='relative', height=550, width=max_width)
fig.update_xaxes(tickformat='.0%')
fig.update_traces(textfont_color='white')

iplot(fig, config=iplot_config)
+
+
+
+
+
+
+
+
+
+
+
+
+Summary
+
+
+
+
+
On larger Wikipedias, the frequent languages selector was the most used entry point to the translation dashboard.
+
+
Among the top 20 Wikipedias, it was used 85% of the time to access the dashboard.
+
+
On smaller Wikipedias, the frequent languages selector remains the most used entry point, but the content language selector is used more than on larger Wikipedias.
+
This is related to the observations by user edit bucket: larger Wikipedias tend to have more newcomers than smaller Wikipedias.
+
+
+
+
+
+
+
Translation Start
+
+
The next step after opening the translation dashboard is the translation start page, which appears after a user confirms their choice of article/section to translate. This step occurs before the translation editing screen. In this section, various sources through which users reach the translation start page have been analyzed. This step can take place in two scenarios:
+
+
When users from an external source navigate to the dashboard (i.e. entry points such as frequent languages and content language selector), the opening of the translation dashboard is immediately followed by the translation start screen. In such cases, the event_source for dashboard_translation_start will be the same as the source for dashboard_open. For example, if a user clicks on a link from the frequent languages selector, dashboard_open and dashboard_translation_start events are consecutively triggered, with both having event source as frequent_languages. This is because the selection of the article/section has already happened.
+
When users reach the main dashboard either by directly opening, or returning after editing/completing a translation, there are multiple ways users are shown suggestions, and upon selection, sources specific to dashboard_translation_start get logged.
+
+
For this section, only events generated in the second scenario are considered, because events in the first scenario simply inherit the source of dashboard_open.
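
One way to separate the two scenarios is to check whether a dashboard_translation_start event directly follows a dashboard_open event with the same source. The plain-Python sketch below illustrates that rule under stated assumptions: the rule itself is a guess at how the split could be made, not the notebook's actual logic, the helper name `independent_translation_starts` is hypothetical, and the source names other than `frequent_languages` are invented for the example.

```python
def independent_translation_starts(session_events):
    """Return translation-start events not triggered together with a dashboard_open.

    `session_events` is a list of event dicts for one session, ordered by position.
    """
    independent = []
    previous = None
    for event in session_events:
        if event["event_type"] == "dashboard_translation_start":
            # scenario 1: the start immediately follows an open with the same source
            follows_open = (
                previous is not None
                and previous["event_type"] == "dashboard_open"
                and previous["event_source"] == event["event_source"]
            )
            if not follows_open:
                independent.append(event)
        previous = event
    return independent


session = [
    {"event_type": "dashboard_open", "event_source": "frequent_languages"},
    {"event_type": "dashboard_translation_start", "event_source": "frequent_languages"},
    {"event_type": "editor_segment_add", "event_source": "frequent_languages"},
    {"event_type": "dashboard_open", "event_source": "direct"},                          # hypothetical source name
    {"event_type": "dashboard_translation_start", "event_source": "suggestion_no_seed"},  # hypothetical source name
]

independent = independent_translation_starts(session)
print([event["event_source"] for event in independent])
```

In the toy session, the first translation start shares its source with the preceding dashboard_open and is treated as scenario 1, while the second has its own source and is kept.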
Only 7% of the dashboard_translation_start events occurred independently of dashboard_open events.
+
+
This indicates that most users start a translation having already selected an article/section to translate from an external entry point.
+
+
Among users who initiated dashboard_translation_start independently:
+
+
The majority of newcomers start a translation by accepting a suggestion provided by the API in the absence of a seed article.
+
The majority of experienced users start a translation by choosing a search result, followed by accepting a translation suggested because it is related to one of their recent edits.
+
+
+
+
+
+
+
+
Analysis: User Flows (Funnel)
+
+
+
For the majority of the funnel analysis, we will be looking at three main event types, which account for more than 97% of the events:
+
+
dashboard_open: user opens the translation dashboard
+
dashboard_translation_start: proceeding from the dashboard to the start screen
+
editor_segment_add: user adds a segment of content to the translated version in the editor
+
+
Several other events are instrumented (mostly related to how users interact with the suggestions), but they account for less than 3% of the events. Including them in the main analysis adds a lot of noise and makes it hard to derive insights. A section at the end covers interactions with those events.
+
+
+
+
+Code
+
# main events list
main_events = ['dashboard_open', 'dashboard_translation_start', 'editor_segment_add']

# function to plot the funnel of user flows
# by default, returns a Plotly Sankey plot for a given dataframe
# https://plotly.github.io/plotly.py-docs/generated/plotly.graph_objects.Sankey.html
# optional: add a table with the distribution of edit buckets that triggered the events
# optional: return a dataframe with the transition data instead of the plot
def plot_funnel(df,
                return_transition_data=False,
                chart_title=None,
                events_scope=main_events,
                incl_session_end=True,
                incl_edit_bucket_table=False,
                font_size=12,
                width=iplot_width,
                height=iplot_width/2.25):

    warnings.filterwarnings('ignore')

    df = df.query("""event_type == @events_scope""")
    df = df.sort_values(by=['content_translation_session_id', 'content_translation_session_position'])

    # next event in order within a session
    df['next_event_type'] = df.groupby('content_translation_session_id')['event_type'].shift(-1)

    # consider the session ended if there is no next event
    if incl_session_end:
        df['next_event_type'] = df['next_event_type'].fillna('session end')
    else:
        df = df.dropna(subset=['next_event_type'])

    # count transitions and express each as a percentage of its source event type
    transition_counts = df.groupby(['event_type', 'next_event_type']).size().reset_index(name='count')
    total_transitions_by_source = transition_counts.groupby('event_type')['count'].sum()
    transition_counts['total_by_source'] = transition_counts['event_type'].map(total_transitions_by_source)
    transition_counts['percentage'] = (transition_counts['count'] / transition_counts['total_by_source']) * 100

    if return_transition_data:
        return transition_counts

    # add a second subplot for the table, if needed
    if incl_edit_bucket_table:
        fig = sp.make_subplots(rows=1, cols=2, column_widths=[0.7, 0.3],
                               specs=[[{"type": "sankey"}, {"type": "table"}]])
    else:
        fig = sp.make_subplots(rows=1, cols=1,
                               specs=[[{"type": "sankey"}]])

    # map each event type to a node index for the Sankey diagram
    all_event_types = pd.concat([transition_counts['event_type'], transition_counts['next_event_type']]).unique()
    label_mapping = {label: i for i, label in enumerate(all_event_types)}

    sources = transition_counts['event_type'].map(label_mapping)
    targets = transition_counts['next_event_type'].map(label_mapping)
    weights = transition_counts['count']

    sankey = go.Sankey(
        node=dict(
            pad=15,
            thickness=20,
            line=dict(color="black", width=0.5),
            label=[label if label != 'session end' else '<i>session end</i>' for label in all_event_types]
        ),
        link=dict(
            source=sources,
            target=targets,
            value=weights,
            hovertemplate='Events: %{value}<br />' +
                          'Percentage: %{customdata:.2f}%<extra></extra>',
            customdata=transition_counts['percentage']
        )
    )

    fig.add_trace(sankey, row=1, col=1)

    if incl_edit_bucket_table:
        agg_events_by_bucket = (
            df
            .user_global_edit_count_bucket
            .value_counts()
            .reset_index()
            .rename({
                'user_global_edit_count_bucket': 'Edit Bucket',
                'count': '# Events'
            }, axis=1)
            .sort_values('Edit Bucket')
        )

        agg_events_by_bucket['% of Events'] = (
            agg_events_by_bucket['# Events'] / agg_events_by_bucket['# Events'].sum()
        ).apply(lambda x: f"{x:.0%}")

        table = go.Table(
            columnwidth=[4, 3, 4],
            header=dict(values=list(agg_events_by_bucket.columns),
                        align='left'),
            cells=dict(values=[
                agg_events_by_bucket['Edit Bucket'],
                agg_events_by_bucket['# Events'],
                agg_events_by_bucket['% of Events']],
                align='left',
                height=25)
        )

        fig.add_trace(table, row=1, col=2)

    fig.update_layout(title_text=chart_title, font_size=font_size, height=height, width=width)
    return fig
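
To make the transition percentages behind the funnel easier to follow, here is a small plain-Python sketch of the same idea on toy data: each event is paired with the next event in its session, a missing next event is counted as "session end", and every count is divided by the total for its source event type. The helper name `transition_percentages` and the toy sessions are hypothetical; the figures it prints are not results from this analysis.

```python
from collections import Counter


def transition_percentages(sessions, include_session_end=True):
    """Percentage of each (event, next event) transition, per source event type.

    `sessions` maps a session ID to its event types, ordered by session position.
    """
    counts = Counter()
    for ordered_events in sessions.values():
        for i, current in enumerate(ordered_events):
            if i + 1 < len(ordered_events):
                counts[(current, ordered_events[i + 1])] += 1
            elif include_session_end:
                # no next event: treat the session as ended here
                counts[(current, "session end")] += 1

    # total outgoing transitions per source event type
    totals = Counter()
    for (current, _next), n in counts.items():
        totals[current] += n

    return {pair: 100 * n / totals[pair[0]] for pair, n in counts.items()}


toy_sessions = {
    "s1": ["dashboard_open", "dashboard_translation_start", "editor_segment_add"],
    "s2": ["dashboard_open", "dashboard_translation_start"],
    "s3": ["dashboard_open"],
    "s4": ["dashboard_open", "dashboard_translation_start", "dashboard_open"],
}

rates = transition_percentages(toy_sessions)
for (current, nxt), pct in sorted(rates.items()):
    print(f"{current} -> {nxt}: {pct:.0f}%")
```

Because the percentages are computed per transition rather than per user, a session that returns to the dashboard contributes more than one dashboard_open to the totals, which is why the summaries speak of "cases" rather than users.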
+
+
+
+
+Code
+
iplot(
    plot_funnel(
        events,
        chart_title='Flow of Users Through CX Workflows & Number of Events Generated by Edit Bucket',
        incl_edit_bucket_table=True,
        width=max_width,
        height=max_width/2.25),
    config=iplot_config
)
In most cases (77%), those who opened the dashboard transitioned to the translation start screen.
+
+
This is because for users navigating to the dashboard from an external entry point, both events occur consecutively.
+
13% ended the session and 8% refreshed the dashboard or came back to it later before the session expired.
+
+
Among users who proceeded to the start screen, they progressed to the editor and made an edit in only 15% of the cases.
+
+
In 46% of the cases, users went back to the main dashboard, and 30% ended the session.
+
Because most of the events were generated by users with 0 edits (newcomers), these rates are largely driven by newcomer behavior.
+
+
Among users who made at least one edit, in 80% of the cases, they continued to make additional edits, while 9% went back to the main dashboard, and the rest ended the session.
+
+
+
+
+
By Edit Bucket
+
+
+Code
+
n_events = events.query("""(user_global_edit_count_bucket == '0 edits') & (event_type == @main_events)""").shape[0]
iplot(
    plot_funnel(
        events.query("""user_global_edit_count_bucket == '0 edits'"""),
        chart_title=f'Flow of Users Through CX Workflows Having 0 Global Edits ({n_events} events)'),
    config=iplot_config)
+
+
+
+
+
+
+
+
+
+
+
+
+Summary: Users with 0 Global Edits
+
+
+
+
+
+
Among the users who opened the dashboard:
+
+
in 80% of the cases, they proceeded to translation start screen.
+
in 12% of the cases, they ended the session.
+
in 8% of the cases, they refreshed the dashboard or came back to it later before the session expired.
+
+
Among the users who reach the translation start screen:
+
+
in 12% of the cases, they transitioned to the editor and made an edit.
+
in 42% of the cases, they went back to the main dashboard.
+
in 35% of the cases, they ended the session.
+
+
Among users who made at least one edit:
+
+
in 69% of the cases, they continued to make additional edits.
+
in 11% of the cases, they went back to the main dashboard.
+
in 11% of the cases, they ended the session.
+
+
+
+
+
+
+Code
+
n_events = events.query("""(user_global_edit_count_bucket == '1-4 edits') & (event_type == @main_events)""").shape[0]
iplot(
    plot_funnel(
        events.query("""user_global_edit_count_bucket == '1-4 edits'"""),
        chart_title=f'Flow of Users Through CX Workflows Having 1-4 Global Edits ({n_events} events)'),
    config=iplot_config)
+
+
+
+
+
+
+
+
+
+
+
+
+Summary: Users with 1-4 Global Edits
+
+
+
+
+
+
Among the users who opened the dashboard:
+
+
in 82% of the cases, they proceeded to translation start screen.
+
in 12% of the cases, they ended the session.
+
in 5% of the cases, they refreshed the dashboard or came back to it later before the session expired.
+
+
Among the users who reach the translation start screen:
+
+
in 8% of the cases, they transitioned to the editor and made an edit.
+
in 60% of the cases, they went back to the main dashboard.
+
in 27% of the cases, they ended the session.
+
+
Among users who made at least one edit:
+
+
in 75% of the cases, they continued to make additional edits.
+
in 9% of the cases, they went back to the main dashboard.
+
in 8% of the cases, they ended the session.
+
+
+
+
+
+
+Code
+
n_events = events.query("""(user_global_edit_count_bucket == '5-99 edits') & (event_type == @main_events)""").shape[0]
iplot(
    plot_funnel(
        events.query("""user_global_edit_count_bucket == '5-99 edits'"""),
        chart_title=f'Flow of Users Through CX Workflows Having 5-99 Global Edits ({n_events} events)'),
    config=iplot_config)
+
+
+
+
+
+
+
+
+
+
+
+
+Summary: Users with 5-99 Global Edits
+
+
+
+
+
+
Among the users who opened the dashboard:
+
+
in 77% of the cases, they proceeded to translation start screen.
+
in 14% of the cases, they ended the session.
+
in 9% of the cases, they refreshed the dashboard or came back to it later before the session expired.
+
+
Among the users who reach the translation start screen:
+
+
in 13% of the cases, they transitioned to the editor and made an edit.
+
in 64% of the cases, they went back to the main dashboard.
+
in 28% of the cases, they ended the session.
+
+
Among users who made at least one edit:
+
+
in 80% of the cases, they continued to make additional edits.
+
in 10% of the cases, they went back to the main dashboard.
+
in 6% of the cases, they ended the session.
+
+
+
+
+
+
+Code
+
n_events = events.query("""(user_global_edit_count_bucket == '100-999 edits') & (event_type == @main_events)""").shape[0]
iplot(
    plot_funnel(
        events.query("""user_global_edit_count_bucket == '100-999 edits'"""),
        chart_title=f'Flow of Users Through CX Workflows Having 100-999 Global Edits ({n_events} events)'),
    config=iplot_config)
+
+
+
+
+
+
+
+
+
+
+
+
+Summary: Users with 100-999 Global Edits
+
+
+
+
+
+
Among the users who opened the dashboard:
+
+
in 71% of the cases, they proceeded to translation start screen.
+
in 18% of the cases, they ended the session.
+
in 11% of the cases, they refreshed the dashboard or came back to it later before the session expired.
+
+
Among the users who reach the translation start screen:
+
+
in 16% of the cases, they transitioned to the editor and made an edit.
+
in 49% of the cases, they went back to the main dashboard.
+
in 30% of the cases, they ended the session.
+
+
Among users who made at least one edit:
+
+
in 85% of the cases, they continued to make additional edits.
+
in 7% of the cases, they went back to the main dashboard.
+
in 5% of the cases, they ended the session.
+
+
+
+
+
+
+Code
+
n_events = events.query("""(user_global_edit_count_bucket == '1000+ edits') & (event_type == @main_events)""").shape[0]
iplot(
    plot_funnel(
        events.query("""user_global_edit_count_bucket == '1000+ edits'"""),
        chart_title=f'Flow of Users Through CX Workflows Having 1000+ Global Edits ({n_events} events)'),
    config=iplot_config)
+
+
+
+
+
+
+
+
+
+
+
+
+Summary: Users with 1000+ Global Edits
+
+
+
+
+
+
Among the users who opened the dashboard:
+
+
in 72% of the cases, they proceeded to translation start screen.
+
in 17% of the cases, they ended the session.
+
in 10% of the cases, they refreshed the dashboard or came back to it later before the session expired.
+
+
Among the users who reach the translation start screen:
+
+
in 32% of the cases, they transitioned to the editor and made an edit.
+
in 40% of the cases, they went back to the main dashboard.
+
in 23% of the cases, they ended the session.
+
+
Among users who made at least one edit:
+
+
in 86% of the cases, they continued to make additional edits.
+
in 8% of the cases, they went back to the main dashboard.
Across all edit count buckets, most of the users (>70%) who opened the dashboard proceeded to the translation start screen.
+
+
The percentage is higher for newcomers than for experienced users. Most newcomers reach the dashboard through external entry points rather than opening it directly, in which case dashboard_open and dashboard_translation_start are triggered consecutively (with no user action in between). Among experienced users, more open the dashboard directly and then click to proceed to the translation start screen.
+
+
Among users who reached the translation start screen
+
+
Newcomers tend to end/abandon the session or return to the main dashboard
+
Newcomers continued to make an edit from this stage in only 12% of the cases, whereas users with 1000+ edits did so in 32% of the cases.
+
+
The more editing experience users have, the more likely they are to continue to make an edit.
+
+
+
Among users who made at least one edit, those with more editing experience are more likely to make additional edits to the machine-translated content, and less likely to end the session or return to the dashboard.
+
+
+
+
+
+
+
By Entry Point
+
+
+Code
+
dopen_sources = events.query("""event_type == 'dashboard_open'""").event_source.unique().tolist()

# plot the funnel for a given source
# identifies sessions starting with the specified source
# uses the original plot_funnel function
# includes the edit bucket table by default
def plot_funnel_for_source(source, incl_edit_bucket_table=True):

    # sessions whose first event is a dashboard_open from the given source
    sessions_with_source = (
        events
        .query("""(event_source == @source) & (event_type == 'dashboard_open') & (content_translation_session_position == 0)""")
        .content_translation_session_id
        .unique()
        .tolist()
    )

    n_events = events.query("""(event_source == @source) & (event_type == @main_events)""").shape[0]

    iplot(
        plot_funnel(
            events.query("""content_translation_session_id == @sessions_with_source"""),
            chart_title=f'Flow of Users Through CX Workflows; Source: {source} ({n_events} events) & Number of Events Generated by Edit Bucket',
            incl_edit_bucket_table=incl_edit_bucket_table, width=max_width, height=max_width/2.25),
        config=iplot_config)
+
+
+
+
+Code
+
plot_funnel_for_source('frequent_languages')
+
+
+
+
+
+
+
+
+
+
+
+
+Summary: users who navigated to the dashboard from the frequent languages menu
+
+
+
+
+
+
Among the users who opened the dashboard:
+
+
in 82% of the cases, they proceeded to the translation start screen.
+
in 11% of the cases, they ended the session.
+
in 6% of the cases, they refreshed the dashboard or came back to it later before the session expired.
+
+
Among the users who reached the translation start screen:
+
+
in 13% of the cases, they transitioned to the editor and made an edit.
+
in 46% of the cases, they went back to the main dashboard.
+
in 32% of the cases, they ended the session.
+
+
Among users who made at least one edit:
+
+
in 79% of the cases, they continued to make additional edits.
+
in 8% of the cases, they went back to the main dashboard.
For each entry point, the transition rates between funnel stages are highly correlated with how heavily that entry point is used by users at each experience level.
+
+
+
Among users who navigated through the frequent languages menu, which was most frequently accessed by newcomers:
+
+
In 82% of the cases, they proceeded to the translation start screen.
+
From the translation start screen, users made at least one edit in 13% of the cases.
+
+
Among users who navigated through the content language selector, which was frequently accessed by both newcomers and experienced users alike:
+
+
In 75% of the cases, they proceeded to the translation start screen.
+
From the translation start screen, users made at least one edit in 23% of the cases.
+
+
Among users who opened the dashboard directly, which was most frequently done by experienced users:
+
+
In 36% of the cases, they proceeded to the translation start screen.
+
From the translation start screen, users made at least one edit in 34% of the cases.
+
+
Among users who navigated from an invitation shown on a non-existent page, which was frequently accessed by both newcomers and experienced users alike:
+
+
In 80% of the cases, they proceeded to the translation start screen.
+
From the translation start screen, users made at least one edit in 14% of the cases.
+
+
Among users who opened the dashboard directly with a link to a specific translation, which was most frequently done by experienced users:
+
+
In 87% of the cases, they proceeded to the translation start screen.
+
From the translation start screen, users made at least one edit in 11% of the cases.
+
+
Among users who navigated from the contributions page, which was frequently accessed by both newcomers and experienced users alike:
+
+
In 39% of the cases, they proceeded to the translation start screen.
+
From the translation start screen, users made at least one edit in 7% of the cases.
+
+
Among users who navigated from the notice on recently translated articles inviting them to review/expand the translation, which was most frequently done by experienced users:
+
+
In 88% of the cases, they proceeded to the translation start screen.
+
From the translation start screen, users made at least one edit in 31% of the cases.
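To compare entry points end to end, the two stage rates listed above can be multiplied to get a rough open-to-first-edit rate. The following is only a back-of-the-envelope sketch. The reported rates are per case rather than per user, and users can loop back to the dashboard, so the product is an approximation and not a measured conversion rate. The dictionary keys are descriptive labels chosen for illustration, not the actual event_source values, and approx_open_to_edit is a hypothetical helper.

```python
# (open -> translation start, translation start -> first edit), as reported above.
# Keys are descriptive labels, NOT the actual event_source values.
stage_rates = {
    "frequent languages menu": (0.82, 0.13),
    "content language selector": (0.75, 0.23),
    "direct dashboard open": (0.36, 0.34),
    "invitation on non-existent page": (0.80, 0.14),
    "direct open with translation link": (0.87, 0.11),
    "contributions page": (0.39, 0.07),
    "recent translation notice": (0.88, 0.31),
}


def approx_open_to_edit(open_to_start, start_to_edit):
    """Rough chained rate; assumes the two stages can simply be multiplied."""
    return open_to_start * start_to_edit


combined = {
    label: approx_open_to_edit(*rates) for label, rates in stage_rates.items()
}

# Print entry points from highest to lowest approximate open-to-edit rate.
for label, rate in sorted(combined.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label:<36} {rate:6.1%}")
```

Read this way, an entry point with a high share of users reaching the translation start screen can still rank low once the second stage is taken into account.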
+
+
+
+
+
+
+
+
+
User Flows: Other Events
+
+
+Code
+
# user flows and interactions for events other than the main events (open, start, edit)
+other_event_transitions = (
+ plot_funnel(
+ events,
+ events_scope=events.event_type.unique().tolist(),
+ return_transition_data=True)
+ .query("""event_type != @main_events""")
+ .sort_values(['event_type', 'percentage'], ascending=[True, False])
+ .drop('total_by_source', axis=1)
+)
+
+
+event_types = events.event_type.unique().tolist()
+
+other_event_transitions['next_event_type'] = (other_event_transitions['next_event_type']
+ .replace({i: f'➔ {i}' for i in event_types + ['session end']}))
+other_event_transitions['event_type'] = (other_event_transitions['event_type']
+ .replace({i: f"{i} ({(events['event_type'] == i).sum()} events)" for i in event_types}))
+
+other_event_transitions_tbl = (
+ gt
+ .GT(
+ other_event_transitions,
+ rowname_col='next_event_type',
+ groupname_col='event_type'
+ )
+ .fmt_percent('percentage', scale_values=False, decimals=1)
+ .cols_label(
+ count='# Events',
+ percentage='Percentage'
+ )
+ .tab_header('Transitions Between Other Event Types', 'apart from dashboard_open, dashboard_translation_start, editor_segment_add')
+)
+
+other_event_transitions_tbl
+
+
+
+
+
+
+
+
Transitions Between Other Event Types
apart from dashboard_open, dashboard_translation_start, editor_segment_add
+
Next event                        # Events   Percentage
+
dashboard_discard_suggestion (677 events)
➔ dashboard_discard_suggestion         540        79.8%
➔ dashboard_translation_start           70        10.3%
➔ session end                           34         5.0%
➔ dashboard_open                        16         2.4%
➔ dashboard_search                       6         0.9%
➔ dashboard_translation_continue         6         0.9%
➔ dashboard_refresh_suggestions          3         0.4%
➔ dashboard_translation_discard          1         0.1%
➔ editor_segment_add                     1         0.1%
+
dashboard_refresh_suggestions (110 events)
➔ dashboard_refresh_suggestions         37        33.6%
➔ dashboard_translation_start           23        20.9%
➔ session end                           18        16.4%
➔ dashboard_open                        17        15.5%
➔ dashboard_search                       7         6.4%
➔ dashboard_discard_suggestion           4         3.6%
➔ dashboard_translation_continue         4         3.6%
+
dashboard_search (958 events)
➔ dashboard_translation_start          791        82.6%
➔ dashboard_open                       116        12.1%
➔ session end                           33         3.4%
➔ dashboard_translation_continue        14         1.5%
➔ dashboard_search                       3         0.3%
➔ editor_segment_add                     1         0.1%
+
dashboard_translation_continue (440 events)
➔ dashboard_open                       294        66.8%
➔ editor_segment_add                    70        15.9%
➔ session end                           64        14.5%
➔ dashboard_translation_continue         5         1.1%
➔ dashboard_translation_discard          3         0.7%
➔ dashboard_translation_start            3         0.7%
➔ dashboard_search                       1         0.2%
+
dashboard_translation_discard (132 events)
➔ dashboard_translation_discard         76        57.6%
➔ dashboard_search                      18        13.6%
➔ session end                           14        10.6%
➔ dashboard_open                        12         9.1%
➔ dashboard_translation_continue         8         6.1%
➔ dashboard_translation_start            3         2.3%
➔ dashboard_discard_suggestion           1         0.8%
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Summary
+
+
+
+
+
When users discarded a suggested translation (677 occurrences), in 80% of the cases they went on to discard the next translation shown as well, and in 10% they proceeded to the translation start screen.
+
When users requested that the list of suggestions be regenerated (110 occurrences), in 34% of the cases they refreshed the suggestions again, and in 21% they proceeded to the translation start screen.
+
When users initiated a search (958 occurrences), in 83% of the cases they proceeded to the translation start screen, and in 12% they returned to the dashboard.
+
When users selected an in-progress translation (440 occurrences), in 67% of the cases they returned to the dashboard, and in 16% they made an edit to the translation.
+
When users discarded an in-progress translation (132 occurrences), in 58% of the cases they discarded additional in-progress translations, and in 14% they initiated a search.
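Each percentage in the transition table is the count for a next event divided by the total number of occurrences of the originating event type. The following minimal sketch recomputes the dashboard_search group from the counts in the table. The variable names are chosen for illustration and are not part of the original analysis code.

```python
import pandas as pd

# Next-event counts for dashboard_search, copied from the transition table.
search_next = pd.Series(
    {
        "dashboard_translation_start": 791,
        "dashboard_open": 116,
        "session end": 33,
        "dashboard_translation_continue": 14,
        "dashboard_search": 3,
        "editor_segment_add": 1,
    },
    name="count",
)

table = search_next.to_frame()
# Share of all dashboard_search occurrences that led to each next event.
table["percentage"] = (100 * table["count"] / table["count"].sum()).round(1)

print(f"total occurrences: {table['count'].sum()}")
print(table)
```

The same calculation applies to every other group in the table, which also makes it easy to check that the counts of a group add up to the number of events shown in its header.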