Evaluating Success for Content Translation


This research project explored how editors use the Content Translation [1] tool to translate articles from English to French Wikipedia. It proceeded in three stages.

Literature review: the goal of this first stage was to gain a well-rounded understanding of prior research on the Wikipedia community and on machine learning techniques for language analysis.

Understanding the data (qualitative analysis): the goal of this second stage was to articulate more precise questions about editors’ translation habits and so find a clearer direction for the project. The main questions were: Are certain topics chosen more frequently? Are there editing patterns? For example, do editors add the main content in the first few edits, or do they add information gradually in later edits?

Quantitative analysis of sections: in the last stage, we wrote a section title comparison function to analyze how different sections were translated from English to French. I also continued to record notes and observations about translations.

Some of the results I found: An article more closely related to French culture does not necessarily receive a better translation effort; content and quality depend on editors’ interest. “Batch” articles were common: 3-6 articles of similar content translated by the same editor in a short span of time. Most translated articles share the same sections, in the same order, as their English source article.
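To make the idea of a section title comparison concrete, here is a minimal sketch in Python. It is not the function written for this project, whose details are not given here. It assumes a hypothetical `title_map` dictionary that pairs English section titles with their expected French equivalents, and it reports which source sections appear in the translation and whether they keep the source order. The names `compare_sections`, `normalize`, and `SectionComparison` are illustrative only.

```python
from typing import Dict, List, NamedTuple


class SectionComparison(NamedTuple):
    matched: List[str]   # English titles whose French equivalent was found
    missing: List[str]   # English titles with no French equivalent found
    same_order: bool     # True if matched sections keep their source order


def normalize(title: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(title.lower().split())


def compare_sections(
    english: List[str],
    french: List[str],
    title_map: Dict[str, str],
) -> SectionComparison:
    """Compare section titles of a source article and its translation.

    title_map is a hypothetical English-to-French title dictionary; a real
    analysis would need a more robust way to align titles across languages.
    """
    lookup = {normalize(k): normalize(v) for k, v in title_map.items()}
    french_positions = {normalize(t): i for i, t in enumerate(french)}

    matched: List[str] = []
    missing: List[str] = []
    positions: List[int] = []
    for title in english:
        expected = lookup.get(normalize(title))
        pos = french_positions.get(expected) if expected is not None else None
        if pos is None:
            missing.append(title)
        else:
            matched.append(title)
            positions.append(pos)

    # Order is preserved when the French positions are already ascending.
    return SectionComparison(matched, missing, positions == sorted(positions))


if __name__ == "__main__":
    title_map = {
        "History": "Histoire",
        "Geography": "Géographie",
        "References": "Références",
    }
    result = compare_sections(
        ["History", "Geography", "References"],
        ["Histoire", "Géographie", "Références"],
        title_map,
    )
    print(result)
```

Under these assumptions, a translation that keeps every mapped section in the source order yields an empty `missing` list and `same_order` set to `True`, which corresponds to the pattern reported for most translated articles.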

This project was undertaken for the 2019 Summer Outreachy internship program [2]. More information can be found on the Meta page [3].

Doris Zhou

McGill University

20 minutes

Normal-sized presentation

