2019/Grants/COVID-19 translation

From WikiConference North America


COVID-19 translation


Lane Rasberry
This is a proposal based at the University of Virginia School of Data Science.

Wikimedia username:


E-mail address:



Geographical impact:


Type of project:

Research + Output

What is your idea?

Apply existing Wikipedia workflows to translate COVID-19 content from English into two underserved languages: Tibetan and Nepali.

Why is it important?

The necessary response to COVID-19 includes making information available to everyone who wants it. For language communities which have not already gotten translations of key information, this project will organize translation from English and deliver it to the language communities which need it through their language version of Wikipedia. Also, by performing this translation for COVID-19, we establish a precedent and create a model to replicate so that we can better translate information into these language communities for the next crisis.

The Wikipedia community has historically operated at the pace of available volunteer labor. What is different now is that, in the context of the COVID-19 pandemic, everyone needs immediate access to certain high-priority health messages. Some languages have communities which are translating this content, but other languages lack that capacity. If we translate English Wikipedia COVID-19 articles into languages which do not yet have this content, then Internet users will be able to find and access it immediately.

Is your project already in progress?

No, this project is not in progress, as no one is currently doing translation of COVID-19 content into the languages this project sponsors. Yes, the University of Virginia has ongoing similar Wikipedia health translation projects, but only for certain languages, and not for the languages of this project. Yes, there are various other wiki projects which are actively engaged in interconnected medical translation projects, but none of them are for basic COVID-19 content into the target languages of this project.

This project proposes to organize the translation of English Wikipedia COVID-19 content into languages which do not otherwise have this content, then publish it in the Wikipedia of each respective language. The project targets languages for which there is no evidence that anyone is producing COVID-19 content. In such cases, it is unlikely that those language communities will get any COVID-19 content in Wikipedia and, based on precedent, they are unlikely to get content from any other source.

We have considered which languages to target. We chose Tibetan and Nepali based on four factors: the opportunity to partner with university researchers, the presence of expatriate native speakers in the university's town, the lack of COVID-19 content development in those Wikipedias, and feedback from native language collaborators in the language homelands. The University of Virginia has a large Tibetan studies program and there is a relatively large community using Tibetan as a living language here. Tibetan has a devoted following working to carry it into next-generation media, even while its Internet presence is underdeveloped and even while its speakers frequently use other languages also. Nepali is the national language of Nepal and the native language of 16 million people there. After the April 2015 earthquake there was some international organizing to prepare for future crises in Nepal, but now with COVID-19 we can see that deploying native language content is still a challenge: there is little information outside Wikipedia, and critical COVID-19 articles are missing from Wikipedia. Nepali and Tibetan are spoken in the same region, and developing both creates regional synergy in outreach and disaster preparation.

At the University of Virginia, some faculty, staff, and students are contributing to a related project, SWASTHA, which seeks to translate medical information into languages of South Asia. Because of this existing engagement, the University of Virginia is already prepared to organize translation of other health information for COVID-19 into various languages. The SWASTHA project has no sponsorship or current partner interest in Tibetan or Nepali language content, although in the course of that project we have received requests from local language communities to include these and other minority languages. We hope that this proposal can be a pilot which leads to more projects for small language Wikipedia communities.

It is a wiki custom to invite participation from, and share credit for outcomes with, related projects which contribute to the shared environment. This project will collaborate with and credit the following projects in the course of its development. This is a fairly rushed proposal, and these organizations and projects have not endorsed it specifically. However, all of them contribute resources to the public commons which this proposal uses. This project welcomes collaboration with any of these communities and would invite them to collaborate if it proceeds. Right now, the organizers of this proposal are talking with wiki people known to be engaged in medical translation to coordinate a common plan with shared goals.

  • Thanks to these organizations. They and their individual participants are welcome to collaborate and share credit in this project.
  • Thanks to these organizations for contributing to the shared knowledge and Wikipedia infrastructure on which this project relies.

An explanation of the activities of each of these organizations is as follows:

  1. Projects with subject matter expertise
    1. Wikipedia:WikiProject Medicine - English Wikipedia's hub for managing medical content
    2. WikiProject COVID-19 - specialized project for COVID-19 content, founded with tie to WikiProject Medicine
  2. Projects which administer translation
    1. Wikipedia Translation task force - presents model for translation of medical content from English to other languages
    2. SWASTHA - local version of the translation task force targeted to languages of India
    3. Translators without Borders - not wiki editors, but do translation of documents for wiki editors
  3. Projects which build conceptual models for crisis response
    1. WikiProject Humanitarian Wikidata
    2. WikiProject Disaster management
    3. Wikimedians for Sustainable Development
  4. Projects which develop quality control best practices
    1. Cochrane - provides access to relevant medical publications and guidance using them
    2. Wikimedians in Residence Exchange Network - presents models for tracking and reporting impact
    3. Wiki Education Foundation - presents models for student engagement at universities
    4. Wikimedia New York City - organizes discussion, review, and translation from the in-person Wikimedia community in and around NYC
    5. Wikimedia Sweden - organizes WikiGap, a Wikipedia translation program and model for this program

The model of translating COVID-19 content and distributing it into Wikipedia is in progress. Many languages, though, are underserved and not getting sufficient amounts of original content or translations, simply because their Wikipedia communities are not developed enough to mobilize quickly or to process medical and other COVID-19 information at the necessary scale.

How is it relevant to credibility and Wikipedia? (max 500 words)

Misinformation about COVID-19 is in global circulation. As many people, including journalists and policymakers, use Wikipedia as part of their fact checking workflows, publishing information in Wikipedia helps to correct such misinformation. Many cultures and languages have their own local stories. An example of good information to share is the COVID-19 message of the World Health Organization, and an example of misinformation in this space is any health information directly contrary to the WHO.

Sources of misinformation about COVID-19 include top ranking political leaders in various countries, influential promoters of contraindicated alternative medicine, and miscellaneous people everywhere publishing non-medical fantasy health strategies in the chaos.

Wikipedia is a transparent source of objective assertions where people can find summaries and citations of reliable sources and also have an opportunity to speak out and be heard in a public forum, even if they want to document their opposition in Wikipedia article talk pages.

What is the ultimate impact of this project?

This project will provide COVID-19-related information to language communities which we reasonably expect will not get access to this information otherwise at the necessary quantity, quality and speed. We will also document this project as a model of crisis response in Wikipedia in order to help prepare for future similar situations and also to promote the ideas of community collaboration, respect for the citations of reliable published authoritative sources, and transparency in the publishing process.

Specific deliverables for this project are as follows:

  1. Translation of COVID-19 content
    1. This project will start with these English Wikipedia articles
      1. Coronavirus disease 2019, the disease
      2. 2019–20 coronavirus pandemic, the outbreak
      3. Severe acute respiratory syndrome coronavirus 2, the virus
    2. and translate them into
      1. Tibetan
      2. Nepali
  2. Documenting this process for others to replicate
    1. Take notes
      1. Organize translation
      2. Publish in Wikipedia
      3. Measure audience reaction
      4. Thank collaborators
    2. Publish a paper describing methodology and impact
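
The deliverables above pair three English Wikipedia articles with two target languages. As a minimal sketch of how a team might check which of those pairs still lack a translation, the following Python builds a MediaWiki API query for interlanguage links and reads the response. It assumes the `langlinks` property of the `action=query` module with `formatversion=2`, and the Wikipedia language codes `bo` (Tibetan) and `ne` (Nepali). The helper names and the sample response are hypothetical and invented for illustration; they are not real coverage data and not part of the proposal.

```python
from urllib.parse import urlencode

# English Wikipedia API endpoint (the proposal's source wiki).
API = "https://en.wikipedia.org/w/api.php"

# Articles and target languages named in the deliverables list.
ARTICLES = [
    "Coronavirus disease 2019",
    "2019–20 coronavirus pandemic",
    "Severe acute respiratory syndrome coronavirus 2",
]
TARGET_LANGS = {"bo": "Tibetan", "ne": "Nepali"}


def langlinks_url(titles, lang):
    """Build a query URL asking which of `titles` link to a `lang` article."""
    params = {
        "action": "query",
        "prop": "langlinks",
        "titles": "|".join(titles),
        "lllang": lang,           # restrict links to one language code
        "format": "json",
        "formatversion": "2",     # assumed: pages come back as a list
    }
    return f"{API}?{urlencode(params)}"


def missing_titles(response, lang):
    """Return titles in a parsed API response that have no `lang` link."""
    missing = []
    for page in response.get("query", {}).get("pages", []):
        links = page.get("langlinks", [])
        if not any(link.get("lang") == lang for link in links):
            missing.append(page["title"])
    return missing


# Hypothetical response, invented for illustration only.
SAMPLE = {
    "query": {
        "pages": [
            {"title": "Coronavirus disease 2019",
             "langlinks": [{"lang": "ne", "title": "(sample title)"}]},
            {"title": "2019–20 coronavirus pandemic"},
        ]
    }
}

if __name__ == "__main__":
    for code, name in TARGET_LANGS.items():
        print(name, langlinks_url(ARTICLES, code))
        print("  missing in sample:", missing_titles(SAMPLE, code))
```

Running the same check before and after publication would give one simple measure of progress for the documentation deliverable. Fetching the URL and parsing live responses is left out so that the sketch runs without network access.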

Could it scale?

Yes. The part of this project which scales is using money to fund staff to speed the translation of well developed Wikipedia content from one language to another.

Wikipedia has many bottlenecks which typically are difficult to fix with money. One of the few bottlenecks in Wikipedia which sponsorship can resolve is content language translation.

Why are you the people to do it?

At the University of Virginia, we have had a Wikipedia program since 2018. We can run this project on the model of university participation in Wikipedia, which is an established type of collaboration that the Wikimedia community wants and that has a proven record of achieving results. At a university, students can take roles in projects both by contributing and by discussing their ethics and intent.

Our university already does projects in medicine in Wikipedia.

I, Lane Rasberry, the project organizer, am a coordinator for the Wikimedians in Residence Exchange Network, where I report my activities to other staff Wikimedians at organizations as part of the joint development of best practices. In this network, people in similar roles all contribute their projects to establish a larger precedent.

What is the impact of your idea on diversity and inclusiveness of the Wikimedia movement?

This project promotes diversity and inclusiveness by publishing and delivering information about COVID-19 which certain language communities need in the present crisis.

A broader challenge which this project raises is that the Wikimedia community does not have a disaster management plan for deciding what information must be deployed and for directing resources so that every language community which requires it gets at least a minimum level of information. Having an American university fill this gap is an alternative to the ideal solution, which would be somehow supporting the local community of direct users in developing this content for themselves.

What are the challenges associated with this project and how you will overcome them?

The obvious big challenge is that this project is based in the United States but benefits a community of stakeholders in a foreign language community.

Ideally, a language project should have its own language community lead and manage it. As circumstances are, administering this project as quickly as necessary using only native speakers in their own native communities seems unprecedented. The English speaking world has the best access to COVID-19 information, the best access to Wikipedia workflows and documentation for disaster response, and the best access to communities who will participate in planning discussions across cultures. An English language team can be reliable for selecting content and sending it for translation through the established Wikipedia process.

This project should proceed despite the shortcoming of not being local because every language community needs content quickly and the infrastructure for developing and publishing this content does not yet exist. This project will translate the content now, document what this university did, and raise awareness that the world collectively needs better infrastructure for translating and deploying information in crisis situations.

How much money are you requesting?


How will you spend the money?


  • 40% University of Virginia research staff
    • expertise in language and culture
    • write paper modeling this translation project for crisis response in Wikipedia
  • 40% native speakers to submit and review translations
  • 20% student researcher
    • writing and documentation
    • performing wiki edits

How long will your project take?

6 months

Month 0 - start
Month 1 - adaptation of English language content for the culture of the target languages
Month 2 - translation and publication of the first draft of content
Month 4 - publication of a structured dataset of translated technical terms in Wikidata, which enables tracking progress of content development and also serves as a quality control check
Month 6 - academic paper (preprint, will consider publishing) - what we did, how we checked it, report of metrics, how anyone can replicate it

Have you worked on projects for previous grants before?