2019/Grants/Sourceror: The Wikipedia community's platform against disinformation

From WikiConference North America


Title:

Sourceror: The Wikipedia community's platform against disinformation

Name:

Newslinger

Wikimedia username:

Newslinger

E-mail address:

newslinger@sourceror.org

Resume:

Over the past 20 months, I have dedicated substantial effort to maintaining the perennial sources list, an index of commonly discussed sources on the English Wikipedia that are classified by reliability and accompanied by summaries of related noticeboard discussions. Thousands of editors refer to the list every month to determine whether a source is credible enough to support claims in Wikipedia articles. The list (initially created by MrX) contains contributions from 123 editors and incorporates source evaluations that over a thousand editors have submitted in the past 13 years. It was viewed more than 36,000 times in the last 30 days.

Each entry on the perennial sources list contains these key pieces of information:

  • Name
  • Wikipedia article
  • Aliases
  • Reliability classification
  • Links to noticeboard discussions
  • Year of most recent discussion
  • Summary of discussions
  • Web domains

Geographical impact:

Global

Type of project:

Technology

What is your idea?

Sourceror is a technology platform that uses the data in the perennial sources list to help editors combat misinformation and disinformation in Wikipedia articles. Sourceror also aims to increase media literacy among Internet users in general.

This proposal consists of three projects. Project 1 is a prerequisite for Projects 2 and 3.

Project 1: Sourceror Bot and Sourceror API

The perennial sources list contains 268 entries displayed in a wikitext table. Although these entries are generally formatted in a certain way, they are not in a machine-readable format that client applications can readily use without manual processing.

Project 1 normalizes the data in the list and provides a query API so that developers can make use of the list without needing to parse the data themselves. This project implements the following objectives:

  • At a recurring interval, the Sourceror Bot scrapes the perennial sources list, parses all of the information within, and records the changes into a database.
    • The Bot may eventually be extended to track additional data about the sources (e.g. number of Wikipedia pages that cite the source).
  • The Sourceror API accepts data queries from client applications and provides responses in the machine-readable JSON format.
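
The two objectives above can be illustrated with a minimal sketch. It assumes each entry has already been parsed out of the wikitext table. The field names, the `SourceEntry` class, the helper functions, and the "Example Gazette" data are all hypothetical and are not taken from the proposed JSON schema. The sketch shows how changes between two scrapes might be detected and how entries might be serialized for a JSON response.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SourceEntry:
    """One list entry. Field names are assumptions, not the proposed schema."""
    name: str
    article: str
    classification: str
    last_discussed: int
    summary: str
    aliases: list = field(default_factory=list)
    discussions: list = field(default_factory=list)
    domains: list = field(default_factory=list)


def diff_snapshots(old, new):
    """Compare two scrapes (dicts keyed by entry name) and report changes."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(k for k in new.keys() & old.keys() if new[k] != old[k])
    return {"added": added, "removed": removed, "changed": changed}


def to_json(entries):
    """Serialize entries as a JSON array, as an API response might."""
    return json.dumps([asdict(e) for e in entries], indent=2, sort_keys=True)


# Hypothetical data: "Example Gazette" is a placeholder, not a real entry.
previous = {}
current = {
    "Example Gazette": SourceEntry(
        name="Example Gazette",
        article="Example Gazette",
        classification="generally reliable",
        last_discussed=2019,
        summary="Placeholder summary of noticeboard discussions.",
        domains=["example.com"],
    )
}

print(diff_snapshots(previous, current))
print(to_json(current.values()))
```

Keying each snapshot by entry name keeps the comparison simple. A real bot would likely need a more stable identifier, since entries can be renamed.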

Applications that would be able to make use of the Sourceror API include Cite Unseen (a user script by SuperHamster and Sky Harbor), unreliable.js (a user script by Headbomb), Projects 2 and 3 (detailed below), and any new applications by developers both inside and outside of the Wikimedia community.

Project 2: Sourceror Web App

Large wikitext tables on Wikipedia, including the one used in the perennial sources list, suffer from reduced accessibility (especially on mobile devices). The list was only designed with desktop/laptop computer screens in mind, and requires sufficient screen width to use comfortably.

At over 255,000 characters of template-heavy wikitext, the list is cumbersome to maintain and limited to its current level of detail. Editors have asked for additional information (e.g. country, language, and Alexa rank) to be presented in the entries, but we were unable to implement any of these suggestions without overcrowding the chart in its current form.

The chart lacks search and filtering features, as Wikipedia pages are generally not allowed to include scripts for dynamic web functionality. The complexity of the chart also hinders editor participation, as the formatting can be a bit intimidating.

Project 2 uses the Sourceror API to form a responsive single-page application that displays the data from the perennial sources list in a format that is more accessible for mobile (and also desktop/laptop) devices. This project implements the following objectives:

  • The Sourceror Web App displays all of the information in the perennial sources list in an accessible interface that eliminates horizontal scrolling for devices with smaller screens.
  • Users can search for the source they are looking for without having to scroll through the list. Users can also filter the list by specific attributes (e.g. reliability classification).
  • The App loads and displays related information (e.g. country, language, and Alexa rank) about the sources from Wikidata alongside the corresponding entries when the user is online.
  • The App includes an entry editor that allows users to create new entries and revise existing entries in the perennial sources list without needing to understand template syntax or work with a large wikitext document.
  • The App is a progressive web application that works offline. After the user opens the App in their mobile web browser, they have the option to install the App to their mobile device's home screen. Once downloaded, the App displays cached information when the user is offline and retrieves updates when the user is online.
    • The home screen feature works for both Android and iOS. Desktop/laptop computers can also make use of the offline functionality.
    • The App is available for Android devices in the Google Play Store (which allows listings for progressive web apps).
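
The search and filter objective can be sketched as follows. This is an illustration only: the entry dictionaries, their keys, the classification labels, and the `search_entries` function are assumptions, and the actual App would run equivalent logic in the browser against data from the Sourceror API.

```python
def search_entries(entries, query="", classification=None):
    """Return entries whose name or aliases contain `query` (case-insensitive),
    optionally restricted to a single reliability classification."""
    needle = query.strip().lower()
    results = []
    for entry in entries:
        names = [entry["name"], *entry.get("aliases", [])]
        if needle and not any(needle in n.lower() for n in names):
            continue  # the text query matched neither the name nor an alias
        if classification is not None and entry["classification"] != classification:
            continue  # the entry is outside the requested classification
        results.append(entry)
    return results


# Hypothetical entries; names and labels are placeholders.
sample_entries = [
    {"name": "Example Gazette", "aliases": ["The Gazette"],
     "classification": "generally reliable"},
    {"name": "Example Tabloid", "aliases": [],
     "classification": "generally unreliable"},
]

print([e["name"] for e in search_entries(sample_entries, query="gazette")])
print([e["name"] for e in search_entries(
    sample_entries, classification="generally unreliable")])
```

An empty query with no classification returns every entry, which matches the unfiltered view of the list.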

The final name of the Sourceror Web App is to be determined. Ideally, the app would be named something along the lines of "Wikipedia Source Guide", but such a name would require permission from the Wikimedia Foundation.

Project 3: Sourceror Browser Extension and Sourceror User Script

On the English Wikipedia, there are currently over 180,000 articles that are tagged as lacking citations, and over 387,000 claims that are tagged as "citation needed". These issues are only resolved when editors add reliable sources or remove claims that are unsupported by reliable sources.

Project 3 introduces features to make it easier to identify and properly handle sources indexed in the perennial sources list. This project implements the following objectives:

  • The Sourceror Browser Extension displays an icon on the browser toolbar that corresponds to the reliability classification of the current page, if the website is indexed in the perennial sources list.
  • Users can click on the Extension's icon to display the information from the current website's entry in the perennial sources list.
  • For all links on the current page to a website indexed in the list, the Extension visually indicates (e.g. with an icon or color highlight) the reliability classification of the website. This feature is optional, and can be disabled by the user.
  • The Extension contains a citation generator that produces a properly formatted Cite web template for the current website, if it is on the perennial sources list.
    • This feature may eventually be expanded to cover websites that are not on the list, pending data contributions from the community.
  • The Extension is a WebExtension that works on Mozilla Firefox (including Firefox for Android), Microsoft Edge, Google Chrome, and Chromium-based browsers.
  • The Sourceror User Script allows users to rapidly remove citations of unreliable sources in a couple of clicks. For each selected citation, users can choose to replace the citation with a more reliable source, replace the citation with a "citation needed" tag, or delete the information supported by the citation.
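
Two of the objectives above, matching a page to its entry and generating a citation, can be sketched as follows. The Extension and User Script themselves would be written in JavaScript, so this Python sketch only illustrates the logic. The `lookup_entry` and `cite_web` functions, the domain index, and the example data are hypothetical. The sketch assumes entries are indexed by web domain and that a subdomain inherits the entry of its parent domain.

```python
from datetime import date
from urllib.parse import urlsplit


def lookup_entry(url, entries_by_domain):
    """Find the entry for a URL by trying its host, then each parent domain."""
    host = (urlsplit(url).hostname or "").lower()
    parts = host.split(".")
    # Stop before the bare top-level domain (e.g. "com").
    for i in range(len(parts) - 1):
        candidate = ".".join(parts[i:])
        if candidate in entries_by_domain:
            return entries_by_domain[candidate]
    return None


def cite_web(url, title, website, access_date):
    """Build a Cite web template call; pipes in the title are escaped."""
    safe_title = title.replace("|", "{{!}}")
    return (
        "{{Cite web |url=" + url
        + " |title=" + safe_title
        + " |website=" + website
        + " |access-date=" + access_date.isoformat()
        + "}}"
    )


# Hypothetical index; "example.com" is a placeholder domain.
index = {
    "example.com": {"name": "Example Gazette",
                    "classification": "generally reliable"},
}

page = "https://news.example.com/story"
entry = lookup_entry(page, index)
if entry is not None:
    print(entry["classification"])
    print(cite_web(page, "A story | Example", entry["name"], date(2019, 10, 1)))
```

Returning `None` for unlisted websites lets the caller show no icon at all rather than guess a classification.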

The Sourceror Browser Extension and Sourceror User Script are designed to complement existing initiatives, including Cite Unseen, unreliable.js, and Citation Hunt.

Additionally, the Extension is intended to bring the Wikipedia community's source evaluations to a broader audience. The success of source-rating projects including NewsGuard, Ad Fontes Media, and Media Bias/Fact Check indicates that there is popular demand for resources that evaluate website credibility. Wikipedia is in an excellent position to provide this resource to the public, as editors already use it internally.

Why is it important?

Wikimedia's Strategy 2030 report identified misinformation and disinformation as threats to the Wikimedia movement's goal of making free knowledge available to all. Specifically, "Wikimedia projects are vulnerable to government, political, cultural, or profit-driven censorship and misinformation campaigns, as well as outright falsified content". Sourceror counters misinformation and disinformation by using the perennial sources list to help Internet users identify whether sources are reliable. Sourceror also provides tools to Wikipedia users that enable us to add reliable sources and remove unreliably sourced information from articles on Wikipedia.

Finally, the data provided by the Sourceror API makes credibility ratings of popular websites freely accessible to all developers in a machine-readable format under Wikipedia's Creative Commons license. This opens up new opportunities for projects outside of the Wikimedia movement to vet the credibility of online content.

Is your project already in progress?

No. A proposed JSON schema of the perennial sources list is available in this discussion.

How is it relevant to credibility and Wikipedia? (max 500 words)

The Sourceror Web App and Sourceror Browser Extension help both Wikipedians and non-Wikipedians determine whether the content they are reading is credible. Wikipedians can use the Extension and the Sourceror User Script to improve the credibility of the information cited in Wikipedia articles. Developers can use the Sourceror API to provide credibility evaluations for external services.

What is the ultimate impact of this project?

Sourceror benefits three key demographics:

  • Wikipedia editors: The thousands of editors who reference the perennial sources list for credibility data each month gain a mobile-friendly interface, which becomes more critical as an increasing majority of the world relies on smartphones for Internet access. Editors also gain access to a citation generator and a citation remover that accelerate adding reliable sources to and removing unreliable sources from Wikipedia articles. Wikipedia's articles would benefit as editors gain the ability to combat misinformation and disinformation more effectively.
  • Internet users: Media consumers gain new ways to access the repository of reliability data that is used to keep Wikipedia articles credible. A web app and browser extension allow Internet users to conveniently reference the Wikipedia community's evaluation of the credibility of websites they visit. These Internet users would improve their media literacy as they are armed with a reference distilled from 13 years of Wikipedia discussions on reliability.
  • Developers: An API that provides data under Wikipedia's Creative Commons license gives developers the ability to incorporate the Wikipedia community's reliability evaluations in their own projects. As developers are able to access this data for free, they would use this data to build technologies that take the credibility of online media into account.

Could it scale?

Yes, Sourceror would be able to handle a very large number of Wikipedia editors, Internet users, and developers. The API, Web App, and Browser Extension are able to scale to the extent allowed by Toolforge and Cloud VPS. The User Script is limited only by Wikipedia's server capacity.

Why are you the people to do it?

As the top contributor to the perennial sources list, I am very familiar with the data to be used for Sourceror. My work in this area has been a labor of love. I will deliver each approved project even if the required development time exceeds the estimates. Instead of hiring expensive contractors, I am able to develop Sourceror on my own, which ensures that the projects will meet the expectations I have outlined in the proposal.

What is the impact of your idea on diversity and inclusiveness of the Wikimedia movement?

Sourceror improves the diversity and inclusiveness of the Wikimedia movement in several ways:

  • Mobile device users, who are underrepresented on Wikipedia, gain an accessible way to view the perennial sources list.
  • New Wikipedia editors who are unfamiliar with Wikipedia's guidelines on reliability gain resources to help them identify which sources are appropriate to use in articles. This improves editor retention by reducing the chance that they are confronted with negative feedback for their good-faith contributions.
  • The Browser Extension and Web App, which also target Internet users who are not part of the Wikimedia movement, would spark their interest in the movement. These projects will also contain messaging that encourages non-Wikimedians to join the movement. Editor recruitment improves the diversity of the Wikimedia movement.

What are the challenges associated with this project and how will you overcome them?

While I have experience with web development on other platforms, I have not yet developed a project on Wikitech's Toolforge or Cloud VPS platform. As documentation for these platforms is readily available, I am in the process of studying the material and will be prepared to build Sourceror if approved. If I encounter trouble with the platforms, I intend to use the provided support channels.

How much money are you requesting?

Between $4,000 and $10,000

How will you spend the money?

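Each deliverable is assigned a monetary estimate of $2,000. Projects 1 and 3 each comprise two deliverables and Project 2 comprises one, for a total of $10,000 if all three projects are approved. Project 1 is a prerequisite for Projects 2 and 3.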

  • Project 1 (Sourceror Bot and Sourceror API): $4,000
  • Project 2 (Sourceror Web App): $2,000
  • Project 3 (Sourceror Browser Extension and Sourceror User Script): $4,000

How long will your project take?

5 months in total.

Each deliverable is assigned a time estimate of one month. Projects 1 and 3 each comprise two deliverables and Project 2 comprises one. Project 1 is a prerequisite for Projects 2 and 3.

  • Project 1 (Sourceror Bot and Sourceror API): 2 months
  • Project 2 (Sourceror Web App): 1 month
  • Project 3 (Sourceror Browser Extension and Sourceror User Script): 2 months

Have you worked on projects for previous grants before?

No.