2019/Grants/Feedback.news

From WikiConference North America


Title:

Feedback.news

Name:

Emmanuel Vincent

Wikimedia username:

emvincent

E-mail address:

emvincent@sciencefeedback.co

Resume:

https://github.com/science-feedback/science-feedback-main (This is the repository of the app we will use and expand on in this project)

Geographical impact:

Global. Many fact-checks are in English, French or Spanish (covering the US, UK, Canada, Australia, France, Spain, Latin America, and French- and English-speaking countries in Africa…)

Type of project:

Technology

What is your idea?

We are working on creating an open-source database of misinformation sources, starting with the topic of misinformation about COVID-19. Our idea is to use Wikidata to populate our database of information sources (outlets, newspapers, authors, journalists…) and to enrich Wikidata with the data we are collecting about whether these sources have published misinformation. This should help the Wikipedia community, as well as other web platforms, objectively identify unreliable sources.
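As a minimal sketch of the Wikidata lookup we have in mind (not the final implementation: the matching logic and the find_outlet_item helper are our own assumptions), a source’s domain could be resolved to a Wikidata item through the public SPARQL endpoint, matching on the “official website” property (P856):

```python
import requests

WIKIDATA_SPARQL = "https://query.wikidata.org/sparql"

def find_outlet_item(domain: str) -> list[dict]:
    """Return Wikidata items whose official website (P856) contains `domain`.

    Sketch only: a production version would normalise URLs (scheme, "www.",
    trailing slash) and probably use exact matching or the wbsearchentities
    API, since an unrestricted CONTAINS filter can be slow on the live endpoint.
    """
    query = f"""
    SELECT ?item ?itemLabel ?website WHERE {{
      ?item wdt:P856 ?website .
      FILTER(CONTAINS(LCASE(STR(?website)), LCASE("{domain}")))
      SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
    }}
    LIMIT 5
    """
    response = requests.get(
        WIKIDATA_SPARQL,
        params={"query": query, "format": "json"},
        headers={"User-Agent": "feedback-news-sketch/0.1 (demo)"},
    )
    response.raise_for_status()
    return response.json()["results"]["bindings"]


if __name__ == "__main__":
    for row in find_outlet_item("lemonde.fr"):
        print(row["itemLabel"]["value"], row["item"]["value"])
```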

In the feedback.news project, we are building on the database of thousands of fact-checks on the coronavirus gathered by the International Fact-Checking Network (IFCN #CoronaVirusFacts Alliance: https://www.poynter.org/ifcn-covid-19-misinformation/page/). We are researching all the websites that have published one of the claims debunked by fact-checkers, as well as all the influential social media accounts that have shared these articles, videos or social media posts.

This will be accomplished using a combination of automated methods and crowd-sourcing, following these steps:

- For each claim reviewed as False or Misleading, we perform an automated search to find all the instances repeating the false claim online (websites, social media posts, videos).
- For each post found this way, we ask two people on our platform to confirm whether the post i) is unrelated to the claim, ii) repeats and endorses the false claim, or iii) debunks it or expresses doubts about it; the third category avoids classifying such posts as promoters of false information.

We test the ability of each contributor to perform the task on a set of known responses.
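To make the review step concrete, here is a minimal sketch of the dual-review logic; the category names, the resolve_reviews helper and the idea of escalating disagreements to a further reviewer are illustrative assumptions, not the platform’s actual code:

```python
from enum import Enum
from typing import Optional

class Label(Enum):
    UNRELATED = "unrelated to the claim"
    ENDORSES = "repeats and endorses the false claim"
    CHALLENGES = "debunks or expresses doubts about the claim"

def resolve_reviews(first: Label, second: Label) -> Optional[Label]:
    """Accept a post's classification only when both reviewers agree.

    Returns None when the two reviewers disagree; in that case the post could,
    for example, be sent to an additional reviewer instead of entering the
    database directly.
    """
    return first if first == second else None

def contributor_accuracy(answers: list[Label], gold: list[Label]) -> float:
    """Share of known-answer ("gold") test items a contributor got right."""
    assert len(answers) == len(gold)
    correct = sum(a == g for a, g in zip(answers, gold))
    return correct / len(gold) if gold else 0.0
```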

The result will be an open-source database that complements the IFCN #CoronaVirusFacts Alliance database with a list of URLs and social media posts that have shared misinformation, together with their sources (domains, Facebook pages/groups, Twitter accounts, YouTube channels).
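A possible shape for each record in that database is sketched below; the field names are illustrative, not a fixed schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class MisinformationRecord:
    """One appearance of a debunked claim, as it could be stored and exported."""
    claim_id: str            # identifier of the claim in the IFCN fact-check database
    fact_check_url: str      # the fact-check article that debunked the claim
    appearance_url: str      # URL of the post, article or video repeating the claim
    platform: str            # "web", "facebook", "twitter", "youtube", ...
    source: str              # domain, Facebook page/group, Twitter account or YouTube channel
    classification: str      # outcome of the dual review, e.g. "endorses"
    wikidata_item: Optional[str] = None  # e.g. "Q12345" once the source is matched to Wikidata
```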

Finally, we would like to connect our database with Wikidata and propose a “statement” recording the number of times a publisher (outlet or author) has published misleading or false information according to fact-checks. Each Wikidata entry for which we have information could then carry a statement with a property such as “number of failed fact-checks”.
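To illustrate what such a statement could look like in Wikidata’s JSON data model, here is a sketch. A property like “number of failed fact-checks” does not exist on Wikidata today, so the P-id below is a placeholder; actually writing the statement would go through the standard Wikibase editing API (e.g. wbcreateclaim) once a property is approved:

```python
import json

# Placeholder: a property like "number of failed fact-checks" would have to be
# proposed and approved by the Wikidata community before it gets a real P-id.
FAILED_FACT_CHECKS_PROPERTY = "P0000"

def build_statement(property_id: str, count: int) -> dict:
    """Build a Wikidata-style quantity statement for a publisher's item.

    The structure follows Wikidata's JSON data model for statements; posting it
    (e.g. via the wbcreateclaim API action) would additionally require an
    authenticated session and a CSRF token, omitted here.
    """
    return {
        "mainsnak": {
            "snaktype": "value",
            "property": property_id,
            "datavalue": {
                "value": {"amount": f"+{count}", "unit": "1"},
                "type": "quantity",
            },
        },
        "type": "statement",
        "rank": "normal",
    }

if __name__ == "__main__":
    print(json.dumps(build_statement(FAILED_FACT_CHECKS_PROPERTY, 12), indent=2))
```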

Why is it important?

This is important because Wikipedia and web platforms such as search engines and social networks need objective ways to determine which sources are reliable. Fact-checkers produce a large amount of information about false claims that are often repeated across a wide spectrum of “news” sources, information that could be harvested to build a track record of documented media failures.

During this pandemic, the need for accurate information and for the ability to identify trustworthy sources is more pressing than ever, so that citizens can make informed decisions about their health and their democracies.

Is your project already in progress?

Yes, we are currently in the set-up phase of our new platform. We have explored our concept with a small dataset including a dozen fact-checks from Science Feedback and a few hundred websites and social media accounts that propagated the false claims. See this post for more information: https://sciencefeedback.co/building-an-open-source-database-of-misinformation-sources-on-covid-19/

We have now collected more than 2,000 fact-checks to expand on the dozen mentioned above, and we are preparing the search phase in collaboration with volunteers from Microsoft and the classification phase thanks to a flash grant from the Google News Initiative.

How is it relevant to credibility and Wikipedia? (max 500 words)

Collecting all the instances in which a source has failed a fact-check is one of the most objective ways of building its credibility profile. Doing so will enable Wikipedians to identify sources of misinformation and help ground discussions about source reliability in readily available data.

What is the ultimate impact of this project?

We hope this project will empower people to collaboratively build an open database of source reliability. Of course, the criteria taken into account will have to go beyond failed fact-checks to include other indicators of professional journalism practices, but fact-checks are a good starting point, in keeping with Wikipedia’s spirit of relying on secondary sources. Such a database would allow web platforms and even web browsers to take reliability indicators into account and help their users access reliable information, or inform them when a page they are visiting has been seriously challenged.

Could it scale?

Yes, we think so. The amount of data generated by fact-checkers has increased dramatically over the past few years and is expected to keep growing for the foreseeable future. Building a platform with clear practices for contributors, and generating data that is useful to partners, should motivate people who are concerned about online misinformation to contribute.

Why are you the people to do it?

We are a group of scientists, fact-checkers and web developers committed to improving the credibility of information on the Internet. We are both motivated and equipped to see this project through to completion. We have received support from the Google News Initiative via a flash grant and from Microsoft, whose employees in Paris, France, want to contribute to a project on COVID-19 misinformation during the lockdown period. We have chosen an open approach, whereas many players with similar goals are building their own closed databases, often within private companies. We believe an open approach is the right way to address a problem of this global scale.

What is the impact of your idea on diversity and inclusiveness of the Wikimedia movement?

Honestly, we do not know. But we hope this project will bring together a collective of individuals who wish to work collaboratively for the common good, and hopefully some of these contributors, be they fact-checkers, scientists or developers, will become familiar with Wikidata and provide further help in its development.

What are the challenges associated with this project and how will you overcome them?

One of the largest challenges in any crowd-sourced project is to create enough momentum for a community of contributors to emerge and contribute sustainably. To overcome this challenge, we will:

- Start by showing some applications of what people are contributing to, e.g. by creating data visualizations highlighting the networks of accounts propagating misinformation on social media;
- Use some of our budget to pay the first contributors, avoiding a “blank page” syndrome for newcomers.

Another challenge is to coordinate the work of the dozens of volunteers who have expressed interest in helping with the development of our platform. For this we have hired a Product Owner who will be tasked with organizing the work of all volunteers and making sure everybody is contributing to the overall project.

How much money are you requesting?

$8,000 - $10,000

How will you spend the money?

We will pay a Product Owner and a Data Engineer (both already identified) half-time over a two-month period. The Product Owner will manage the contributions from the back-end, front-end and data-visualization developers. The Data Engineer will help build our database infrastructure and connect the relevant APIs to our platform, including the interaction with Wikidata.

How long will your project take?

2 months

Have you worked on projects for previous grants before?

Not for Wikimedia grants.