2019/Grants/Explicit credibility signal data on wikipedia
Revision as of 03:35, 5 April 2020


Title:

Explicit credibility signal data on wikipedia

Name:

Sandro Hawke

Wikimedia username:

Sandro_Hawke

E-mail address:

sandro@w3.org

Resume:

Geographical impact:

global

Type of project:

Technology

What is your idea?

See https://docs.google.com/document/d/1kdwuzWqnh3-As3Uyiyoo2Uk7AnzmXmVTZARBOwpY4gY/edit

Why is it important?

For Wikipedia, this idea promises to help in the fight against misinformation, making it easier for wikipedians and the broader world to collaborate in identifying credible and non-credible sources.

For the world at large, the stakes are much higher, as this approach has the potential to turn the tide against misinformation across all technology platforms.

Is your project already in progress?

We are developing the relevant concepts and tools (as seen at https://credweb.org) but have not begun deployment in the wikipedia community or tooling to work with wikipedia data feeds.

How is it relevant to credibility and Wikipedia? (max 500 words)

There are many connections between this Credibility Signals work and Wikipedia:

  • Wikipedia has always needed to be able to separate fact from fiction. While it does this very well, these tools might make the task easier. Specifically, they can rapidly highlight which sources have unacceptably low credibility and help with sorting out why particular sources are viewed as credible or not credible.
  • Wikipedia has always needed to reduce harm done by careless and malicious users. It does this very well, but again, these tools might make the task easier, assisting in tracking and management of the reputation of users, which can be used in modifying their privileges.
  • Because of its great expertise in these fields, the Wikipedia community is an excellent proving ground for these technologies. Flaws in the technologies that might eventually lead to failure in the broader media ecosystem are likely to be spotted very quickly by wikipedians, giving time to improve the designs before wider deployment.

What is the ultimate impact of this project?

If successful, this project will show a clear way that people can collaborate online in protecting themselves and their communities from misinformation. This method can be adopted by communities and platforms around the world to greatly reduce misinformation and other online harms.

Could it scale?

Yes, this plan is highly scalable. If it becomes fully established as a decentralized ecosystem, as designed, it will operate and grow with no further effort or support from us or Wikimedia.

It is based on existing social practices, where each individual manages their own credibility assessment process (deciding what to believe), using what they can glean from their surroundings, including their social network. This process scales linearly with the number of individuals, with each individual deciding how much of their own resources to devote to each assessment they make. Adding computers and networking to this existing human process should greatly improve the efficiency and accuracy of this process, without altering this scaling behavior.

In its approach to decentralization, this design avoids any central bottleneck. Every individual and organization is free to deploy as much human and computing resources as they choose, without needing approval or support from us or anyone else. This allows the kind of scaling to billions of users that we see in the web and email, which are similarly decentralized. If the system provides sufficient value to users, as we expect, this approach might grow to global scale in a matter of months.
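To make the per-individual scaling concrete, here is a minimal Python sketch of one way such decentralized assessment could work. All names, data shapes, and the weighting scheme are illustrative assumptions, not part of the proposal: each person combines credibility ratings only from raters they personally trust, so there is no central bottleneck and each individual's work is independent of the total number of participants.

```python
# Hypothetical sketch: each individual estimates the credibility of a
# source using only raters they personally trust, weighted by how much
# they trust each rater. No central authority or shared score exists.

def personal_credibility(source, my_trust, ratings):
    """Return this individual's credibility estimate for `source`.

    my_trust: dict mapping rater -> trust weight (0.0 to 1.0)
    ratings:  dict mapping rater -> {source: rating in [-1.0, 1.0]}
    Returns a weighted average, or None if no trusted rater has rated it.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for rater, weight in my_trust.items():
        rating = ratings.get(rater, {}).get(source)
        if rating is not None:
            weighted_sum += weight * rating
            total_weight += weight
    if total_weight == 0.0:
        return None  # no signal available from this individual's network
    return weighted_sum / total_weight

# Example: Alice trusts Bob fully and Carol half as much.
ratings = {
    "Bob": {"example.org": 0.8},
    "Carol": {"example.org": -0.2},
}
alice_trust = {"Bob": 1.0, "Carol": 0.5}
print(personal_credibility("example.org", alice_trust, ratings))
```

Because each call uses only one individual's trust list, adding more participants adds no work for anyone else, which is the linear scaling behavior described above.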

Why are you the people to do it?

This funding request is to help support my time in leading and organizing this project and doing elements of the work for which I am unable to find volunteers or other funding. I bring experience and expertise in all the necessary challenge areas, including credibility signals, community development, web application development, decentralized systems, and consensus process.

What is the impact of your idea on diversity and inclusiveness of the Wikimedia movement?

This project has no direct connection to diversity or inclusiveness. We are committed to addressing any indirect impacts that might arise.

What are the challenges associated with this project and how will you overcome them?

In general, we are reducing risk in this ambitious project by minimizing complexity and using a progression of small prototypes and experiments.

Challenges include:

  • Getting people to look at credibility data. Approach: make it salient and visually appealing. For example, see credibility network demo at https://credweb.org/viewer/ which has elements that are compelling and fun; it becomes salient when we let people add in the sources they care about and get to see how others judge those sources. We can bootstrap with existing wikipedia data feeds of likes and reverts as an initial proxy for credibility between wikipedians and draw on existing source credibility work for data on external sources.
  • Getting people to author credibility data. Once people are engaged in the data as a consumer, we hypothesize they will be motivated to engage as a producer to "correct" the data, to express what they believe or know. Additionally, a culture of contributing data to help the world, already common among wikipedians, should help. There are a range of ways to simplify or even gamify the contribution step, if necessary.
  • Harmful participants. Since we propose to primarily and initially use credibility data hosted on wikipedia user pages, to some degree the existing community safety mechanisms will still apply. We would like to demonstrate, however, that such mechanisms can be largely replaced by credibility data itself. In theory, people observed to do harm can be identified and have their actions demoted like non-credible content.
  • Getting people to trust the system. Approach: transparency and feedback. Make it clear which individuals are the source of each bit of data, with clear provenance and change tracking. Have the interface promote a virtuous cycle of improving the data and improving one's own credibility. This is similar to wikipedia's own mechanisms for being trustworthy (to people who know how it works).
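The transparency approach above depends on every piece of credibility data carrying clear provenance: who asserted it, about what, and when. As a sketch only, with every field name hypothetical rather than a proposed standard, a single signal record published on a contributor's user page might look like this:

```python
# Hypothetical sketch of one credibility-signal record as it might be
# published on a contributor's user page. Field names are illustrative;
# the point is that every assertion carries explicit provenance
# (who said it, about what source, which signal, and when).

import json
from datetime import datetime, timezone

def make_signal(author, subject, signal, value):
    """Build a provenance-carrying credibility signal record."""
    return {
        "author": author,          # who is making the claim
        "subject": subject,        # the source being assessed
        "signal": signal,          # which credibility signal is asserted
        "value": value,            # the author's assessment of that signal
        "asserted_at": datetime.now(timezone.utc).isoformat(),
    }

record = make_signal(
    author="User:Sandro_Hawke",
    subject="https://example.org",
    signal="accuracy",
    value=0.9,
)
print(json.dumps(record, indent=2))
```

Keeping the author and timestamp inside the record itself is what would let interfaces show provenance and track changes, supporting the virtuous cycle described above.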

How much money are you requesting?

10k USD for the Wikipedia aspects (outlined here) of the Credibility Signals work

How will you spend the money?

To support my time on this work

How long will your project take?

Up to 12 months, in three phases:

  • Phase 1 - up to four months - refine deployment plan, identify partners, settle issues within credweb CG
  • Phase 2 - about 2 months - active development of tools; release
  • Phase 3 - up to six months - revise and improve, based on user experience

Have you worked on projects for previous grants before?

Yes, my work has been primarily grant funded for many years. Some highlights with web pages maintained by others: