Difference between revisions of "2019/Grants/Explicit credibility signal data on wikipedia"

From WikiConference North America
(Moved to https://wikiconference.org/wiki/2019/Grants/Wikipedia_deployment_of_credibility_signals_app)
 
Moved to https://wikiconference.org/wiki/2019/Grants/Wikipedia_deployment_of_credibility_signals_app
{{WCNA 2019 Grant Submission
 
|name=Sandro Hawke
 
|username=Sandro_Hawke
 
|email=sandro{{@}}w3.org
 
|resume=* Larger project website: https://credweb.org
 
* Our analysis: https://www.w3.org/2018/10/credibility-tech/
 
* My resume: https://hawke.org/resume-2020/
 
|geography=global
 
|type=Other
 
|idea=Let's connect wikipedians to the emerging ecosystem of credibility data. Let's draw on their expertise and diligence to create community-sourced credibility data, letting individual wikipedians express credibility signals and interact with other credibility data.
 
 
Let's give people the tools to see how this credibility data relates to their work on Wikipedia, giving them more insight into which sources are reliable. As wikipedians learn to navigate the credibility landscape, they can increasingly help others make their own decisions about source reliability.
 
 
Once this seed community has tested and refined the credibility signals process, this approach has the potential to rapidly grow to global scale.
 
 
SPECIFICS TBD.
 
|importance=For Wikipedia, this idea promises to help in the fight against misinformation, making it easier for wikipedians and the broader world to collaborate in identifying credible and non-credible sources.
 
 
For the world at large, the stakes are much higher, as this approach has the potential to turn the tide against misinformation across all technology platforms.
 
|inprogress=We are developing the relevant concepts and tools (as seen at https://credweb.org) but have not yet begun deployment in the Wikipedia community or built tooling to work with Wikipedia data feeds.
 
|relevance=There are many connections between this Credibility Signals work and Wikipedia:
 
 
* Wikipedia has always needed to be able to separate fact from fiction. While it does this very well, these tools might make the task easier. Specifically, they can rapidly highlight which sources have unacceptably low credibility and help with sorting out why particular sources are viewed as credible or not credible.
 
* Wikipedia has always needed to reduce harm done by careless and malicious users. It does this very well, but again, these tools might make the task easier, assisting in tracking and management of the reputation of users, which can be used in modifying their privileges.
 
* Because of its great expertise in these fields, the Wikipedia community is an excellent proving ground for these technologies. Flaws in the technologies that might eventually lead to failure in the broader media ecosystem are likely to be spotted very quickly by wikipedians, giving time to improve the designs before wider deployment.
 
|impact=If successful, this project will show a clear way that people can collaborate online in protecting themselves and their communities from misinformation. This method can be adopted by communities and platforms around the world to greatly reduce misinformation and other online harms.
 
|scalability=Yes, this plan is phenomenally scalable.
 
 
It is based on existing social practices, where each individual manages their own credibility assessment process (deciding what to believe), using what they can glean from their surroundings, including their social network. This process scales linearly with the number of individuals, with each individual deciding how much of their own resources to devote to each assessment they make. Adding computers and networking to this existing human process should greatly improve the efficiency and accuracy of this process, without altering this scaling behavior.
 
 
In its approach to decentralization, this design avoids any central bottleneck. Every individual and organization is free to deploy as many human and computing resources as they choose, without needing approval or support from us or anyone else. This allows the kind of scaling we see in the web and email, which are similarly decentralized, but much faster, since the underlying infrastructure is already in place. If the system provides sufficient value to users, as we expect, this approach might grow to global scale in a matter of months.
 
 
The pace of scaling may also be quite rapid because it naturally spreads over social connections and social media. While it relies on software, which is often slow to develop, the software can come from any source, reducing this risk. Because of the social connections, the person-to-person spread may resemble the spread of ideas (memes) more than the slower (but still rapid) spread of technology platforms. At this point, in April 2020, we are perhaps all too familiar with the power of things that are able to spread person-to-person, out of control.
 
|people=I bring experience and expertise in all the necessary challenge areas, including credibility signals, community development, web application development, decentralized systems, and consensus process.
 
|inclusiveness=This project has no direct connection to diversity or inclusiveness. We do not foresee any specific indirect impact. We are aware that decentralized systems have a mixed track record on these issues, with Mastodon showing some promise, while other efforts use decentralization to route around platform Trust & Safety enforcement actions. Because our decentralization technology builds on top of existing platforms, re-using their social features rather than building its own (perhaps using cryptographic techniques), we do not expect it to manifest that difficulty.
 
|challenges=This is an ambitious piece of an ambitious project. We are reducing risk by maximizing simplicity and using a progression of small prototypes and experiments.
 
 
Challenges include:
 
 
* '''Getting people to look at credibility data'''. Approach: make it visually appealing and salient. For example, see the credibility network demo at https://credweb.org/viewer/ which has elements that are compelling and fun; it becomes salient when we let people add in the sources they care about and see how others judge those sources. We can bootstrap with existing Wikipedia data feeds of likes and reverts as an initial proxy for credibility between wikipedians, and draw on existing source credibility work for data on external sources.
 
* '''Getting people to author credibility data'''. Once people are engaged with the data as consumers, we hypothesize they will be motivated to engage as producers to "correct" the data, to express what they believe or know. Additionally, a culture of contributing data to help the world, already common among wikipedians, should help. There is a range of ways to simplify or even gamify the contribution step, if necessary.
 
* '''Harmful participants'''. Since we propose to primarily and initially use credibility data that is hosted on Wikipedia user pages, to some degree the existing community safety mechanisms will still apply. We would like to demonstrate, however, that such mechanisms can be largely replaced by credibility data itself. In theory, people observed to do harm can be identified and have their actions demoted like non-credible content.
 
* '''Getting people to trust the system'''. Approach: transparency and feedback. Make it clear which individuals are the source of each bit of data, and have the interface promote a virtuous cycle of improving the data and improving one's own credibility. This is similar to wikipedia's own mechanisms for being trustworthy (to people who know how it works).
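
As a rough illustration of the bootstrapping idea in the first challenge above, here is a minimal sketch of how likes and reverts between wikipedians might be turned into an initial proxy score. Everything in it is an assumption made for illustration: the event format, the function name <code>proxy_credibility</code>, and the scoring formula are hypothetical and are not part of the Credibility Signals work or of any Wikipedia data feed.

```python
from collections import defaultdict


def proxy_credibility(events):
    """Turn (actor, target, kind) events into a proxy score per (actor, target) pair.

    Hypothetical scoring: (likes - reverts) / (likes + reverts), so the score
    ranges from -1.0 (only reverts) to 1.0 (only likes). Events whose kind is
    neither "like" nor "revert" are ignored.
    """
    counts = defaultdict(lambda: {"like": 0, "revert": 0})
    for actor, target, kind in events:
        if kind in ("like", "revert"):
            counts[(actor, target)][kind] += 1

    scores = {}
    for pair, c in counts.items():
        total = c["like"] + c["revert"]  # always >= 1 for recorded pairs
        scores[pair] = (c["like"] - c["revert"]) / total
    return scores


# Made-up example events; the user names are placeholders.
example_events = [
    ("UserA", "UserB", "like"),
    ("UserA", "UserB", "like"),
    ("UserA", "UserB", "revert"),
    ("UserC", "UserB", "revert"),
]
example_scores = proxy_credibility(example_events)
```

Keeping the score attached to the pair (who judged whom) rather than collapsing it into a single global number fits the transparency goal above, since each bit of data stays traceable to the individual it came from.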
 
 
|cost=10k USD for the Wikipedia aspects (outlined here) of the Credibility Signals work
 
|expenses=To support my time on this work
 
|time=2-8 months, with a release to the community within two months of the grant, and then ongoing improvements for up to 6 months, depending on how quickly the approach is adopted.
 
|previous=Yes, my work has been primarily grant funded for many years. Some highlights with web pages maintained by others:
 
* 2018 Google (see "W3C") https://www.blog.google/outreach-initiatives/google-news-initiative/elevating-quality-journalism/
 
* 2013 Knight Foundation https://knightfoundation.org/articles/introducing-crosscloud-project-get-your-data-out-silos/
 
* 2012 NSF https://www.nsf.gov/awardsearch/showAward?AWD_ID=1313789
 
* 2005 DARPA http://xml.coverpages.org/ni2005-02-21-a.html
 
}}
 

Latest revision as of 05:09, 5 April 2020