Difference between revisions of "Submissions:2019/Making sites citation-friendly for Wikimedia - discussion and recommendations"

From WikiConference North America

Revision as of 05:22, 7 October 2019

This submission has been accepted for WikiConference North America 2019.


Making sites citation-friendly for Wikimedia - discussion and recommendations


Reliability of Information
+ Tech & Tools

Type of session:



This is being proposed primarily as a Monday CredCon topic.

One of the best developments in the Wikimedia ecosystem has been Citoid, a software module that automatically creates a well-formed citation template from an online source, given only its URL. In Wikipedia's VisualEditor, a user can paste in a URL pointing to a news, web or book site, and Citoid extracts the relevant parameters: title, author, publication, publication date, and so on. If all goes well, a fully formed citation is inserted into the Wikipedia article and appears in the "References" section, with a superscript number inserted into the prose. In reality, however, success rates vary widely. Each web site uses different metadata schemas and different ways of expressing these fields. What if a work has multiple authors? What if the site is a legacy system that doesn't use newer metadata standards such as schema.org?
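To make the extraction step concrete, here is a minimal Python sketch of how the four citation fields could be read from schema.org JSON-LD metadata and turned into a {{cite news}} template. It is an illustration only, not how Citoid or the Zotero translators are actually implemented. The sample page, the helper names (JsonLdCollector, extract_citation_fields, format_cite_news) and the example.org URL are all hypothetical, and the sketch assumes the page exposes a single NewsArticle object.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def extract_citation_fields(html):
    """Return title, authors, publication and date from the first NewsArticle, or None."""
    collector = JsonLdCollector()
    collector.feed(html)
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed metadata: one reason extraction can fail
        if not isinstance(data, dict) or data.get("@type") != "NewsArticle":
            continue
        authors = data.get("author", [])
        if isinstance(authors, dict):  # a single author may be given as one object
            authors = [authors]
        publisher = data.get("publisher")
        return {
            "title": data.get("headline"),
            "authors": [a.get("name") for a in authors if isinstance(a, dict)],
            "publication": publisher.get("name") if isinstance(publisher, dict) else None,
            "date": data.get("datePublished"),
        }
    return None


def format_cite_news(fields, url):
    """Build a {{cite news}} template string from the extracted fields."""
    parts = ["cite news", f"title={fields['title']}", f"url={url}"]
    for i, name in enumerate(fields["authors"], start=1):
        parts.append(f"author{i}={name}")
    if fields["publication"]:
        parts.append(f"work={fields['publication']}")
    if fields["date"]:
        parts.append(f"date={fields['date']}")
    return "{{" + " |".join(parts) + "}}"


# Hypothetical page with two authors, expressed as schema.org JSON-LD.
SAMPLE_HTML = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "NewsArticle",
 "headline": "Example headline",
 "datePublished": "2019-10-07",
 "author": [{"@type": "Person", "name": "Jane Doe"},
            {"@type": "Person", "name": "John Roe"}],
 "publisher": {"@type": "Organization", "name": "Example Times"}}
</script></head><body></body></html>"""

if __name__ == "__main__":
    fields = extract_citation_fields(SAMPLE_HTML)
    print(format_cite_news(fields, "https://example.org/a"))
```

A page that omits the JSON-LD block, or expresses authors in some other way, would yield None or an empty author list here, which is the kind of failure the questions above are getting at.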

At WikiCite 2017, Rob Fernandez and Andrew Lih showed that, for the top 90 most cited news sources, the headline and publication name were extracted successfully more than 90% of the time. However, the date was extracted correctly only 60% of the time, and the authors only 35% of the time. All four fields that make up a well-formed citation were extracted correctly for only 32% of the news sites tested. There is tremendous room for improvement.
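As a sketch of how figures of this kind can be tallied, the Python snippet below computes a per-field success rate and the share of sites for which all four fields were correct. The site names and true/false values are invented toy data for illustration, not the WikiCite 2017 results, and the function name field_success_rates is hypothetical.

```python
# The four fields treated as making up a well-formed citation.
FIELDS = ("title", "publication", "date", "authors")


def field_success_rates(results):
    """results maps site -> {field: True if extracted correctly, else False}."""
    total = len(results)
    rates = {
        field: sum(1 for r in results.values() if r[field]) / total
        for field in FIELDS
    }
    # A site counts toward "all_four" only if every field was correct.
    rates["all_four"] = (
        sum(1 for r in results.values() if all(r[f] for f in FIELDS)) / total
    )
    return rates


# Toy data (hypothetical sites, invented outcomes).
TOY_RESULTS = {
    "site-a.example": {"title": True, "publication": True, "date": True, "authors": True},
    "site-b.example": {"title": True, "publication": True, "date": True, "authors": False},
    "site-c.example": {"title": True, "publication": True, "date": False, "authors": False},
    "site-d.example": {"title": True, "publication": False, "date": False, "authors": True},
}

if __name__ == "__main__":
    for name, rate in field_success_rates(TOY_RESULTS).items():
        print(f"{name}: {rate:.0%}")
```

Because a site must get every field right to count, the all-four rate can never exceed the weakest per-field rate and is often lower, since different sites fail on different fields.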

This session aims to examine and discuss how well Citoid performs on a range of popular news web sites, and how we might work with news organizations and publishers to improve performance and compatibility with Citoid. The Citoid system depends on the open source citation manager Zotero and its "translator" framework. How well do publishers' sites work with the Zotero translators? Do new translators need to be written to handle particular sites, or can news organizations supply additional metadata to make their work more compatible?

We hope to get a range of participants from the major platforms, from publishers and from the Wikimedia Foundation team that develops and supports Citoid, to come up with a list of best practices that can be delivered.

Previous work: https://docs.google.com/presentation/d/1HI1FMZQ1Wkg15d1ny8Qynfq4JkvPA_C46r7cOVm2xDM/edit

Academic Peer Review option:


Author name:

Andrew Lih

E-mail address:


Wikimedia username:


Affiliated organization(s):

Wikimedia DC

Estimated time:

45 minutes

Preferred room size:


Special requests:

Have you presented on this topic previously? If yes, where/when?:

WikiCite 2017, but not with news organizations

If your submission is not accepted, would you be open to presenting your topic in another part of the program? (e.g. lightning talk or unconference session)

Yes, unconference or other