Submissions:2019/Making sites citation-friendly for Wikimedia - discussion and recommendations

From WikiConference North America
This session is part of the WikiCite track.

This submission has been accepted for WikiConference North America 2019.


Making sites citation-friendly for Wikimedia - discussion and recommendations


Reliability of Information
+ Tech & Tools

Type of session:



This is being proposed primarily as a Monday CredCon topic.

One of the best developments in the Wikimedia ecosystem has been Citoid, a software module that automatically creates a well-formed citation template from a URL pointing to an online source. In Wikipedia's VisualEditor, a user can paste in a URL for a news, web or book site, and Citoid extracts the relevant parameters: title, author, publication, publication date and so on. If all goes well, a fully formed citation is inserted into the Wikipedia article and appears in its "References" section, with a superscript number placed in the prose. In practice, however, success rates vary widely. Each web site uses different metadata schemas and different ways of expressing these fields. What if a work has multiple authors? What if the site is a legacy system that doesn't use new metadata standards, such as

At WikiCite 2017, Rob Fernandez and Andrew Lih showed that, for the top 90 most cited news sources, the headline and publication name were extracted successfully more than 90% of the time. However, the date was extracted correctly only 60% of the time, and the authors only 35% of the time. All four fields that make up a well-formed citation were extracted correctly for only 32% of the news sites tested. There is tremendous room for improvement.

This session aims to examine and discuss how well Citoid performs on a range of popular news web sites, and how we might work with news organizations and publishers to improve performance and compatibility with Citoid. The Citoid system depends on the open source citation manager Zotero and its "translator" framework. How well do publishers' sites work with the Zotero translators? Do new translators need to be written to handle particular sites, or can news organizations supply additional metadata to make their work more compatible?
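As a starting point for discussion, the kind of check a publisher could run on its own pages can be sketched in a few lines of Python. This is not how Citoid or the Zotero translators work internally. It is a minimal illustration that assumes a page exposes its citation fields through commonly used `<meta>` tags (Open Graph properties such as `og:title`, and Highwire-style names such as `citation_author`). The `FIELD_KEYS` mapping and the function names are hypothetical and chosen only for this example.

```python
from html.parser import HTMLParser

# Hypothetical mapping (an assumption for this sketch): for each of the four
# citation fields, the <meta> keys to look for, in priority order.
FIELD_KEYS = {
    "title": ["citation_title", "og:title"],
    "author": ["citation_author", "author", "article:author"],
    "publication": ["citation_journal_title", "og:site_name"],
    "date": ["citation_publication_date", "article:published_time"],
}


class MetaCollector(HTMLParser):
    """Collect the content of <meta name=...> and <meta property=...> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}  # key -> list of content values, in document order

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        key = attrs.get("name") or attrs.get("property")
        content = attrs.get("content")
        if key and content:
            self.meta.setdefault(key.lower(), []).append(content.strip())


def extract_citation_fields(html):
    """Return the citation fields that could be found in the page metadata."""
    collector = MetaCollector()
    collector.feed(html)
    fields = {}
    for field, keys in FIELD_KEYS.items():
        for key in keys:
            values = collector.meta.get(key)
            if values:
                # A work may have several authors, so keep every value.
                fields[field] = values if field == "author" else values[0]
                break
    return fields


def missing_fields(fields):
    """List the citation fields for which no metadata was found."""
    return [name for name in FIELD_KEYS if name not in fields]


# Made-up sample page: two authors, but no publication date.
SAMPLE = """
<html><head>
<meta property="og:title" content="Example headline">
<meta property="og:site_name" content="Example Times">
<meta name="author" content="A. Writer">
<meta name="author" content="B. Reporter">
</head><body></body></html>
"""

if __name__ == "__main__":
    found = extract_citation_fields(SAMPLE)
    print(found)
    print("missing:", missing_fields(found))
```

On the sample page, both authors are kept and the date is reported as missing. A site could use a report like this to see which fields it fails to expose in machine-readable form.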

We hope to bring together a range of participants from the major platforms, from publishers and from the Wikimedia Foundation team that develops and supports Citoid, to produce a list of best practices that can be delivered.

Previous work:

Academic Peer Review option:


Author name:

Andrew Lih

E-mail address:

Wikimedia username:


Affiliated organization(s):

Wikimedia DC

Estimated time:

45 minutes

Preferred room size:


Special requests:

Have you presented on this topic previously? If yes, where/when?:

WikiCite 2017, but not with news organizations

If your submission is not accepted, would you be open to presenting your topic in another part of the program? (e.g. lightning talk or unconference session)

Yes, unconference or other