Submissions:2019/Powerful spreadsheet tools for Wikidata and Wikipedia

From WikiConference North America

This submission has been accepted for WikiConference North America 2019.


Powerful spreadsheet tools for Wikidata and Wikipedia


Tech & Tools

Type of session:



(Ideal for Friday hackathon training, an unconference session, or a lightning talk)

Google Sheets contains powerful capabilities that can be of tremendous help to Wikimedians working with sets of items and articles. Because Google Sheets lives in the cloud, it is network-aware, programmable, and collaborative, which makes it possible to automate many tasks, such as querying Wikidata, evaluating articles, and generating statistics, all without needing to learn to code.

This session will step through examples of using the key "Wikipedia and Wikidata tools" module in Google Sheets, along with some custom scripts created by the author to work with specific data sets. Applications where this has proved useful, and which will be demoed, include:

  • Planning edit-a-thons by doing some forensic research into topical coverage and the overall quality of existing articles for a given domain. Given a list of articles of interest, the ORES scores can be generated and the overall quality of articles can be shown to a GLAM partner, for example, before the edit-a-thon starts. Reports can be generated, or cells can be conditionally color-coded to draw attention to certain needs.
  • Evaluating biographies and the existence of photos. This has been used by Wikimedia DC and others to help plan photography meetups at book festivals and other public events, so that we know which portrait headshots are most needed. In this way, we can prioritize subjects whose photos are missing entirely or need replacing because they are low quality.
  • Examining the quality and completeness of Wikidata items and generating QuickStatements commands to fix problems. Attendees will see how applying logical (conditional) statements can improve the performance of Google Sheets and reduce the number of API calls made to MediaWiki.
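
The idea behind the last application can be sketched outside a spreadsheet. The following is a minimal illustration in Python, not the author's scripts and not the functions of the Google Sheets module. The function names `guarded_lookup` and `quickstatements`, the stand-in `fake_fetch`, and the sample values are all hypothetical. It shows a conditional guard that skips blank and repeated cells before any lookup is made, followed by tab-separated QuickStatements (V1 syntax) lines produced only for items that are missing a value.

```python
from typing import Callable, Dict, Iterable, List, Optional


def guarded_lookup(
    qids: Iterable[str],
    fetch: Callable[[str], Optional[str]],
) -> Dict[str, Optional[str]]:
    """Call `fetch` once per distinct, non-blank item ID.

    This mirrors wrapping a spreadsheet lookup in a conditional test so
    that blank or repeated cells never trigger a network request.
    """
    cache: Dict[str, Optional[str]] = {}
    for raw in qids:
        qid = raw.strip()
        if not qid or qid in cache:  # the guard: skip blanks and repeats
            continue
        cache[qid] = fetch(qid)
    return cache


def quickstatements(
    current: Dict[str, Optional[str]],
    proposed: Dict[str, str],
    prop: str,
) -> List[str]:
    """Build QuickStatements V1 lines (item, property, value, tab-separated)
    only for items whose current value is missing."""
    lines = []
    for qid, value in proposed.items():
        if current.get(qid) is None:
            lines.append(f"{qid}\t{prop}\t{value}")
    return lines


# Illustrative run with a fake fetcher standing in for a live API call.
calls: List[str] = []
known = {"Q42": "Q5"}  # made-up local data, not fetched from Wikidata


def fake_fetch(qid: str) -> Optional[str]:
    calls.append(qid)
    return known.get(qid)


current = guarded_lookup(["Q42", "", "Q4115189", "Q42"], fake_fetch)
print(f"lookups made: {len(calls)} for 4 cells")
for line in quickstatements(current, {"Q42": "Q5", "Q4115189": "Q5"}, "P31"):
    print(line)
```

In a real sheet the same effect would come from conditional formulas rather than Python. In this sketch the guard sits in one place, so the saving in lookups is easy to count.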

At the end of the session, attendees will have a better understanding of how spreadsheets can be a powerful tool for working with sets of articles or items.

Academic Peer Review option:


Author name:

Andrew Lih

E-mail address:

Wikimedia username:


Affiliated organization(s):

Wikimedia DC

Estimated time:

45 minutes

Preferred room size:


Special requests:

Have you presented on this topic previously? If yes, where/when?:

Lightning talk at Wikimania 2017

If your submission is not accepted, would you be open to presenting your topic in another part of the program? (e.g. lightning talk or unconference session)

Yes, unconference or lightning