Linking records of early aeronautics and aviation across data sets
Wikis for academic research; history of technology; application of the Cargo extension

Peter B. Meyer
Wikimedia DC

This wiki holds tables of data together with rich wikitext, combining records that span an extended historical period. The data cover patents, inventors, clubs, firms, exhibitions, and conferences related to aeronautics and aviation worldwide from 1800 to 1916, the period in which the idea of flying machines moved from a dream and a hobby to a new science and a startup industry. The data are on the wiki, where any number of authorized users can edit them; their edits are tracked, and new changes are immediately visible to the others. The site uses MediaWiki with the Cargo extension, which provides semantic functions for relating records to one another.

The wiki has 14,000 records of patents, and for most patents we have enough information about the inventors or applicants to link to a wiki page about each individual. These pages also link to pages and data records about aeronautical clubs, aviation firms, exhibitions and conferences, and letters between aeronautical enthusiasts of the period. At least 1,000 individuals appear in these data sets or are otherwise thought to be significant to the invention and early development of airplanes, and we are steadily adding data about them. We are also uploading data about aeronautical publications of the time from existing published bibliographies. Our research links these records from multiple sources to construct individual and organizational histories, and will eventually yield networks of co-authorships and partnerships. Automatic reports can list an inventor’s patents on the inventor's page, or similarly list the patents in a technology classification on the page describing that classification.
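The automatic reports described above amount to queries over linked tables. The following is a minimal sketch of that idea in Python with SQLite, not the wiki's actual Cargo configuration: the table names, column names, and sample rows are all hypothetical placeholders invented for illustration.

```python
import sqlite3

# Hypothetical schema: one table per record type, linked by wiki page name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventors (
    page      TEXT PRIMARY KEY,   -- wiki page title for the person
    full_name TEXT
);
CREATE TABLE patents (
    patent_id      TEXT PRIMARY KEY,
    title          TEXT,
    year           INTEGER,
    inventor_page  TEXT REFERENCES inventors(page),
    classification TEXT
);
""")

# Placeholder rows only; these are not records from the wiki.
conn.execute("INSERT INTO inventors VALUES ('Inventor A', 'A. Example')")
conn.executemany(
    "INSERT INTO patents VALUES (?, ?, ?, ?, ?)",
    [
        ("X-001", "Placeholder flying machine", 1895, "Inventor A", "Wings"),
        ("X-002", "Placeholder balloon valve", 1899, None, "Balloons"),
        ("X-003", "Placeholder propeller", 1903, "Inventor A", "Propulsion"),
    ],
)

def patents_by_inventor(conn, inventor_page):
    """Report for an inventor page: that inventor's patents, oldest first."""
    return conn.execute(
        "SELECT patent_id, title, year FROM patents "
        "WHERE inventor_page = ? ORDER BY year, patent_id",
        (inventor_page,),
    ).fetchall()

def patents_in_classification(conn, classification):
    """Report for a classification page: the patents filed under it."""
    return conn.execute(
        "SELECT patent_id, title, year FROM patents "
        "WHERE classification = ? ORDER BY year, patent_id",
        (classification,),
    ).fetchall()

for row in patents_by_inventor(conn, "Inventor A"):
    print(row)
```

On the wiki itself, a report of this kind would be written as a Cargo query embedded in the page rather than as external code; the sketch only shows the shape of the lookup.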

The wiki adds a layer between the sources, which are numerous but not complete or entirely reliable, and the statistical conclusions. In that layer, the wiki pages, we can document decisions we made (e.g. inferences about who was referred to by a last name), and note data errors in the sources and how we corrected them. All this can be reviewed and audited on the wiki.

One challenge is that we do not have a unified, definitive source of purely biographical information, such as birth dates or full names. In time, specialists who are not part of the initial project can improve the underlying data and the matching through crowdsourcing. To help get there, we are starting to link to Wikidata, and in the long run this wiki can be a convenient source for Wikidata, particularly for data on patents and inventors.

During the period, patent category systems were changing, and the source records have problems of ambiguity and uncertainty: the historian cannot always ascertain whether two persons or documents are the same, or which one is referred to in a particular primary source or secondary historical work. The design of the database and its interface is intended to preserve a careful record of historical facts and conclusions, and to support human decisions about each record or document, while enabling the creation of statistical measures from the data as they evolve and grow.

Such systems are auditable in the sense that each past edit to a record is remembered, and it is straightforward to check who made an edit and when, and to compare the versions before and after the change. In a wiki, the records are easily hyperlinked to one another, and the properties of these links are also usable data. For example, a link might run from a record about a patent to an entry in one of several category systems that classify what the patent is about. Each item can be classified in a number of ways, and its network relations to other items can be recorded and developed with hyperlinks. This helps ground historical conclusions in evidence.
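The audit trail described above can be illustrated with a small sketch. It assumes a hypothetical `Revision` record holding an editor name, a timestamp, and the page text, and uses Python's standard `difflib` to compare two versions. MediaWiki keeps its own page history, so this does not reproduce the wiki's implementation; the editors and field values are made up.

```python
import difflib
from dataclasses import dataclass

@dataclass(frozen=True)
class Revision:
    """One saved version of a record (hypothetical structure)."""
    editor: str
    timestamp: str  # ISO 8601 string
    text: str

def diff_revisions(old, new):
    """Return unified-diff lines showing what changed between two revisions."""
    return list(
        difflib.unified_diff(
            old.text.splitlines(),
            new.text.splitlines(),
            fromfile=f"{old.editor} @ {old.timestamp}",
            tofile=f"{new.editor} @ {new.timestamp}",
            lineterm="",
        )
    )

# Placeholder history for one record: who edited, when, and the text saved.
history = [
    Revision("EditorA", "2018-01-01T10:00:00Z",
             "inventor = J. Example\nyear = 1895"),
    Revision("EditorB", "2018-02-15T09:30:00Z",
             "inventor = Jane Example\nyear = 1895"),
]

changes = diff_revisions(history[0], history[1])
for line in changes:
    print(line)
```

Lines prefixed with `-` come from the earlier version and lines prefixed with `+` from the later one, which is the before-and-after comparison an auditor would inspect.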

This work does not represent the formal work of any organization, only that of the author, but it draws on many sources, including more official presentations.

  • 15 mins
  • must have data projector and will use Web briefly if the option exists
  • 10-15
  • academic conferences: Social Science History Conference 2016; European Social Science History Conference 2018; World Economic History Congress 2018
  • Yes, I can do that; the work is mainly in the public domain anyway so I may need advice on that overlap
