Submissions:2018/Using a wiki to gather and link records of aeronautics and early aviation

From WikiConference North America

This submission has been noted and is pending review for WikiConference North America 2018.



Title
Linking records of early aeronautics and aviation across data sets
Theme (optional)

Wikis for academic research; history of technology; application of the Cargo extension

Academic Peer Review option

No

Type of submission

Presentation

Author
Peter B. Meyer
E-mail address
econterms@gmail.com
Wikimedia username
econterms
Affiliation(s) (optional)
Wikimedia DC
Abstract


This wiki combines tables of data with rich wikitext to hold records covering an extended historical period. The data cover patents, inventors, clubs, firms, exhibitions, and conferences related to aeronautics and aviation globally from 1800 to 1916. During this period, the idea of flying machines transitioned from a dream and a hobby into a new science and a startup industry. The data are on the wiki at http://aero.referata.com, where any number of authorized users can edit them, their edits are tracked, and new changes are immediately visible to the others. The site uses MediaWiki with the Cargo extension, which provides semantic functions to relate records to one another.

The wiki has 14,000 patent records, and for most patents we have enough information about the inventors or applicants to link to a wiki page about these individuals. These pages also link to pages and data records about aeronautical clubs, aviation firms, exhibitions and conferences, and letters between aeronautical enthusiasts during this time period. At least 1,000 individuals appear in these data sets or are otherwise thought to be significant to the invention and early development of airplanes. We increasingly have data about the individuals themselves. We are uploading data about aeronautical publications of the time from existing published bibliographies. Our research links these records from multiple sources to construct individual and organizational histories and will eventually yield networks of co-authorships and partnerships. Automatic reports can list an inventor’s patents on the inventor page, or similarly list patents in a technology classification on the page describing that classification.
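As a rough illustration of how such automatic reports work, the sketch below joins a patent table to an inventor-link table and lists the patents for one inventor page or one classification. It uses Python's built-in sqlite3 module rather than Cargo itself. The table names, column names, and sample rows are hypothetical placeholders, not the schema or data of the actual wiki.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical placeholder records."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE patents (
            patent_id TEXT PRIMARY KEY,
            title TEXT,
            year INTEGER,
            classification TEXT
        );
        -- One row per (patent, inventor) pair, so co-inventors are supported.
        CREATE TABLE patent_inventors (
            patent_id TEXT,
            inventor_page TEXT
        );
        """
    )
    # Placeholder rows for illustration only; these are not real patents.
    conn.executemany(
        "INSERT INTO patents VALUES (?, ?, ?, ?)",
        [
            ("EX-001", "Placeholder patent one", 1890, "Class A"),
            ("EX-002", "Placeholder patent two", 1905, "Class B"),
            ("EX-003", "Placeholder patent three", 1909, "Class A"),
        ],
    )
    conn.executemany(
        "INSERT INTO patent_inventors VALUES (?, ?)",
        [
            ("EX-001", "Inventor A"),
            ("EX-002", "Inventor A"),
            ("EX-003", "Inventor A"),
            ("EX-003", "Inventor B"),
        ],
    )
    return conn


def patents_for_inventor(conn, inventor_page):
    """Return (patent_id, title, year) rows for one inventor page, oldest first."""
    return conn.execute(
        """
        SELECT p.patent_id, p.title, p.year
        FROM patents AS p
        JOIN patent_inventors AS pi ON pi.patent_id = p.patent_id
        WHERE pi.inventor_page = ?
        ORDER BY p.year, p.patent_id
        """,
        (inventor_page,),
    ).fetchall()


def patents_in_classification(conn, classification):
    """Return the patent ids filed under one classification, oldest first."""
    rows = conn.execute(
        "SELECT patent_id FROM patents WHERE classification = ? "
        "ORDER BY year, patent_id",
        (classification,),
    ).fetchall()
    return [row[0] for row in rows]
```

On the wiki, a report of this kind would be embedded in the inventor or classification page, so it updates whenever the underlying records change.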

The wiki adds a layer between the sources, which are numerous but not complete or entirely reliable, and the statistical conclusions. In that layer, the wiki pages, we can document decisions we made (e.g. inferences about who was referred to by a last name), and note data errors in the sources and how we corrected them. All this can be reviewed and audited on the wiki.

One challenge is that we do not have a unified, definitive source of purely biographical information, such as birth date or full name. In time, specialists who are not part of the initial project can improve the underlying data and the matching through crowdsourcing. To help get there, we are starting to link to Wikidata, and in the long run this wiki can be a convenient source for Wikidata, particularly for data on patents and inventors.

During this period, patent category systems were changing, and the source records have problems of ambiguity and uncertainty: the historian cannot always ascertain whether two persons or documents are the same, or which one is referred to in a particular primary source or secondary historical work. The design of the database and its interface is intended to enable careful recording of historical facts and conclusions, and to support the use of human decisions about each record or document, while enabling the creation of statistical measures from the data as they evolve and grow.
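One way to picture this combination of automated matching and human judgment is the minimal sketch below. A source mentions only a surname, the code proposes every person page that shares it, and nothing is linked until an editor records a choice and a rationale. The class, function names, and sample names are hypothetical and are not taken from the wiki.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LinkDecision:
    """A proposed link from a name in a source to a person page."""
    source_ref: str
    name_as_written: str
    candidates: List[str]
    chosen: Optional[str] = None  # stays None until a human decides
    rationale: str = ""
    decided_by: str = ""


def surname(name: str) -> str:
    """Extract a lowercase surname from 'Surname, Given' or 'Given Surname'."""
    name = name.strip()
    if not name:
        return ""
    if "," in name:
        return name.split(",")[0].strip().lower()
    return name.split()[-1].lower()


def propose_link(source_ref: str, name_as_written: str,
                 person_pages: List[str]) -> LinkDecision:
    """List every person page sharing the surname; never auto-select one."""
    key = surname(name_as_written)
    candidates = sorted(p for p in person_pages if key and surname(p) == key)
    return LinkDecision(source_ref, name_as_written, candidates)


def decide(decision: LinkDecision, chosen: str,
           rationale: str, decided_by: str) -> None:
    """Record a human choice and its reasoning so that it can be audited."""
    if chosen not in decision.candidates:
        raise ValueError(f"{chosen!r} is not among the proposed candidates")
    decision.chosen = chosen
    decision.rationale = rationale
    decision.decided_by = decided_by
```

Keeping the rationale next to the link means a later reader can see why an ambiguous reference was resolved the way it was, and can revise it if new evidence appears.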

Such systems are auditable in the sense that each past edit to a record is remembered, and it is straightforward to check who made an edit and when, and to compare the versions before and after the change. In a wiki, the records are easily hyperlinked to one another, and properties of these links are also usable data. For example, a link might go from a record about a patent to one of several category systems that classify what the patent is about. Each item can be classified in a number of ways, and its network relations to other items can be recorded and developed with hyperlinks. This helps ground historical conclusions in evidence.
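The audit trail described here can be sketched as a small model in which a page keeps every revision with its editor and timestamp, and any two versions can be compared. This is an illustrative model built on Python's standard difflib module, not MediaWiki's own implementation. The class names and sample edits are hypothetical.

```python
import difflib
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Revision:
    """One saved version of a page: who edited it, when, and the full text."""
    user: str
    timestamp: datetime
    text: str
    comment: str = ""


@dataclass
class Page:
    title: str
    revisions: List[Revision] = field(default_factory=list)

    def edit(self, user: str, text: str, comment: str = "",
             when: Optional[datetime] = None) -> None:
        """Append a new revision; earlier revisions are never overwritten."""
        when = when or datetime.now(timezone.utc)
        self.revisions.append(Revision(user, when, text, comment))

    def current(self) -> str:
        """Return the latest text, or an empty string for a new page."""
        return self.revisions[-1].text if self.revisions else ""

    def diff(self, old_index: int, new_index: int) -> List[str]:
        """Return a unified diff between two stored revisions."""
        old = self.revisions[old_index].text.splitlines()
        new = self.revisions[new_index].text.splitlines()
        return list(difflib.unified_diff(old, new, lineterm=""))
```

For example, after two edits to a page, `page.diff(0, 1)` returns the lines removed and added, and `page.revisions` shows which user made each change and when.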

This work does not represent the formal work of any organization, only that of the author, but it draws on many sources, including more official presentations.

Length of presentation
  • 15 mins
Special requests
  • must have data projector and will use Web briefly if the option exists
Preferred room size
  • 10-15
Have you presented on this topic previously? If yes, where/when?
  • academic conferences: Social Science History Conference 2016; European Social Science History Conference 2018; World Economic History Congress 2018
If you will be incorporating a slidedeck during your presentation, do you agree to upload it to Commons before your session, with a CC-BY-SA 4.0 license, including suitable attribution in the slidedeck for any images used?
  • Yes, I can do that; the work is mainly in the public domain anyway so I may need advice on that overlap
Will you attend WikiConference North America if your submission is not accepted?

Yes

Interested attendees

If you are interested in attending this session, please sign with your username below. This will help reviewers decide which sessions are of high interest. Sign with four tildes (~~~~).

  1. Slowking3 (talk)
  2. James Hare (talk) 19:34, 19 August 2018 (UTC)
  3. Uncommon fritillary (talk) 01:07, 5 September 2018 (UTC)