Submissions:2014/Reconstructing the past with Mediawiki: Programmatic Issues and Solutions

From WikiConference North America
Title of the submission
Reconstructing the past with Mediawiki: Programmatic Issues and Solutions
Themes (Proposal Themes - Community, Tech, Outreach, GLAM, Education)
Tech
Type of submission (Presentation Types - Panel, Workshop, Presentation, etc)
Curated Talk
Author of the submission
Shawn M. Jones
E-mail address
sjone@cs.odu.edu
Username
Shawnmjones
US state or country of origin
Virginia
Affiliation, if any (organization, company etc.)
Old Dominion University
Personal homepage or blog
http://ws-dl.blogspot.com
Abstract (at least 300 words to describe your proposal)

The Internet Archive attempts to reconstruct web pages from snapshots (Mementos) taken at various points in time. Many pages change more frequently than the Internet Archive can capture them, so some revisions of a given web page are lost forever. Mediawiki, however, retains every past revision of a page, along with its associated external resources. This inspired the development of the Memento Mediawiki Extension as an improvement, for Mediawiki sites, over the Internet Archive's "drive by" method of digital preservation.

While working on the Memento Mediawiki Extension, we put effort into reconstructing past revisions of each wiki page. The existing software reconstructs the page text in accordance with RFC 7089 (the Memento protocol), but does not try to pull in past versions of images, JavaScript, CSS, and other external resources, because Mediawiki, as it exists, makes it difficult or impossible to load these resources at page generation time.
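
As a rough illustration of the datetime negotiation that RFC 7089 defines, the sketch below builds the Accept-Datetime request header and parses the Memento-Datetime response header. The function names are hypothetical and the code is not taken from the extension; it only uses the Python standard library and assumes the caller passes a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def accept_datetime_header(when: datetime) -> dict:
    """Build the Accept-Datetime request header (hypothetical helper).

    `when` is assumed to be timezone-aware; it is converted to GMT
    because RFC 7089 expects an RFC 1123 date expressed in GMT.
    """
    gmt = when.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(gmt, usegmt=True)}


def memento_datetime(response_headers: dict) -> datetime:
    """Parse the Memento-Datetime response header (hypothetical helper)."""
    return parsedate_to_datetime(response_headers["Memento-Datetime"])


if __name__ == "__main__":
    wanted = datetime(2007, 5, 31, 20, 35, tzinfo=timezone.utc)
    print(accept_datetime_header(wanted))
```

A client would send this header to a Memento-aware wiki and read Memento-Datetime from the response to learn which revision's timestamp was actually served.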

This curated talk will explore the problems of page reconstruction on the web at large and detail the issues within the Mediawiki code that currently prevent or hinder reconstructing a page in its entirety as it looked at a given revision. We will contrast how Mediawiki pages use the oldid method for accessing old pages while images and other uploaded content cannot be accessed this same way, even though the uploaded content's summary page does use the oldid method. We will examine how old revisions of the CSS and JavaScript can be viewed by humans, but not retrieved inline when needed by Mediawiki. We will also indicate how Semantic Mediawiki lacks support for accessing past revisions altogether. After that, we will bring up some techniques that may solve these problems, including a uniform approach that may address all of them.
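
To make the oldid method concrete, here is a minimal sketch of how a past page revision could be selected for a given datetime and turned into an oldid URL. The revision list, the function names, and the example.org address are all hypothetical; only the `title` and `oldid` query parameters of index.php are assumed from Mediawiki itself. Per the abstract, uploaded files have no equivalent lookup, which is part of the problem the talk addresses.

```python
from bisect import bisect_right
from datetime import datetime, timezone
from urllib.parse import urlencode


def revision_at(revisions, when):
    """Return the revision id in effect at `when`, or None if none existed yet.

    `revisions` is a list of (timestamp, revid) pairs sorted by timestamp
    (hypothetical data; a real wiki would supply this from its history).
    """
    timestamps = [ts for ts, _ in revisions]
    i = bisect_right(timestamps, when)  # count of revisions made at or before `when`
    return revisions[i - 1][1] if i else None


def oldid_url(index_php, title, revid):
    """Build a URL that asks index.php for one specific past revision."""
    return f"{index_php}?{urlencode({'title': title, 'oldid': revid})}"


if __name__ == "__main__":
    history = [
        (datetime(2013, 1, 1, tzinfo=timezone.utc), 101),
        (datetime(2013, 6, 1, tzinfo=timezone.utc), 205),
        (datetime(2014, 2, 1, tzinfo=timezone.utc), 317),
    ]
    revid = revision_at(history, datetime(2013, 9, 15, tzinfo=timezone.utc))
    print(oldid_url("https://wiki.example.org/index.php", "Main_Page", revid))
```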

Where applicable, code examples and demonstrations will be provided.

Length of presentation/talk (see Presentation Types for lengths of different presentation types)
15 Minutes
Will you attend WikiConference USA if your submission is not accepted?
Yes
Slides or further information (optional)
Special request as to time of presentations
Any date/time is fine


Interested attendees

If you are interested in attending this session, please sign with your username below. This will help reviewers decide which sessions are of high interest. Sign with four tildes (~~~~).

  1. Rhododendrites (talk) --09:34, 10 April 2014 (EDT)
  2. Add your username here.