Submissions:2025/Leveraging Robust Links to Prevent Link Rot on Wikipedia
This submission has been noted and is pending review for WikiConference North America 2025.
Title:
- Leveraging Robust Links to Prevent Link Rot on Wikipedia
Type of session:
- Lecture (15-30 min)
Session theme(s):
- Credibility
Abstract:
Unlike traditional scholarly publications, web pages and other online resources often suffer from content drift and link rot over time. Consequently, references citing such resources lose credibility when resolving them leads to error pages or to content that has become unrelated to the original context. The underlying problem is that a citation of a web resource has no way to express its temporal dimension. While some references do state the intended date in the text, the HTML anchor element has no standard way to encode this information in a machine-readable manner.
Robust Links is a proposed standard to bring this capability to anchor elements in the form of HTML5 data-* attributes. We introduce "data-originalurl", "data-versiondate", and "data-versionurl" attributes to express the original URL of the referenced resource, the date (or datetime) of the intended state or version of the resource, and optionally one or more known good archived version URLs at which the resource is preserved in the intended state. Currently, these attributes are not interpreted by user-agents in any special way, but JavaScript can be used to leverage them in the interim.
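As a rough illustration of the three attributes named above, the following Python sketch renders an anchor element that carries them. The helper name `robust_link`, the example URLs, and the date format are assumptions made for illustration only; the sketch takes a single archived URL, and how several archived URLs would be encoded is left to the specification draft.

```python
from html import escape


def robust_link(href, text, original_url, version_date, version_url=None):
    """Render an <a> element with Robust Links data-* attributes.

    Hypothetical helper for illustration; not part of any reference
    implementation.
    """
    attrs = [
        ("href", href),
        ("data-originalurl", original_url),
        ("data-versiondate", version_date),
    ]
    # The archived copy is optional, so only emit it when one is known.
    if version_url:
        attrs.append(("data-versionurl", version_url))
    rendered = " ".join(
        f'{name}="{escape(value, quote=True)}"' for name, value in attrs
    )
    return f"<a {rendered}>{escape(text)}</a>"


# Example with made-up URLs and an assumed date format.
example = robust_link(
    href="https://example.org/page",
    text="Example page",
    original_url="https://example.org/page",
    version_date="2025-01-15",
    version_url="https://web.archive.org/web/20250115000000/https://example.org/page",
)
print(example)
```

Because the attributes are plain data-* attributes, a browser that does not understand them still treats the element as an ordinary link to `href`.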
The current approach to fixing broken links requires running a bot that regularly scans all wiki pages for external links, checks the status of those links, and replaces any broken ones with corresponding archived versions from a web archive such as the Internet Archive's Wayback Machine. This approach has numerous inefficiencies and limitations. In this talk, we will discuss potential approaches to integrating Robust Links into MediaWiki so that references are resilient against link rot from the moment they are created.
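To make the contrast with after-the-fact repair concrete, here is a minimal Python sketch of how a consumer might read these attributes from rendered HTML and fall back to the archived copy when the original is unreachable. The class `RobustLinkParser`, the function `choose_target`, and the `original_is_alive` flag are hypothetical names for illustration; this is not the MediaWiki integration discussed in the talk, and the liveness check itself is left out.

```python
from html.parser import HTMLParser


class RobustLinkParser(HTMLParser):
    """Collect anchors that carry a data-versiondate attribute."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        found = dict(attrs)
        if "data-versiondate" in found:
            self.links.append({
                "href": found.get("href"),
                "original": found.get("data-originalurl"),
                "date": found.get("data-versiondate"),
                "archived": found.get("data-versionurl"),
            })


def choose_target(link, original_is_alive):
    """Pick the URL to follow: the original if live, else the archived copy."""
    if original_is_alive or not link["archived"]:
        return link["original"] or link["href"]
    return link["archived"]


# Example with made-up URLs.
sample = (
    '<p>See <a href="https://example.org/page" '
    'data-originalurl="https://example.org/page" '
    'data-versiondate="2025-01-15" '
    'data-versionurl="https://archive.example/2025/page">this page</a>.</p>'
)
parser = RobustLinkParser()
parser.feed(sample)
link = parser.links[0]
print(choose_target(link, original_is_alive=False))
```

The point of the sketch is that the fallback information travels with the link itself, so no later scan of the page is needed to discover an archived copy.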
- Specification draft: https://hvdsomp.info/robustlinks/
- Past slides: https://docs.google.com/presentation/d/11CN0k4TKlCFIII3sujuwN9oBy-cdvvFyvzGHCAhHvug/edit
- Reference implementation: https://github.com/iipc/robustlinks
Author name(s):
Wikimedia username(s):
Affiliated organization(s):
- Internet Archive, Los Alamos National Laboratory, Pacific Northwest National Laboratory, Old Dominion University, Data Archiving and Networking Services (DANS)
Estimated length of session
- 30
Will you be presenting remotely?
- I will present in-person
Okay to livestream?
- Livestreaming is okay
Previously presented?
- Yes, we presented at WikiCredCon 2025
Special requests: