User:GChriss/MediaDigitization

From WikiConference North America


An informal, introductory session led by User:GChriss on the following novel digitization techniques:

hOCR Workflow Tools

The hOCR Workflow Tools project is a collection of tools to facilitate generation of text-searchable digital documents and is particularly useful in contexts where traditional OCR techniques would fare poorly (e.g. handwritten notes). It's implemented via two Inkscape extensions:

Inkscape Extension: Export Image Overlay Text as hOCR
Inkscape Extension: Create Multi-Page PDF from hOCR HTML Directory

Accurate text-searchable documents bring new life and layers of reader engagement to source materials.
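
For readers unfamiliar with hOCR: it is an HTML-based format in which recognized (or hand-transcribed) text is stored alongside its position on the page image, typically as `bbox x0 y0 x1 y1` values in an element's `title` attribute. The following is a minimal sketch of reading such a file, not part of the hOCR Workflow Tools themselves. The sample markup, the `WordExtractor` class, and the `extract_words` helper are hypothetical names made up for illustration, and the markup is assumed to use the common `ocrx_word` class for word-level spans.

```python
import re
from html.parser import HTMLParser

# Hypothetical hOCR fragment, written by hand for illustration only.
SAMPLE_HOCR = """<div class='ocr_page' title='bbox 0 0 800 600'>
  <span class='ocr_line' title='bbox 40 50 420 90'>
    <span class='ocrx_word' title='bbox 40 50 180 90'>Handwritten</span>
    <span class='ocrx_word' title='bbox 200 50 300 90'>note</span>
  </span>
</div>"""

BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class WordExtractor(HTMLParser):
    """Collects (text, bbox) pairs from word-level hOCR spans."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._in_word = False
        self._bbox = None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            match = BBOX_RE.search(attr_map.get("title") or "")
            # bbox is None when the span carries no position information.
            self._bbox = tuple(int(v) for v in match.groups()) if match else None
            self._in_word = True

    def handle_data(self, data):
        if self._in_word and data.strip():
            self.words.append((data.strip(), self._bbox))

    def handle_endtag(self, tag):
        # Simplification: assumes word spans contain no nested spans.
        if tag == "span" and self._in_word:
            self._in_word = False
            self._bbox = None


def extract_words(hocr_text):
    """Return a list of (word, (x0, y0, x1, y1)) tuples from hOCR markup."""
    parser = WordExtractor()
    parser.feed(hocr_text)
    return parser.words


if __name__ == "__main__":
    for word, bbox in extract_words(SAMPLE_HOCR):
        print(word, bbox)
```

Because each word keeps its coordinates, a PDF generator can place an invisible text layer over the page image, which is what makes the resulting document searchable.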


The BookLiberator

The BookLiberator is an innovative, lightweight book-scanner design that is feature-comparable, including in scanning speed, with larger, traditional models. This mini-topic covers a number of changes from the original project's design, along with new media processing tools, summarized in the following blog post:

http://gchriss.tumblr.com/post/84946122863/bookliberator

A BookLiberator lightning talk transcript is available:

https://gitorious.org/bookliberator-libre/pages/Home

One of the WiFi-active, 'anti-motion'-triggered Canon PowerShot cameras used in the design refresh will be on display.


High-Resolution Imaging

Using image sensors with a high pixel density (defined as the number of sensor pixels divided by total sensor size) combined with high-resolving-power lenses, it is possible to image arbitrary surfaces in much higher detail than with document scanning or traditional macro photography techniques. For an example image created using this technique, please see:

https://web.archive.org/web/20140106062717/http://media.openvideo.pro/u/gchriss/m/docuzoom-microscale-1-dollar-bill/
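
The pixel-density definition above can be sketched as a short calculation. This is an illustration only: it assumes "sensor size" means the active sensor area in square millimetres, and the sensor dimensions used below are hypothetical round numbers, not measured specifications of any camera mentioned on this page.

```python
import math


def pixel_density(width_px, height_px, sensor_width_mm, sensor_height_mm):
    """Sensor pixels divided by sensor area, in pixels per square millimetre."""
    if sensor_width_mm <= 0 or sensor_height_mm <= 0:
        raise ValueError("sensor dimensions must be positive")
    return (width_px * height_px) / (sensor_width_mm * sensor_height_mm)


def pixel_pitch_um(density_px_per_mm2):
    """Approximate pixel pitch in micrometres.

    Assumes square pixels that tile the sensor area with no gaps.
    """
    return 1000.0 / math.sqrt(density_px_per_mm2)


if __name__ == "__main__":
    # Hypothetical sensors chosen for illustration (not real spec sheets).
    small = pixel_density(2592, 1944, 3.6, 2.7)    # small, dense sensor
    large = pixel_density(6000, 4000, 36.0, 24.0)  # large, sparse sensor

    print(f"small sensor: {small:,.0f} px/mm^2, pitch {pixel_pitch_um(small):.2f} um")
    print(f"large sensor: {large:,.0f} px/mm^2, pitch {pixel_pitch_um(large):.2f} um")
    print(f"density ratio: {small / large:.1f}x")
```

Under these assumed numbers, the small sensor has fewer pixels in total but packs far more of them into each square millimetre, which is the property the technique relies on when paired with a lens that can resolve detail at that scale.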

Beyond an introduction to the novel technique and how it can be applied in historical research contexts, a working "works/doesn't yet work/future work" status update will be presented, with a particular focus on large-document automated scanning. An Elphel 353L camera as well as an A10-OLinuXino-LIME interfaced with an OV5642 image sensor via GPIO pins will be on display.


Open Video Reference Build

The ‘Open Video Reference Build’ is a set of tools designed to facilitate working with open video in multiple contexts such as software development, live-streaming, A/V conferencing, video editing, and machine recognition. It currently consists of three Bash scripts that create a series of well-defined software packages running on Trisquel, a libre, long-term-support operating system.

Video can be difficult to work with.  The Open Video Reference Build is designed to reduce as much complexity as possible without sacrificing build precision or extensibility.  See: https://gitorious.org/openvideo_reference_build