Difference between revisions of "User:GChriss/MediaDigitization"

Revision as of 03:03, 1 June 2014

An informal, introductory session led by User:GChriss on the following novel digitization techniques:

hOCR Workflow Tools

The hOCR Workflow Tools project is a collection of tools to facilitate generation of text-searchable digital documents and is particularly useful in contexts where traditional OCR techniques would fare poorly (e.g. handwritten notes). It's implemented via two Inkscape extensions:

Inkscape Extension: Export Image Overlay Text as hOCR

Inkscape Extension: Create Multi-Page PDF from hOCR HTML Directory

Accurate text-searchable documents bring new life and layers of reader engagement to source materials.
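For readers new to the format: hOCR stores recognized (or manually transcribed) text as ordinary HTML, with each element's title attribute carrying a bounding box in image pixel coordinates. The Python sketch below is only an illustration of that idea and is not taken from the Inkscape extensions; the function names, the file name, and the sample text are hypothetical.

```python
from html import escape


def hocr_line(text, bbox, line_id):
    """Wrap one transcribed line in an hOCR 'ocr_line' span.

    bbox is (x0, y0, x1, y1) in image pixels, origin at the top-left.
    """
    x0, y0, x1, y1 = bbox
    return (
        f'<span class="ocr_line" id="{escape(line_id)}" '
        f'title="bbox {x0} {y0} {x1} {y1}">{escape(text)}</span>'
    )


def hocr_page(image_name, width, height, lines):
    """Wrap a list of (text, bbox) pairs in an hOCR 'ocr_page' div."""
    body = "\n".join(
        hocr_line(text, bbox, f"line_{i}")
        for i, (text, bbox) in enumerate(lines, start=1)
    )
    # The page-level title names the source image and its full extent.
    title = f"image {escape(image_name)}; bbox 0 0 {width} {height}"
    return f'<div class="ocr_page" id="page_1" title="{title}">\n{body}\n</div>'


if __name__ == "__main__":
    # Hypothetical handwritten page, transcribed by hand.
    print(hocr_page("notes.png", 2480, 3508,
                    [("Dear Sir & Madam", (10, 20, 300, 60))]))
```

Because the text and its position travel together, a later step can place an invisible text layer over the page image, which is what makes the resulting document searchable.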

The BookLiberator

See http://gchriss.tumblr.com/post/84946122863/bookliberator

High-Resolution Imaging

Using image sensors with a high pixel density (defined as the number of sensor pixels divided by total sensor size), combined with high-resolving-power lenses, it is possible to image arbitrary surfaces in much greater detail than with document scanning or traditional macro photography. For an example image created using this technique, please see:

http://media.openvideo.pro/u/gchriss/m/docuzoom-microscale-1-dollar-bill
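The pixel density definition above can be sketched as a short calculation, taking the sensor's size to be its active area in square millimetres (an assumption about the intended unit). The two sensors compared are illustrative round figures, not datasheet values for any specific camera.

```python
def pixel_density(h_pixels, v_pixels, width_mm, height_mm):
    """Pixels per square millimetre of active sensor area."""
    return (h_pixels * v_pixels) / (width_mm * height_mm)


def pixel_pitch_um(h_pixels, width_mm):
    """Approximate centre-to-centre pixel spacing in micrometres."""
    return width_mm * 1000.0 / h_pixels


if __name__ == "__main__":
    # Hypothetical sensors, chosen only to show the contrast.
    sensors = {
        "small 5 MP sensor": (2592, 1944, 3.6, 2.7),
        "full-frame 24 MP sensor": (6000, 4000, 36.0, 24.0),
    }
    for name, (h, v, w, ht) in sensors.items():
        print(f"{name}: {pixel_density(h, v, w, ht):,.0f} px/mm^2, "
              f"pitch {pixel_pitch_um(h, w):.2f} um")
```

Under these assumed figures the small sensor packs far more pixels into each square millimetre, which is the property the technique relies on; whether that density turns into real detail depends on the lens resolving features at that scale.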

Beyond an introduction to the novel technique and how it can be applied in historical research contexts, a working "works/doesn't yet work/future work" status update will be presented, with a particular focus on automated scanning of large documents. An Elphel 353L camera and an A10-OLinuXino-LIME interfaced with an OV5642 image sensor via GPIO pins will be on display.

Open Video Reference Build

The ‘Open Video Reference Build’ is a set of tools designed to facilitate working with open video in multiple contexts such as software development, live-streaming, A/V conferencing, video editing, and machine recognition. It currently consists of three Bash scripts that create a series of well-defined software packages running on a libre, long-term-support operating system: Trisquel.

Video can be difficult to work with. The Open Video Reference Build is designed to reduce as much complexity as possible without sacrificing build precision or extensibility. See: https://gitorious.org/openvideo_reference_build

@@ Line 4: / Line 4: @@
 ===hOCR Workflow Tools===
-The hOCR Workflow Tools project is a collection of tools to facilitate generation of text-searchable digital documents and is particularly useful in contexts where traditional OCR techniques would fare poorly (''e.g.'' handwritten notes) implemented ''via'' two [http://www.inkscape.org/en/ Inkscape] extensions:
+The hOCR Workflow Tools project is a collection of tools to facilitate generation of text-searchable digital documents and is particularly useful in contexts where traditional OCR techniques would fare poorly (''e.g.'' handwritten notes).  It's implemented ''via'' two [http://www.inkscape.org/en/ Inkscape] extensions:
 :[https://gitorious.org/hocr-workflow/inkscape-hocr Inkscape Extension: Export Image Overlay Text as hOCR]
 :[https://gitorious.org/hocr-workflow/inkscape-hocrpdf Inkscape Extension: Create Multi-Page PDF from hOCR HTML Directory]
+Accurate text-searchable documents bring new life and layers of reader engagement to source materials.
 <br />