Digital Books Just Keep Getting Better…

Cornell has set up a workflow for improving digital books made through our partnership with Google and deposited in HathiTrust.  The workflow responds to alerts from HathiTrust about specific pages that need improvement, and it uses the Single Page Insertion and Replacement workflow that Google has set up for library partners.  The results so far are very positive: complete and correct digital volumes, satisfied staff from multiple institutions, and some very happy HathiTrust patrons.

The process most often begins with a patron from a HathiTrust member institution.  While using a digital book, they may notice problems with the copy similar to those in the list below:

  • foldouts were not unfolded during scanning, so important diagrams, charts, maps or pictures were lost
  • a page was skipped
  • a page moved during image capture, yielding a blurry image
  • an operator’s hand or the book clamps were caught in the frame

full color map

Rhea’s capture of a map from Indian Railways in 1921-22 in 2 vols. Administration Report.

HathiTrust has made a feedback link available on every page, located near the middle of the lower navigation bar.  The link opens a pop-up form that captures a few quick details.  The only text the patron needs to type is an email address (entering it is highly recommended, since the disposition of the issue will be reported back to this address); the rest of the form is a radio button, a few check boxes, and an optional note.  (The page URL is captured automatically from the browser, so the user does not need to supply it.)  Thus, with minimal text entry and a few clicks, the patron makes an informative report that opens a ticket with HathiTrust.  Staff at HathiTrust respond to the ticket and facilitate corrective measures.  In the case of books created through the Google Library partnership (which make up most of Cornell’s deposits), staff at HathiTrust first contact Google directly to see if the problematic pages can be rectified.  If not, they contact the HathiTrust member institution to let them know that the digital book needs improvement.

diagram of steam engine

Shakhya’s capture of a foldout from Parovozy tipa “Dekapod” (1-5-0) postroennye v Ameriki︠e︡ dli︠a︡ Russkikh Kazennykh Zheli︠e︡znykh Dorog.

At Cornell, we respond to these alerts by first paging the physical object to verify that the content missing from the digital copy is actually present in the physical book.  The vast majority of the time we can supply the missing content, so we next contact Google via a form specific to library partners.  Google staff tell us the appropriate naming for the pages involved, which our digitization unit follows as they capture the missing content in digital form, and the resulting files are uploaded to a site available to Google’s mechanized crawler.  The crawler retrieves the pages, they are integrated into the appropriate volume at Google Books, and we notify HathiTrust.  HathiTrust downloads the corrected content from Google, and the results then appear in the HathiTrust Digital Library.  The patron is notified that the missing pages are available.  Patrons have shared with us their enthusiastic feedback at seeing the corrected content.

color map of railroads

Shakhya’s capture of a foldout from Annual report. Pennsylvania Railroad.

So far Cornell has corrected about 28 volumes, fixing only one page in some and dozens or even scores in others.  We have supplied foldouts, missing pages, maps, color diagrams, and pocket material.  To give you examples of the work, I’ve included a few low-resolution copies throughout this post; each links to the page within HathiTrust.  To admire the detail, you may want to make sure you are logged into HathiTrust and then choose to download the page as a PDF (look for the link in the left-hand navigation bar).  As with all inter-institutional accomplishments, thanks go to a wide cast with a diverse set of skills.  At Google, Maya has been extremely helpful in working through some problematic cases and getting all submissions inserted correctly and promptly.  From the HathiTrust side, Kat Hagedorn’s cheerful optimism and careful shepherding never fail to topple obstacles.  Cornell’s own Digital Media Group (DMG) produces beautiful imaging work, thanks to Shakhya Bodhiwamsa, Rhea Garen, and Bronwyn Mohlke.  Thanks to this coordinated effort, patrons around the world can view this work at HathiTrust, one improved volume at a time.

 
