Research Sharing and Profiling Platforms

In the past year, I’ve spoken with a couple of faculty groups about research sharing and profiling platforms: what they do, and some issues and alternatives to consider. Since there is a steady stream of news about these tools – most recently that ResearchGate has restricted access to 1.7 million articles, as well as plans to launch a non-profit alternative dubbed ScholarlyHub – I thought I’d share some of that information here.

What research sharing and profiling platforms do…

What I’m loosely calling “research sharing and profiling platforms” typically perform at least a couple of the following functions:

  • Present a public view (profile) of a person’s scholarship
  • Make it possible to share publications freely
  • Network – find experts, find collaborators
  • Demonstrate impact via download or citation counts or other metrics
  • Notify users of new content in their areas of interest (current awareness)

Some common platforms and their functions:

In no particular order and with no endorsements expressed or implied, these are some of the platforms that are either heavily used by faculty, or might merit consideration.

Issues to consider in choosing and using these tools:

  • Essential functions and effort. What do faculty want from these tools, and what work are they willing to do to realize those results?
  • Sponsorship and control. Who owns and controls the service? What motivates the sponsor to provide the service, and what are they doing with the (their!) information? What are the terms of service? Who controls and updates the content?
  • Copyright. When it comes to sharing publications online, just because they can, doesn’t mean… they can. It might be technologically possible to upload files to a site for sharing, but that doesn’t make it legal, and publishers are starting to crack down.

There are alternative tools for each of the functions listed in the table, some of which may have more palatable terms of use or keep their data out of the hands of for-profit entities. The top recommendations I’d make to faculty are to sign up for an ORCID iD and use it (a complete ORCID record makes for a nice, clean profile that travels with them throughout their career, in addition to unambiguously identifying their works as theirs), to proactively manage their rights as authors, and to publish open access or share legal copies of their works using one of CUL’s repositories (eCommons, DigitalCommons@ILR, or SHA’s Scholarly Commons, depending on their affiliation). There are also alternatives for measuring impact and current awareness that may be preferable to the for-profit tools.

I am happy to chat more about this topic with anyone who is interested, offline. If you are a liaison to faculty who might be interested, please feel free to get in touch – I’m happy to share materials or attend a meeting with you.

~ Gail Steinhart, Scholarly Communication Librarian

 

SCWG: eCommons and Unit Libraries (includes survey link)

One of the 2017 projects of the Scholarly Communication Working Group is to better understand and support the efforts of unit libraries in recruiting content for deposit in eCommons. eCommons is Cornell’s general-purpose Institutional Repository (IR), available for the use of the entire Cornell community (more information about eCommons is available on the eCommons LibGuide). Some individual colleges also have their own IRs with their own branding. The School of Industrial and Labor Relations is served by DigitalCommons@ILR, the School of Hotel Administration is served by SHA Scholarly Commons, and the Law School is served by Scholarship@Cornell Law.

Over the summer, the Scholarly Communication Working Group interviewed staff from across the library to see how eCommons could work better for them. These are the questions we heard the most:

Who can put things in eCommons, and who can look at them?

Anyone at Cornell may put content in eCommons! And anything in eCommons is viewable to the Cornell community or to the world at large!

What cannot go in eCommons?

Sensitive data, dynamic web content (e.g., Flash videos or databases), and large files (3 GB or larger) cannot go in eCommons. Content in eCommons must conform to copyright law and to FERPA regulations, and our staff can help you navigate what can or cannot be deposited.

Why should I put content in eCommons?

eCommons is durable and stable! The library is committed to keeping the content that’s in eCommons, and we provide permanent links to that content. And unlike privately held platforms (e.g., academia.edu or ResearchGate), because the library runs eCommons, it will not be sold or monetized.

eCommons is free! Unlike a personal website, you do not have to pay for space to store material on eCommons.

Who is putting things in eCommons, and what kinds of things are there?

Schools, departments, and centers from all across the Cornell community are represented in eCommons! Here are some examples from around campus: the Graduate School deposits Theses and Dissertations, many departments deposit newsletters and reports, Cooperative Extension has all kinds of informative pamphlets, Cornell Botanic Gardens has hundreds of pictures of plants, and the Knight Institute for Writing in the Disciplines mandates inclusion of writing assignments. There are also entire textbooks by Cornell faculty, and an amazing series of lectures delivered by distinguished visitors.

What’s next?

As a result of the interviews, we’ve made numerous additions and updates to our help pages (the LibGuide). The Scholarly Communication Working Group and eCommons staff would also love to answer your additional questions about eCommons. We are currently running a survey to find out both WHAT you want to know about eCommons, and HOW you want to find out about it. Please complete the survey to help inform future outreach efforts!

The survey will remain open through Friday, November 10, 2017.

Scholarly Communication Working Group (SCWG) Supporting the Collection Efforts of Unit Libraries – eCommons members:
Eileen Keating
Sarah Kennedy
Chloe McLaren
Gail Steinhart
Drew Wright

Updates from Digital Consulting and Production Services – October 2017

News

 

Audiovisual Preservation Initiative

Digital Consulting & Production Services, in partnership with CIT and other Cornell stakeholders, is leading an effort to determine audiovisual preservation needs campus-wide. Cornell University has vast holdings of unique audiovisual assets vital to its mission “to discover, preserve, and disseminate knowledge.” This institutional legacy now faces a very real and growing threat from audiovisual media degradation and playback obsolescence, which, if left unattended, will result in the loss of priceless media assets. Over a period of 15 months, the Audio Video Preservation Group conducted a campus-wide, collection-level survey of Cornell’s unique and/or rare AV items, resulting in the identification of over 220,000 items, and compiled its findings and recommendations into a report.

Read the Audiovisual Preservation Initiative Survey Report.

 

Service Reminder

Through the Arts & Sciences Teaching Digitization program, Cornell University Library will digitize various material types in support of teaching, for instructors within the College of Arts & Sciences. This service includes metadata creation and online delivery of images, audiovisual, and other visual resources. This is a free service for the College’s faculty and researchers, and aims to support the teaching mission. For more information, please contact us at dcaps@cornell.edu.

 

Featured Collections

 
Cornell Collection of Blaschka Invertebrate Models

Cornell University is one of a handful of academic institutions in the United States with a collection of glass invertebrates created by renowned 19th century glass artists Leopold and Rudolf Blaschka. An earlier collaboration between the Department of Ecology & Evolutionary Biology and Mann Library led to the creation of an online gallery of the Blaschka collection, available for use by scholars and educators. A 2016 Digital Collections in Arts & Sciences Grant awarded to Drew Harvell, Professor of Ecology & Evolutionary Biology, and Nelson Hairston, Frank H.T. Rhodes Professor of Environmental Science, has expanded and improved this remarkable online collection.

 

Hill Ornithology Collection

This collection traces the development of ornithological illustration in the 18th and 19th centuries and highlights the changing techniques — from metal and wood engraving to chromolithography — during that period. These selected illustrations, by renowned ornithologists and artists such as John James Audubon, John Gould, and Joseph Wolf, are part of the Hill Ornithology Collection held by the Division of Rare & Manuscript Collections.

 

Willard D. Straight in Korea

Willard D. Straight worked in Korea as a Reuters correspondent during the Russo-Japanese War in 1904-05, and later as a U.S. diplomat. He took numerous photographs of landscapes, urban scenes, cultural phenomena, historic events, and people. These photographs, as well as the postcards he collected during his time there, offer a rare example of western perspectives on Korea during the early 20th century. The original materials are part of the Willard Dickerman Straight Papers, held by the Division of Rare & Manuscript Collections.

 

About DCAPS

Cornell University Library, a pioneer in the creation and management of digital resources, has assembled a team of experts to support digital scholarship initiatives for Cornell’s faculty, staff and community partners. Specializing in high-end digitization, metadata customization and creation, and online delivery, the DCAPS staff is recognized worldwide for creating innovative collections in support of instructional and research activities. Whether you are seeking to digitize content for a class, or looking to push the envelope on new modes of scholarly communication, DCAPS is here to help.

 

Follow us on Twitter & Instagram

https://dcaps.library.cornell.edu | dcaps@cornell.edu

 

Updates from Digital Consulting & Production Services (DCAPS) – August 2017

August 2017

Cornell University Library, a pioneer in the creation and management of digital resources, has assembled a team of experts to support digital scholarship initiatives for Cornell’s faculty, staff and community partners. Specializing in high-end digitization, metadata customization and creation, and online delivery, the DCAPS staff is recognized worldwide for creating innovative collections in support of instructional and research activities. Whether you are seeking to digitize content for a class, or looking to push the envelope on new modes of scholarly communication, DCAPS is here to help.

News 

A warm welcome to the incoming class of 2021! We hope all of your semesters are off to a good start.

This year’s awards for the Grants Program for Digital Collections in Arts & Sciences have been announced, and you can read about them in the Cornell Chronicle. Over a dozen fabulous projects were proposed; thank you to all who were involved in the application process. This year’s awards include the first to a graduate student application, which is exciting. Learn more about the Grants Program for Digital Collections in Arts and Sciences.

DCAPS has added digital forensics to its services, which include forensic analysis of older media types, such as floppy disks, CD-ROMs, hard drives, and other digital files. This is all part of our new Audiovisual Preservation Lab located in 214 Olin Library, where we routinely digitize and offer consulting for media projects and publications. For more information, please contact dcaps@cornell.edu.

Service Reminder

The Arts & Sciences Teaching Digitization program offers free digitization of teaching materials for faculty and instructors in the College of Arts & Sciences in a variety of formats, including AV material, slides, photographs, etc. The program aims to support the teaching mission of the Arts & Sciences faculty. (Please inquire at dcaps@cornell.edu).  Learn more about Arts & Sciences Teaching Digitization.

For all of your digitization and production needs, please visit dcaps.library.cornell.edu.

Featured Collections

Digitizing Tell en-Naṣbeh (Biblical Mizpah of Benjamin)

Tell en-Naṣbeh, ancient Mizpah of Benjamin, is located 12 km north of Jerusalem in the West Bank. Approximately two-thirds of this three-hectare, primarily Iron Age, site was excavated by a team from Pacific School of Religion in Berkeley, CA, between 1926 and 1935 under the direction of W. F. Badé. Tell en-Naṣbeh is one of the most broadly excavated sites in the southern Levant, making it of great importance for those interested in studying house construction, settlement planning and social organization. The full set of 1:100 plans has, until now, only been available to those able to travel to Berkeley. Fifty of the plans are now available online, thanks to funding from the Grants Program for Digital Collections in Arts and Sciences.

Latin American Journals Project

The Latin American Journals Project was established by Tom McEnaney (former Assistant Professor of Comparative Literature at Cornell) in collaboration with Cornell University Library’s Digital Consulting & Production Services in order to provide a hub for scholars across the globe to more easily access literary and cultural journals published in the Hispanophone Caribbean and Latin America during the late 19th and early 20th centuries.

Punk Flyers

This project involves the digitization of a collection of 1,800 punk music flyers owned by Cornell University Library’s Division of Rare and Manuscript Collections. The resulting digital collection will make a significant body of unique, ephemeral materials broadly available to scholars interested in the development of punk and post-punk music, culture, aesthetics, fashion, and politics from the late 1970s through the early 2000s. A selection of flyers is currently available online.

 

For more information and the latest updates, visit our website, and follow us on Twitter and Instagram.

New Arts & Sciences Grants Announced

We were delighted to receive twelve strong proposals for the seventh year of the Visual Resources Grants Program and are pleased to support three exciting projects, including one from a graduate student.  Through these initiatives, we aim to expand our digital collections for research and teaching and contribute to the burgeoning field of scholarship in the digital humanities through the integration of new research methods, innovative data visualization, and tools that enable novel ways of analysis and interpretation.

19th Century Prison Reform Collection
PI: Katherine Thorsteinson, PhD candidate, English

In the early to mid-19th century, U.S. criminal justice was undergoing massive reform. The state prisons that had emerged out of earlier reform efforts were becoming increasingly crowded, diseased, and dangerous. This collection draws together invaluable documents surrounding the emergence and development of the Auburn and Pennsylvania Systems. These documents help chart the debate between these alternate prison reformation systems, the emergence of prison labor, the theological origins of the American prison, and the historical connections between slavery and mass incarceration. The collection supports research and teaching for a wide variety of disciplines, and is especially significant given the intensity of current public concern about the state of incarceration.

NYS Historical Dendrochronology Project
PI: Carol Griggs, Classics, Tree-Ring Laboratory
Collaborators: Sturt Manning, Classics, Archaeology, Tree-Ring Laboratory; Brita Lorentzen, Tree-Ring Laboratory; Cynthia Kocik, Tree-Ring Laboratory

The New York State Historical Dendrochronology Project collects wood samples from structures across upstate New York and uses dendrochronology to date them, establishing precise dates for the structures’ construction, modification, and building history. The collection comprises approximately 80 sites and 1,000 samples ranging from 1448 to 2016. The digitization of the physical samples and associated materials will provide easy access to this collection and will be of value to those in dendrochronology, history, archaeology, environmental and earth sciences, as well as museums, historical societies, and individuals interested in the history of the region. The digital collection will be used for historical, ecological, and climatological research on local, regional, and larger spatial scales and on annual to multi-century time scales.

Seneca Haudenosaunee Archaeological Materials, circa 1688-1754
PI: Kurt Jordan, Anthropology
Collaborator: Dusti Bridges, MA candidate, Archaeology

The goal of this project is to digitize and introduce archaeologically-recovered materials from two late 17th and early 18th century Seneca (O-non-dowa-gah) Haudenosaunee (Six Nations Iroquois) sites located near Geneva, New York into an online platform that will be meaningful for descendant communities as well as researchers in Anthropology, History, and American Indian and Indigenous Studies. It will provide information on archaeological materials from a poorly-understood era to researchers, serve as a resource for education on the indigenous history of New York, and most importantly provide a means for descendant communities to access and explore their heritage.

The grants program aims to support collaborative and creative use of resources through the creation of digital content of enduring value to the Cornell community and scholarship at large. The program is funded by the College of Arts and Sciences and coordinated by Cornell University Library (CUL). The Arts & Sciences Visual Resources Advisory Group oversees the visual resources program and CUL’s Digital Consulting and Production Services (DCAPS) plans and implements the grant-funded projects. In addition to the grants program, the Arts and Sciences digitization program continues to support instructional needs by providing timely and convenient digitization services, especially for the History of Art and Visual Studies, Anthropology, Archaeology, History, Classics, and Music departments. By digitizing instructional materials and loading them into image databases, we provide campus-wide access to these resources in support of academic goals and allow their reuse and repurposing.

We are grateful for the contributions of Tre Berney, Jasmine Burns, Jenn Colt, Dianne Dietrich, Rhea Garen, Simon Ingall, and Melissa Wallace as they collaborated with faculty in preparing the proposals.

Cornell Collections of Antiquities https://antiquities.library.cornell.edu/

Updates from Digital Consulting & Production Services (DCAPS) – June 2017

June 2017

Cornell University Library, a pioneer in the creation and management of digital resources, has assembled a team of experts to support digital scholarship initiatives for Cornell’s faculty, staff and community partners. Specializing in high-end digitization, metadata customization and creation, and online delivery, the DCAPS staff is recognized worldwide for creating innovative collections in support of instructional and research activities. Whether you are seeking to digitize content for a class, or looking to push the envelope on new modes of scholarly communication, DCAPS is here to help. Contact us at dcaps@cornell.edu to learn more about our initiatives, services and projects.

News

Warm wishes and congratulations to the Class of 2017!

Applications for the Grants Program for Digital Collections in Arts & Sciences are currently being reviewed and budgeted out for this coming academic year. Thanks to all of those who submitted great proposals. We are scheduled to announce the awards in mid-June. The program welcomes large-scale projects that integrate new research methods, innovative data visualization, and tools that enable novel means of analysis and interpretation. This grant program is open to Cornell faculty in the College of Arts and Sciences, as well as to Cornell graduate students under specific requirements. Learn more about the Grants Program for Digital Collections in Arts and Sciences.

Featured Service

The Arts & Sciences Imaging for Teaching program offers free digitization of course materials originating in a variety of formats, including chapters of books, AV material, slides, photographs, etc. The program aims to support the teaching mission of the Arts & Sciences faculty. (Please inquire at dcaps@cornell.edu). Learn more about Arts & Sciences Imaging for Teaching.

Featured Collections

Indonesian Music Archive

The Indonesian Music Archive consists of approximately 193 hours of audio recordings made in Indonesia.

 

Adler Hip Hop Archive

Compiled by Bill Adler, a noted journalist and founding VP of Publicity at Def Jam Recordings, the Adler Hip Hop Archive contains thousands of newspaper and magazine articles, recording industry press releases and artist bios, correspondence, photographs, posters, flyers, advertising, and other documents.

 

Asia Art Archive

Cornell-only access to new content available through the Asia Art Archive.

AV Preservation Lab participates in Video QC software focus group

By Karl Fitzke

Olin Library’s Audio-Visual Preservation Lab (AVPL) was honored with an invitation to participate in a focus group aimed at identifying potential improvements to QCTools, a well-respected suite of video quality control tools developed under the umbrella of the Bay Area Video Coalition (BAVC) in San Francisco’s Mission District. In late April, the group of approximately 15 people, all doing similar kinds of work, came together from all over the country, using grant money secured by BAVC. I was grateful Cornell granted me the time to participate.

On day one, the group reviewed current QCTools features and operation.  On day two, we got familiar with new batch file processing capabilities and discussed what other features we’d like to see.   Batch processing promises to help us take advantage of this tool set in our regular workflow, instead of the very selective use we’ve made to date, when some video artifact in a file seems rather suspicious and/or unfamiliar to us.

For folks who are unfamiliar with our work, we use our trained eyes and ears every day, as a form of quality assurance, in identifying potential issues with analog-to-digital transfers of the audio-visual media that we process. But it is not practical for us to listen to and/or watch every piece of program material in its entirety. We only routinely check the beginning, middle, and end of files for obvious problems. So software packages like QCTools, able to automate the identification of a broad range of potential issues throughout a file, are very useful. And the fact that they quantify what can otherwise be very qualitative judgements is also useful.

So with the support of others in the AV Preservation community, we’ll be training our minds on how to interpret QCTools data, and subsequently creating some objective QC standards for our work.  The aim will be to increase confidence in our results without wasting time running after false positives that result from poorly chosen thresholds on any of the characteristics we track.
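To make the threshold trade-off concrete, here is a minimal Python sketch using made-up numbers. It is not QCTools code and does not use any QCTools API: the per-frame metric, the function names, the window size, and both threshold values are hypothetical, chosen only to illustrate the idea. It compares a beginning/middle/end spot check with a whole-file scan, and shows how a threshold set too low starts flagging ordinary variation.

```python
def spot_check_frames(n_frames, window):
    """Frame indices a beginning/middle/end spot check would cover (hypothetical helper)."""
    mid_start = max(0, n_frames // 2 - window // 2)
    covered = set(range(0, min(window, n_frames)))
    covered |= set(range(mid_start, min(mid_start + window, n_frames)))
    covered |= set(range(max(0, n_frames - window), n_frames))
    return covered


def flag_frames(values, threshold):
    """Indices of frames whose metric value exceeds the threshold."""
    return [i for i, v in enumerate(values) if v > threshold]


# Made-up per-frame values for some measure of picture disturbance.
values = [1.0] * 100
values[30] = 9.0  # a genuine problem, outside the spot-checked regions
values[70] = 2.5  # ordinary variation, slightly above the baseline

covered = spot_check_frames(len(values), window=10)
strict = flag_frames(values, threshold=5.0)  # catches only the genuine problem
loose = flag_frames(values, threshold=2.0)   # also flags the harmless frame
missed = [i for i in strict if i not in covered]

print("flagged with threshold 5.0:", strict)
print("flagged with threshold 2.0:", loose)
print("flagged frames a spot check would not have seen:", missed)
```

In this toy example the whole-file scan finds the problem at frame 30 that the spot check never looks at, while the lower threshold adds a false positive at frame 70. Real thresholds would have to be worked out from actual transfers, which is the standards-setting work described above.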

We are glad to talk about the subject further with anyone interested.  Stop by sometime!

Long distance usability testing for arXiv

The arXiv development team has been working on a new interface for its volunteer moderators, in order to make their work easier and to decrease the workload for arXiv administrators. When interface development reached a point where we needed to test the result with real live users, we were faced with the interesting challenge of conducting the tests at a distance, as arXiv moderators are distributed all over the world. Thanks to the availability of web conferencing tools, this turned out to be a practical option.

Before we got started, we sought out the advice of Gaby Castro Gessner and Nick Cappadona on conducting usability tests remotely. Their advice included the following suggestions:

  • Conduct a technology test meeting to make sure testers have the conferencing application installed with working audio, know how to use the application, and in particular, know how to share their screen.
  • Send tasks to testers ahead of time via email, particularly for testers for whom English is not their first language, as the web conference chat function can be cumbersome to use when a test is in progress.
  • Offer testers a “lifeline” – a phone number to call or other backup plan in case something goes wrong.
  • Send testers informed consent information in advance of the test.

How we did it

I was surprised at how much preparation was required to pull this off, including a lot of scheduling and drafting of correspondence:

  • Develop the test, then test and revise it
  • Select dates for technology tests and usability tests
  • Reserve meeting room for usability tests
  • Invite moderators to sign up for testing slots (Doodle poll)
  • Select testers and thank those who weren’t selected
  • Write and distribute technology requirements and Zoom set-up instructions to testers
  • Schedule and set up Zoom meetings for tech tests, conduct tests
  • Schedule and set up Zoom meetings for usability tests
  • Recruit arXiv staff to take notes
  • Draft and share informed consent information with testers
  • Conduct usability tests
  • Debrief after tests and clean up notes
  • Produce a usable summary for follow up

With the help of Jim Entwood, arXiv Operations Manager, we designed a task-based test that required testers to try to complete the most common and important actions in the interface. Because the work of moderators is highly specialized and has its own distinct conventions and vocabulary, we needed to test the test with someone who “speaks” arXiv. Rebecca Goldweber, assistant arXiv administrator, ably helped us test the test and refine questions and tasks. Chloe McLaren took copious notes. And in the end, we conducted six usability tests, resulting in some very useful and specific feedback for improving the moderator interface.

Advice we’d share

The single most important thing is probably to allow plenty of time to develop the test, recruit testers, and make all the arrangements. From start to finish, the process took about one month. That might sound slow, but developing and testing the test takes time, as does writing invitations and explanatory emails, and corresponding with, recruiting and scheduling testers. I’m sure it will be faster next time, now that we have some experience as well as email text that we can reuse. We look forward to doing more of this as arXiv makes more user-facing changes and enhancements.

Thanks to the arXiv development team and DSPS UX staff for their work on the new moderation interface: Brandon Barker, Brian Caruso, Martin Lessmeister, and Melissa Wallace. And thanks to the arXiv moderators who volunteered and participated in the tests!

Digital Collections Promotions at CUL

By Marsha Taichman

As part of a Digital Scholarship and Preservation Services (DSPS) fellowship, I examine how the library promotes digital collections. For more information, see a blog post about my broader goals for the project.

At Cornell University Library, we excel at creating digital collections and refine our workflows on an ongoing basis to increase efficacy. We work hard to build these collections, but once they have been created, it is often difficult to focus on the promotion of finished collections, due to deadlines on subsequent projects. It occurred to me that we would benefit from a kind of protocol for post-production activities that we could use to guide collection promotion. This is not to say that each collection would go through the exact same process, but a checklist would give faculty members and curators something to consult when launching a new collection.

At the end of this post, you will find a checklist that the library can use to promote collections.

There are many things to consider before building a digital collection and after it is created. Perhaps the most important questions to keep in mind are: Who are your users? How can you create digital objects, including their accompanying metadata, that support users? Designing to support users and making users aware that the collections are available is good practice.

With the help of DCAPS and Assessment and Communication, I came up with a checklist for digital collection promotion. It was my hope that the most recent Architecture, Art and Planning grant project related to Sri Lankan vernacular architecture would be completed before the end of this fellowship so that I could pilot the checklist with a collection that I had worked on, but the project is still in process.

The most consulted collection in the Digital Collections Portal (DCP) by far during the fall 2016 semester (September 1 to December 2, 2016) was PJ Mode’s Persuasive Cartography. With this in mind, I reached out to Mode to try to understand why this might be the case.

We all know that collection promotion is important, Mode explained. As popular as the website is, he wishes that it were more widely known, but everyone is competing for clicks. Mode collected the maps that were digitized for this site for many years, and kept up a contact list of collectors, dealers and people who took interest in his collecting. When the website was launched, he wrote an email to all of his contacts (about 60 addresses), and wrote personal messages to others explaining the project. This meant that from the outset, many people were aware of the site, even if they weren’t following it closely.

Mode has received much promotion for his collection via blogs. He told me, “In every area of human endeavor, there are people who blog.” Accordingly, people who blog are always looking for new content. With this in mind, he tried to identify the bloggers who are interested in maps. Once he identified these individuals, he would compose an introductory email and tell them about his website, including links in his message so that they would not need to look the site up. Virtually every time Mode sent an individual email, it resulted in a blog entry related to his site. The website Atlas Obscura published seven stories on his maps, and National Geographic online had an article about how maps can be used as data. In the latter instance, the author, Geoff McGhee, found Mode’s maps by reading The Map as Persuader, on the website Big Think.

There are some blogs where Mode would like his work to be included, such as Musings on Maps, and he intends to approach writers when the next installment of the website is available and working smoothly (500 additional maps are being scanned and cataloged at present). He would like to make headway into the popular press with his next installment of maps on the Persuasive Cartography website and have articles in The Wall Street Journal and The New York Times. Since the site launched in 2015, Mode has given four map-related talks in Princeton, New York City, Washington, and Denver, and he uses these opportunities to discuss the website. He also looks for ways to have his site discussed in scholarly journals, such as The Portolan and The Journal of the Washington Map Society.

US Department of State, Illustrierte Karte der Vereinigten Staaten von Amerika mit Darstellung der regionalen Bodenschätze, Produkte und landschaftlichen Besonderheiten (Illustrated Map of the United States of America Showing the regional Natural Resources, Products and Features), 1958. Persuasive Cartography: The PJ Mode Collection.

In our discussions, Mode astutely noted that much of the success of promoting his maps has to do with the communities of interest, the involvement of the curator and the inherent interest of the collection, and that there is no way around the fact that some things have more visual appeal than others. He suggests that the library find a champion for each collection and have the champion figure out who would be interested in the content globally. This is something that DSPS is starting to do by assigning stewards within the library for each collection as it is being created. That said, we do not always have deep expertise in the collection areas that we are supporting. Perhaps it could be built into the library staff member’s role to research not only the content of the site but also the people who would benefit from using it. It would also be advantageous to get library content into DPLA so that more people would happen upon it.

Talking to Mode about Persuasive Cartography was inspiring, and it raised many issues for me. The project is very much Mode’s own, and he invested a great deal of time and money in making the website a resource that is used. Often, the collections that we build in DSPS are grant projects, and are very closely related to faculty research and teaching. We don’t do much to hold these creators and curators accountable for collection promotion, though they are the ones who are shepherding these digital images into being. As part of the application process for Arts & Sciences and Architecture, Art and Planning grants, it would be useful to have applicants commit to collection promotion. If they could explicitly state where and how this potential collection could be used, the library could assist with making those connections. We could hire research assistants in different subject areas, as we have with students who do metadata entry for collections, and have these assistants find out which communities and professional organizations would benefit from these new collections and what kinds of outreach (such as emails, blog posts or articles) would be most productive in attracting audiences.

For what is the purpose of building a collection if it will not be used?

Michelangelo Caetani, Veduta Interna Dell’ Inferno, 1855. Persuasive Cartography: The PJ Mode Collection.

Checklist for Digital Collection Promotion

Consult with Assessment and Communication about a month prior to release of collection.  They will help you through the following steps:

  1.      Identify audiences likely to be interested in the collection
  2.      Craft effective message to highlight benefits of collection
  3.      Identify channels to reach relevant audiences
  4.      Consider ownership of collection and collaborate with owning college or department as appropriate
  5.      Consult on copyright issues as needed

Generally the most useful communication channels are:

Appropriate print and online news sources can be reached via Assessment and Communication, such as:

  • Cornell Chronicle
  • Cornell Daily Sun
  • Ezra Magazine
  • Discipline-specific publications
  • Special interest publications

What listservs should receive messages about the collection? Reuse core messages created for news sources.

  • All Cornell collections should be publicized to CU-LIB as they are created
  • Announce to library liaison listserv
  • Enlist help from faculty members responsible for the collections and leverage their professional networks
  • Research discipline-specific listservs

Make sure that the project goes on the Cornell Digital Collections page.

Consider if promotional material (bookmarks, flyers, rack cards, etc.) is needed and distribute in relevant departments.

Social media – in addition to unit social media accounts, CUL social media accounts can push messaging out to larger audiences (again, consult Assessment and Communication).

Include links to collections in Wikipedia under relevant subject headings.

Blog post (a day in the life of a collection, a collection description, etc.).

Preserving one “Window on the Past” for the future

Cornell has recently deposited its Making of America (MOA) collection into HathiTrust. Through this process, we have transitioned one of our oldest digital collections to a contemporary architecture, improved the quality of both page images and optical character recognition (OCR) textflow, and made the content available to scholars using both traditional and computational methods. Our experience illustrates how advances in technology over two decades can improve access to online content, and how migration of legacy content provides us natural points of opportunity to do just that.

About the collection: The Making of America Project was an early collaboration between Cornell and the University of Michigan that pioneered methods of digital preservation. In 1995, with funding from the Andrew W. Mellon Foundation, the two institutions developed a digital collection documenting American social history from the antebellum period through Reconstruction. At Cornell, well over one thousand volumes were selected, scanned, and made available online as a collection in the repository named “Windows on the Past”, enabled by an architecture called DLXS.

In the intervening two decades, there have been a multitude of changes in the capabilities of repository architectures and the maturity of inter-institutional efforts. As with all technology, DLXS entered sunset. Its last release was in 2010, the same year that Cornell joined HathiTrust. When considering how MOA would best persist, deposit to HathiTrust was especially attractive. HathiTrust is a TRAC-certified digital preservation repository, aligning well with Cornell’s commitment to this content. Preservation costs for open material like MOA are underwritten by all members (see the HathiTrust Digital Library’s (HTDL) cost model). Items can be searched by bibliographic details as well as in full text. Further, the full-text indices are shared with the HathiTrust Research Center, for access by scholars using computational methods. Deposit to HathiTrust enriches a publicly available common good, leverages new methods for scholarship, and gives the MOA “Windows on the Past” content a clear path into the future. But although the gains were clear, the path was not. Retrofitting legacy materials for contemporary repositories is a territory with some unique challenges that took creativity, diverse skills, and a fair amount of persistence to meet.

What is a volume? The problem of records. The first challenge for us was the lack of any association or arrangement reflecting physical volumes in the DLXS architecture. HTDL requires deposits to be accompanied by item-level bibliographic metadata. When we deposit volumes, we create this metadata from our catalog, typically using a combination of title-level and item-level identifiers. However, DLXS metadata lacked any such identifiers. Additionally, the original books were disbound before scanning, and discarded afterwards. New pages were printed from the scans, but they weren’t always bound in the same enumeration as the old volumes. This led to a fairly fluid relationship between the pages as seen in DLXS and the actual bound volumes on our shelves. Our mapping of metadata from our catalog, then, was a combination of automatic and manual processes, with careful troubleshooting of the lingering cases where volume enumeration didn’t match up. Along the way, we learned a tremendous amount about our catalog records, including how unevenly we recorded our digital project information, and where to find all of the documentation that enables a true harvest!

Improving Page Images. As in any digitization project, the original process for page-by-page capture had some hiccups. Sometimes a page was missed, or the image capture itself had quality issues. Pages that could be improved upon were captured a second time. We are certain that the intention was always to have these integrated into the MOA pages, but during migration, we discovered that this process had not been completed. The point of migration allowed us to insert these improvements where they belonged while we were restructuring the volumes for packaging.

OCR improvement. One of the frustrations some had with the original project was that the OCR did not faithfully reflect the textflow of the original pages. Although every attempt was made in the original project to use the best OCR engines available, the simple truth is that OCR of the mid-1990s was not even close to the state of the art of OCR today. OCR quality impacts search inside the book and, for the print disabled, the accessibility of the content itself. Once again, leveraging the migration process as a point of opportunity, we were able to improve the OCR to modern standards.

Structural metadata. Users of online material don’t directly see structural metadata, but they use it all the time. Structural metadata allows navigation through titles, volumes, issues, and chapters, allowing the reader to easily find the page they are seeking. DLXS already held good structural metadata from the original project, and we didn’t want to lose it. At the same time, this information had to be translated into markup that the HTDL could use, and divided into packages that reflected the separate physical volumes. In the end, scripting came to our rescue, allowing us to reliably repurpose the old structural metadata at scale.
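To give a flavor of the kind of scripting involved, here is a minimal sketch in Python. It is not the script we used: the record fields (`volume_id`, `seq`, `label`, `feature`) and the sample values are hypothetical stand-ins for whatever the legacy system actually stores, and the output is a plain dictionary rather than HTDL markup. It shows only the regrouping step, where a flat list of page records is divided by physical volume and each volume's pages are renumbered from 1.

```python
from collections import defaultdict


def split_by_volume(pages):
    """Group flat page records by volume and renumber each volume's pages from 1."""
    volumes = defaultdict(list)
    for page in pages:
        volumes[page["volume_id"]].append(page)

    packages = {}
    for volume_id, vol_pages in volumes.items():
        # Keep the legacy page order, but restart the sequence inside each volume.
        ordered = sorted(vol_pages, key=lambda p: p["seq"])
        packages[volume_id] = [
            {"seq": i, "label": p.get("label", ""), "feature": p.get("feature", "")}
            for i, p in enumerate(ordered, start=1)
        ]
    return packages


# Hypothetical legacy records; field names and values are illustrative only.
legacy_pages = [
    {"volume_id": "vol-b", "seq": 301, "label": "1", "feature": "CHAPTER_START"},
    {"volume_id": "vol-a", "seq": 2, "label": "ii", "feature": "TABLE_OF_CONTENTS"},
    {"volume_id": "vol-a", "seq": 1, "label": "i", "feature": "TITLE"},
]
packages = split_by_volume(legacy_pages)
```

Keeping the printed page label and the structural feature alongside the new sequence number is what lets the navigation survive the move, even though the page numbering inside each package starts over.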

Package and upload. The narrative above makes our experience sound like a smooth progression where all volumes moved through an assembly line from start to finish. In actuality, we managed the deposit in three iterative phases, each of which took a progressively larger set of volumes through the whole process from start to finish. This allowed us to “stop the line” between sets, learn from any failures and make necessary improvements in the assembly line itself. The first set was quite small, about 20 volumes. We made many mistakes, and as a result experienced a variety of failures during upload. Working through corrections gave us the opportunity to improve our processes and also to anticipate scaling up the things that were already working well. The second set was about 100 volumes. We experienced fewer failures, and made further adjustments. The last set was over 1,000 volumes, deposited without incident.
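The phased approach can be expressed as a short Python sketch. This is an illustration under assumptions, not our actual tooling: `run_phases` and the `deposit` callback are hypothetical names, and `deposit` stands in for the whole package-and-upload step, returning `True` on success. The function works through progressively larger sets and stops the line as soon as a set has failures, so the process can be fixed before the next, larger set.

```python
def run_phases(volumes, phase_sizes, deposit):
    """Deposit volumes in progressively larger sets, halting after any set with failures."""
    remaining = list(volumes)
    report = []
    for size in phase_sizes:
        batch, remaining = remaining[:size], remaining[size:]
        failures = [v for v in batch if not deposit(v)]
        report.append({"attempted": len(batch), "failed": failures})
        if failures:
            break  # stop the line: fix the process before scaling up
    return report, remaining


# Hypothetical example: ten volumes, one of which fails in the first small set.
volumes = [f"vol{i}" for i in range(10)]
report, remaining = run_phases(volumes, [2, 3, 5], lambda v: v != "vol1")

# After corrections, a rerun in which every deposit succeeds goes all the way through.
clean_report, clean_remaining = run_phases(volumes, [2, 3, 5], lambda v: True)
```

Starting with a small set keeps early mistakes cheap: a failure in the first set halts the run after a handful of volumes rather than after a thousand.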

Gratitude. Work like this takes many hands and diverse skills. Many thanks to George Kozak, whose deep experience with DLXS and able scripting supported the work of page image improvements, structural metadata translation, and initial packaging. Mira Basara’s skills in OCR allowed us to leap decades ahead in the quality of textflow. Mira also managed much of the final packaging and handled the majority of communication with HathiTrust. Thanks are owed to Gary Branch for his help in harvesting records. A special thank you goes to Aaron Elkiss at HathiTrust for his patience with our first efforts to package at scale, and his advice on how to transform these into more successful efforts. Finally, my thanks to practitioners of digital preservation, past, present and future; techniques and capabilities may change over the years, but our commitment does not!
