It is with much enthusiasm that we begin digitization of the Kheel Center’s collection of Collective Bargaining Agreements. Over the next two years, Digital Consulting & Production Services (DCAPS) will digitize upwards of 2000 agreements, representing contracts from the American educational services and retail industries. The series selected range in length from two pages to two hundred, and span from the 1930s to the 1980s.
Barb Morley, digital archivist within Kheel, delivered the first shipment of Collective Bargaining Agreements to DCAPS for digitization in late July. This consisted of 2 linear feet of agreements from the educational sector: a mix of bound and unbound items, and of color and black-and-white material. To digitize the material as efficiently as possible, we established a workflow that identifies agreements that can be easily disbound for scanning with a Fujitsu sheetfed scanner. Items that cannot withstand disbinding will be scanned with a Zeutschel 10000TT overhead camera. So far we have scanned the first shipment and are almost through the second. All scanned agreements will undergo rigorous quality control inspection as well as optical character recognition (OCR) for search and discovery of the content within. Currently we are right on schedule, and expect things to progress smoothly.
Once digitized and online, these CBAs will provide unprecedented access and opportunity for historians, social scientists, and the general public to analyze the role of organized labor in America. The project is being funded by the National Historical Publications and Records Commission, an independent federal agency that preserves and shares records with the public. We are thrilled to be a part of this exciting and important initiative.
The DCAPS project team includes Bronwyn Mohlke, project management and quality control; Shakhya Bodhiwamsa, lead digitization technician; and Mira Basara, OCR specialist.
Coordinator, Digital Consulting & Production Services
In July-August 2014, Cornell University Library (CUL) and the Society for the Humanities co-sponsored a second year of their five-week summer fellowship program for graduate students in the humanities.
Piloted as an internship in Summer 2013, this program was inspired by the recognition that humanities graduate students at Cornell need additional opportunities to develop digital skills and knowledge that will be increasingly necessary in academic job markets. The Fellowship’s primary aim is to provide graduate students with the time and technical support to explore digital scholarship tools and platforms in ways that complement their own scholarly and pedagogical goals.
The program brings together a small cohort of graduate fellows for an intense 5-week fellowship period. Fellows spend approximately half their fellowship time in workshops and discussions; the other half they spend creating a small-scale digital project of their own, with inspiration, guidance, and technical support from Cornell faculty and CUL staff.
We were thrilled to receive nearly three times as many applications in 2014 as we received in 2013, and we are already planning to expand the program in 2015. Stay tuned for news of a public showcase of fellows’ work this Spring!
2014 Summer Fellows:
Jason Blaesig, Anthropology
Project: multimedia site in Scalar combining anthropological field recordings and translations of Peruvian folklore
Liz Blake, English
Project: topic modeling the text of James Joyce’s Ulysses
Kaylin Myers, Medieval Studies
Project: online compilation and interactive translation of Old English Body and Soul homilies
Jake Nabel, Classics
Project: compilation and translation of ancient Parthian inscriptions, including an online Parthian grammar
Mia Tootill, Musicology
Project: interactive map and collection in Omeka using Neatline to visualize the location of opera venues in 19th century Paris
Professor Timothy Murray, Director of the Society for the Humanities
Oya Y. Rieger, Associate University Librarian for Digital Scholarship and Preservation Services
Bonna Boettcher, Interim Director of Olin and Uris Libraries
Prof. Edward Baptist, History
Prof. David Mimno, Information Science
CUL Program Staff
Mickey Casad, Coordinator, CUL – DSPS
Virginia Cole, CUL – Olin/Uris
John Handel, CUL – DSPS
Michelle Paolillo, CUL – DSPS
…with many thanks to:
Jenn Colt, CUL – DSPS
Jason Kovari, CUL – LTS
Danielle Mericle, CUL – DSPS
Susette Newberry, CUL – Olin/Uris
Jaron Porciello, CUL – DSPS
Anne Sauer, CUL – RMC
Melissa Wallace, CUL – DSPS
Florio Arguillas, CISER
Patrick Graham, Academic Technologies
Patrice Prusko, Academic Technologies
In February of 2013, Cornell University Library in collaboration with the Society for the Humanities began a two-year project funded by the National Endowment for the Humanities (NEH) to preserve access to complex born-digital new media art objects. The project aims to develop a technical framework and associated tools to facilitate enduring access to interactive digital media art with a focus on artworks stored on hard drive, CD-ROM, and DVD-ROM. The ultimate goal is to create a preservation and access practice for complex digital assets that is based on a thorough and practical understanding of the characteristics of digital objects and requirements from the perspectives of collection curators and users alike. Digital content that is not used is prone to neglect and oversight. Reliable access mechanisms are essential to the ongoing usability of digital assets. However, no archival best practices yet exist for accessing and preserving complex born-digital materials. Given our emphasis on use and usability and our recognition that we must develop a framework that addresses the needs of future as well as current media art researchers, we developed a survey targeting researchers, artists, and curators to expand our understanding of user profiles and use cases. The purpose of this article is to summarize the key findings of the survey.
About the Project
Despite its “new” label, new media art has a rich 40-year history, making loss of cultural history an imminent risk. Experiencing a media artwork requires machines that are themselves vulnerable to technological obsolescence. This is especially true of digital art, which requires hardware and software support and is often stored in fragile formats. Although the NEH-funded project uses the Library’s Rose Goldsen Archive of New Media Art as a testbed, our ultimate goal is to create generalizable new media preservation and access practices that are applicable for different media environments and institutional types.
Named after the late Professor Rose Goldsen of Cornell University, a pioneering critic of the commercialization of mass media, the Goldsen Archive was founded in 2002 by Professor Timothy Murray (Director, Society for the Humanities, Cornell University) to house international art work produced on portable or web-based digital media. The archive has grown to achieve global recognition as a prominent collection of multimedia artworks that reflect aesthetic developments in cinema, video, installation, photography, and sound. We estimate that about 70 percent of CD-ROM artworks in the Goldsen collection already cannot be accessed without a specialized computer terminal that runs obsolete software and operating systems. Because of the fragility of storage media like optical discs, physical damage is also a serious danger for the Goldsen’s artworks on CD-ROM and DVD-ROM, many of which are irreplaceable. Even migrating the information files to another storage medium is not enough to preserve their most important cultural content. Interactive digital assets are far more complex to preserve and manage than single, uniform digital media files. A single interactive work can comprise an entire range of digital objects, including files in different types and formats, applications to coordinate the files, and operating systems to run the applications. If any part of this complex system fails, the entire asset can become unreadable.
In January 2014, we announced the questionnaire on several preservation, art, and digital humanities mailing lists. We had a total of 170 responses, 122 of them responding as an individual researcher or practitioner and 48 responding on behalf of an archive, museum, or a cultural heritage institution. Out of 170 respondents, 80 fully and 32 partially completed the survey, and 58 of them only took a quick look without responding. We are not sure if the incomplete survey rate is due to time limitations of the respondents or indicates unfamiliarity with the program area. We did not observe any significant differences in the responses of these two groups (personal and institutional responses), probably due to the fact that even at an institutional level, new media projects and collections are led by small teams or sometimes individuals. Respondents held multiple roles and characterized themselves as artists (48%), researchers (47%), educators (25%), and curators (20%). Almost 24% identified themselves as archivists, conservators, project managers, digitization specialists, or technical developers. The scope of digital media art collections they worked with was also broad, including digital installations, digital video and image, interactive multimedia, raw audio files, born digital artwork, 3-D, video art, and websites. Genres emphasized in their media art research included installation/performance/media sculpture, video/cinema, and interactive artists portfolios. Respondents were interested in several platforms, the most common ones being personal computers/devices, locative media installation/sculpture performance, web-based art works, and hardware peripherals. Among the countries represented were the US, Germany, France, UK, Australia, and Argentina.
We posed an open-ended question to inquire about the research questions that guided respondents’ interactions with media works. It is difficult to characterize or summarize their broad range of involvements, as the research frameworks referenced were almost equally distributed among the contextual categories of artistic, social, historical, cultural, aesthetic, and technical. However, some of the noteworthy research angles mentioned in the responses included:
- Social change – how technologies are assisting exploration of political stories, strategies to mitigate problems of born-digital to work towards a system of advocacy and lobbying, implications of social identity (for example, gender) in digital media artworks
- Digital divide – accessibility of digital art for individuals with lower socioeconomic backgrounds and artists’ role in reaching out to a diverse population
- Role of technologies in supporting and stimulating community and researcher engagement, presentation of news and actual events through art
- Interpretation of artists’ intentions – what is being communicated through the artwork, interactive power of technologies, imaging future use – e.g., how will the art object be used/viewed in 20 years?
- Historical perspectives – how certain technologies have been used in art, evidence of art-science collaboration – synergy
- Affordances of digital media and digital spaces – if and how digital works explore something further than the analog approaches, embodied and social user interactions.
- User-response oriented analysis – the role of viewers’ background in interpreting digital art work and interactive narratives, effects of image and sound on audiences, social and political effects of technology.
- Characteristics of influential artworks & relation of historic art work to present-day questions, searching for works of art for classroom teaching
- Long-term preservation challenges and requirements for retrieval and documentation of digital art works for research and learning from users’ perspective. Sustainability of digital content and role of crowdsourcing
- Device requirements for accessing and experiencing the artwork – role of viewing environments (e.g., if an artwork is meant to be seen on an old TV set)
- Authenticity and documentation: How can documentation capture the essence of highly interactive works, for instance live performances?
Respondents cited a number of serious impediments encountered in conducting research involving new media art. These impediments were technical, institutional, and cultural in nature. For example, respondents mentioned lack of documentation, technological challenges such as migration and emulation, costs and lack of understanding of costs, legal issues and access limitations, missing connections between similar archives (lack of unified discovery & access), digital divide, insufficient metadata, and hardware and software dependencies. Several of the respondents expressed their unease about the disappearing web-based art and ubiquitous broken links. One respondent noted, “In a society that is rushing headlong into the future, it is vital that we preserve the efforts of those who have early works in this new culture.” One of the respondents pointed out that due to a general “disinterest in preserving the cultural artifacts of the digital age,” there was a lack of understanding of the importance of these objects for cultural history. Another comment was about the infrequent access requests and therefore difficulties in justifying investment in preservation efforts for future use.
The respondents who use new media collections in support of teaching and learning listed several impediments such as vanishing webpages, link rot, poor indexing, a gap for works from the 80s and 90s, and the lack of quality documentation. One of the respondents wrote, “Some work becomes very easy to make when the technology evolves and the students don’t understand how it was important, or how it was a challenge to produce at the time.” This statement underscores the importance of documenting cultural context to situate the work from artistic, historic, and technical perspectives.
We inquired about respondents’ documentation needs and preferred strategies in cases where full interactive access is not possible. Again, there were several suggestions:
- Providing textual description of content
- Capturing video documentation of use such as walkthrough video with voiceover
- Recording audience perspectives, interpretations, and reactions
- Maintaining artists’ notes
- Offering blogs such as the British Library’s Endangered Archives Programme to build awareness about the threats to digital archives
- Describing the technology in context to its time to understand and appreciate the available technological and artistic tools
- Collecting contextual materials – exhibition announcements, brochures, resumes, etc.
- Capturing metadata including MANS (Media Art Notation System), OAIS, PREMIS, TOTEM (the Trustworthy Online Technical Environment Metadata Registry)
When we asked the respondents about the preservation measures undertaken for their own art work, we again received a combination of different strategies. Some were common ones such as archiving hard drives, keeping backups of software, and maintaining redundant storage. They also mentioned maintaining a blog with information about the art work, web publishing for open and broad access, videotaping user interactions, taking screen shots, and creating short videos about the work. Several respondents made reference to the fact that some of their early works no longer existed or worked.
For practicing artists, there were several concerns about the longevity of their creative expressions. One individual expressed concern about being unable to sell works because they may become obsolete within a year. They worried that it was difficult to archive immersive installations, interactive Flash pieces, and work with dependency on external files. They also mentioned copyright issues as a significant impediment. There were also several comments such as the following ones articulating anxiety over future use:
[My work] will stay forever in storage and will never be re-activated.
I am worried about context and artistic intent – how do we retain authenticity in the long term?
The question about which archiving and access practices affected respondents the most in their creative and professional work also generated thoughtful responses. Here are some examples:
Access to past works are incredibly valuable to me - understanding works not just for their message but also for their technical [aspects] help new media artist evolve the area of practice.
I think museums tend to see my books as a treasure when they were created to be used.
Knowing where artworks and their documentation are kept. Individual sites that do not often appear very high up in search engine results.
What is complex media object? If it is performed or presented, it can be power point or a photo essay.
For curators, the following comments illustrate the biggest concerns:
Probably the biggest impact is in teaching. One is continually trying to explain a work that one has seen in the past without the ability to actually show it.
I know [the art works] will become obsolete as running objects so the best thing I can do is push as much data about them out onto the Internet as possible.
Allowing original context of the artwork in the audience experience
Only twenty-four of the respondents indicated that their institutions include born-digital interactive media artworks and artifacts in their holdings. Several of the respondents indicated that they don’t include born-digital interactive media in their holdings because such materials fall outside of collection scope. In some cases, they noted that procedures for providing access are too complex or unsustainable, or cited technological challenges and lack of local support.
Twenty respondents answered the access and preservation related questions on behalf of an archive, museum, or a cultural institution. Only one organization mentioned having a sophisticated and integrated web-based discovery, access, and preservation framework. The others indicated that access had to be arranged specially, for example by appointment. They indicated that a full range of users are supported: students, faculty, researchers, artists, hobbyists, and the general public, such as museum visitors. They mentioned a range of preservation strategies they rely on, including migration, creation of search and discovery metadata, maintaining a media preservation lab, providing climate-controlled storage, and collecting documentation from the artists. They named several challenges to preservation, many stemming from lack of resources or difficulties associated with executing artist interviews. The conservation measures were sometimes triggered by exhibition plans and some indicated that they were working on clarifying policies. They also noted that the measures taken to secure access, preservation, and migration rights varied from case to case.
The data we have gathered further strengthened our opinion that identifying the most significant properties of individual media artworks will require direct input from artists. This confirms our belief that we need to push the integration of archival protocols as far upstream as possible, to the point of content creation and initial curation. We plan to adapt pre-existing conservation-oriented questionnaires to our emerging data model and our growing sense of media art “classes” with distinct preservation and access needs. We plan to solicit the contributions of artists in the test collection for this specific NEH-supported project. We will simultaneously revisit our rights agreements with the artists, which never anticipated access strategies based on emulation.
A recurring theme in our findings involved the difficulties associated with capturing sufficient information about a digital art object to enable an authentic user experience. This challenge cannot and should not be reduced to the goal of providing a technically accurate rendering of an artwork’s content. So much of new media works’ cultural meaning derives from the users’ spontaneous and contextual interactions with the art objects. Reproduction of an artwork’s digital files does not always ensure preservation of its most important cultural content. It is essential that we anticipate the needs of future researchers and acknowledge the core experiences that need to be captured to preserve these artifacts. For a work to be understood and appreciated, it is essential to relay a cultural and technological framework for interpretation. Some works that come across as mundane now may have been highly innovative trailblazers of yesterday. Given the speed of technological advances, it will be essential to capture these historical moments to help future users understand and appreciate such creative works.
The preservation model to be developed will apply not only to new media artworks but to other digital media environments. Therefore we are hoping that this project will inform digital preservation services at libraries, archives, and museums to support future uses in learning, teaching, research and creative expression by scholars and students. We will further elaborate our findings in a future article. Stay tuned!
Oya & Mickey
On behalf of the project team:
Timothy Murray & Oya Rieger (co-PIs), Mickey Casad (Project Manager), Dianne Dietrich, Desiree Alexander, Jason Kovari, Danielle Mericle, Liz Muller, Michelle Paolillo, & AudioVisual Preservation Solutions
In an audiovisual preservation workflow, there is a bit of a wormhole effect to each decision you make. For instance, when choosing whether to accommodate the digitization of a new format in-house, one must consider long-term support for the equipment involved, including cleaning, maintenance, tools and supplies, as well as technical expertise. All of these things add up to two critical things: time and money.
In a preservation workflow, it’s not always as simple as hooking a VCR (assuming you still have one) up to a computer. You might think: If you need a DVD copy of something or a CD dub of an old recording, then that’s fairly easy given today’s technology, right? As with the conservation of a manuscript or painting, there are general requirements and standards widely accepted and used by the preservation community when trying to digitally preserve unique AV content. Instead of scanning an image at high-resolution, rebinding a brittle book, or making a squeeze more interpretable, you’re trying to capture an electric signal. The quality of that signal can make all the difference and consumer grade electronics weren’t designed to produce a broadcast-grade signal.
It first requires routinely tested, cleaned, professional and broadcast-grade equipment. This equipment is not cheap and the cost of the items, parts, and repair expertise is rapidly increasing due primarily to format obsolescence. Legacy formats like magnetic tape are machine-readable and completely reliant on the proper technology to be interpreted. Most of the major manufacturers of magnetic media and the devices needed to play it have disappeared. Next, one must consider the quality of the analog-to-digital converter. Organizations like the International Association of Sound and Audiovisual Archives (IASA) and the Audio Engineering Society (AES) have set guidelines for the quality of analog-to-digital converters for audio preservation. Standards for video (RF) conversion and digitization are still being worked on by various organizations and institutions. Finally, you need a computer capable of 10-bit (or higher) capture to a linear-based editing software platform. As you can see, the decision to digitize a format in-house is a big one.
We have been carefully considering the formats we want to support in the digitization lab here at CUL. We want to address the major formats contained in our vast collections, balancing what we can achieve on-site with the services available through our vendor partners.
In the CUL AV preservation lab, we can now handle the following formats:
-Vinyl LP (33rpm)
-1/4” open reel audio (Stereo and 4-Track)
-all formats of non-tape based, digital-born audio and video
Another huge piece of the puzzle is maintaining the integrity of digital content. After something has been digitized, the work is not finished, as many institutions have learned. There are requirements for maintaining the integrity of this data. Audiovisual data could be lost or compromised due to bit rot, bit-level corruption, hardware failure and other kinds of data problems. We maintain integrity by running fixity checks, in the form of data comparisons, on a weekly basis on both of our digitization stations. This helps keep track of additions and changes to files in order to recognize digital problems and errors within our current projects. I hope that we can soon be pushing these preservation master files into CULAR, as we’re running out of space on our 24TB machines. That brings me to the biggest hurdle at this point: video master files are huge. 60 minutes of SD video footage at 10-bit resolution ends up producing a roughly 100GB file.
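For a rough sense of where a figure like that comes from, here is a back-of-the-envelope sketch in Python. It rests on assumptions the post does not state: NTSC standard definition (720 x 486 at about 29.97 frames per second), uncompressed 4:2:2 video in the common "v210" packing (6 pixels in 16 bytes), and video only, with no audio or container overhead. The function name is made up for illustration.

```python
from fractions import Fraction

# Assumed capture parameters (not stated in the post): NTSC standard
# definition, uncompressed 10-bit 4:2:2 video in "v210" packing, video only.
WIDTH, HEIGHT = 720, 486
FRAME_RATE = Fraction(30000, 1001)  # about 29.97 frames per second
BYTES_PER_6_PIXELS = 16             # v210 stores 6 pixels in 16 bytes


def uncompressed_video_bytes(minutes: int) -> int:
    """Estimate the size in bytes of an uncompressed 10-bit SD capture."""
    bytes_per_line = WIDTH // 6 * BYTES_PER_6_PIXELS
    bytes_per_frame = bytes_per_line * HEIGHT
    frames = FRAME_RATE * 60 * minutes
    return int(bytes_per_frame * frames)


if __name__ == "__main__":
    size = uncompressed_video_bytes(60)
    print(f"{size / 1e9:.1f} GB for 60 minutes of video")
```

Under those assumptions an hour of footage comes to just over 100 GB (decimal), which is in line with the figure above. A different frame size, codec, or container would change the result.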
I knew this coming in, but it’s become clear that deciding what formats to handle in-house and knowing when to outsource is crucial. I have developed close, working relationships with our vendors in order to minimize cost and effort while meeting our format and metadata needs. IT requirements are growing across the library and that is part of the cost burden we have to be mindful of. Danielle Mericle and I are working closely on how to adequately meet CUL’s audiovisual preservation needs while not over-extending our budget, scope, and the library’s general requirements. This is a big charge, but I’m happy to report that we’ve made huge strides toward meeting this challenge.
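The weekly fixity checks mentioned above can be sketched roughly as follows. This is a minimal illustration in Python, not the lab's actual tooling: the choice of SHA-256, the function names, and the manifest layout are all assumptions made for the example. The idea is to record a checksum for every file, then compare this week's record against last week's to see what was added, removed, or changed.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Checksum a file in chunks so large video masters never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Report files added, removed, or changed between two manifests."""
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }
```

In practice one would save each manifest (as JSON, for instance) and run the comparison on a schedule. A file that shows up under "changed" when nobody has edited it is the signal worth investigating.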
We are pleased to announce that the Cornell AudioVisual Preservation team is launching a campus-wide census which will gather important data regarding our ‘at-risk’ AV formats. We will assess condition, format stability, uniqueness, and scholarly value. This is an important first step in developing a more comprehensive preservation strategy. This pilot initiative is being jointly funded by Cornell University Library and CIT.
The challenges associated with audio-visual (AV) media preservation are significant: important scholarly material is at risk due to physical media degradation; metadata loss; player or format obsolescence; and rights issues. In addition to issues with legacy materials, there is mounting pressure on newly generated AV content, as scholars are now creating large-scale AV collections associated with their research projects. With new government regulations requiring data management plans for all grant-funded initiatives, it is imperative we begin to address the long and short-term needs to preserve and provide access to important audio-visual collections.
In many cases, we may not even be aware of high-value content on campus, as it lives outside of the normal avenues for collection development and maintenance (such as the Library). Instead such material is embedded in departments, in shoe boxes under desks, or as digital files living on isolated desktops. Even within the Library system there are collections that are not sufficiently described or preserved. Hence we are undertaking the census work as a first step to get a handle on the scope of the problem.
Our initial effort will be in the form of a web-survey, with scheduled follow-up site visits from representatives from our team. We hope to have wide-spread input from key stakeholders across campus, and encourage you to share this with your colleagues. If you have any questions, do not hesitate to contact us at firstname.lastname@example.org. For more information about our group and charge, go to: https://confluence.cornell.edu/display/CAV/Home
(by Peter Hirtle)
Of all the absurdities associated with the Authors Guild suit against Google over the Google Books Project, perhaps the greatest was the Guild’s efforts to make it a class action, with the 8,000 members of the Guild speaking for all authors everywhere in the world. Most academic authors realize that providing a keyword index to all published literature can only aid scholarship. At the same time, by making it easier to identify works that might be of interest, Google Books can only increase readership and sales of the original works. Yet at the time of the lawsuit, there was no organization that could speak for authors motivated by concerns that were not solely commercial.
Now there is. On 21 May, the Authors Alliance was formally launched in San Francisco. The Alliance is the brainchild of Pamela Samuelson, one of the foremost copyright experts in the country and an active voice in the Google Books cases. The Alliance recognizes that the primary motivation for most authors, including many academic authors, is to be read. Digital network technologies present unprecedented opportunities for the creation and distribution of creative works for the public good. Alliance members are not opposed to authors making money from their works; most of the members of its Advisory Board publish with trade publishers and have works that can only be purchased. But they recognize that there are some educational uses (including indexing) that do not need to be monetized. The Alliance will be a voice for moderation.
This is an especially auspicious time for the formation of the organization. Discussions have started in Congress about reforming copyright law. What has been a trade regulation for print media no longer works in a digital environment that exists on copying. A different ethos is needed if copyright is to meet its constitutional mandate “to promote the progress of science and useful arts.” The Authors Alliance has therefore developed a set of “Principles and Proposals for Copyright Reform” that reflect the interest of authors who write to be read and that will broaden the discussion in Washington.
This is why I was happy to become a Founding Member of the Authors Alliance and make a donation to its work. I would encourage anyone who is an author (of books or articles or any creative work) to look at the Alliance’s mission statement and goals and to consider joining as well. And if you don’t believe me, see Kevin Smith’s excellent post, “Why I joined the Authors Alliance.”
DCAPS is pleased to announce these recently launched digital collections:
The Mnemosyne Atlas explores the complex work of the 20th-century scholar Aby Warburg. A collaboration between DCAPS, the Cornell University Press, the Warburg Institute, and the German Studies Department, the site is a digital corollary to the CU Press publication, Memory, Metaphor, and Aby Warburg’s Atlas of Images, by UCLA professor Christopher D. Johnson. Drawing on Johnson’s extensive knowledge of and research into the Warburg Atlas, the site allows users to interact with Warburg’s panels independently or with the guidance of scholars (Johnson has the first “pathway” mapped, with more coming soon). The site was developed within DCAPS, and funded by the Arts & Sciences grants program as well as the Mellon Foundation. The development team from Cornell Library consisted of Kizer Walker, Director for Collection Development; Manolo Bevia, Lead Designer; Jen Colt & Melissa Wallace, Designers; Jim Reidy, Programmer; Jason Kovari, Metadata Lead; and Danielle Mericle, Project Manager.
This website is home to a number of collections revolving around the Ancient Mediterranean, including the Cornell Cast Collection, the Cornell Coin Collection, and the soon-to-be-developed Monumentum Ancyranum Squeeze Collection, and AD White Gem & Amulet Collection. Digitized from the analog collections dispersed across campus, the website pulls together these resources into an integrated platform so that users may discover these rare and precious resources in a unified fashion. Funding for the site was provided by the College of Arts & Sciences Faculty Grants Program, and multiple units, faculty members, and graduate students contributed to the creation of the digital collections. The development team from DCAPS consisted of Rhea Garen, digitization; Jason Kovari and Hannah Marshall, Metadata Design; Melissa Wallace, Designer; and Danielle Mericle, Project Management.
This fascinating site was a collaboration between the Department of Molecular Biology, the Racker family, and DCAPS. Originally a small effort to digitize a handful of VHS tapes from the Racker Lecture Series, the project quickly grew into a full-blown website with accompanying lectures (most of which are fully transcribed and searchable), historical information about Efraim Racker, and access to his extensive artistic output (in the form of journals and paintings). The DCAPS team consisted of Melissa Wallace, Designer; James Reidy, Programmer; Mira Basara, Luna Programmer and PDF support; Tre Berney and Madelaine Casad, Video digitization; Jason Kovari, Metadata; and Danielle Mericle, Project Manager.
A new workflow has made it possible for us to attend to quality issues within our Google-digitized books. As reported a few years ago, Cornell digitizes books by the thousands with our partners at Google, and the resulting digital books are deposited into the HathiTrust Digital Library. Google has many methods to maintain and even improve the quality of the individual page scans it makes, but occasionally something goes awry. The vast majority of the time the errors are detected and corrected before the book shows up in Google Books and before it is released for ingest into HathiTrust, but occasionally errors are missed. (The article by Kenneth Goldsmith “The Art of Google Books Scans” might serve as a sampler of the various things that can go amiss with scanned images: everything from images taken while pages are still moving, to the capture of the hands that are, quite literally, in the process of making our Google-digitized books.) Processes to correct our Google-digitized pages have long been cumbersome, requiring extensive decoding and analysis on the part of staff at Cornell, and so were considered not worth the disproportionate resources they required. As a result, over the past three years I have been collecting reports from HathiTrust of images in need of correction, with little I could do beyond apologetically acknowledging them.
However, recent changes at Google have tipped the balance of resources required to take part in the image correction process. Google has provided a web form that gives library partners an easier way to request corrections. More importantly, Google now has staff who perform much of the interpretation required to appropriately name the pages for insertion, appreciably lowering the barrier for our participation in the process. We still had to work out, on our end, how to engage local staff expertise at the Digital Management Group (DMG) for scanning while keeping our internal process as simple and easy as possible. There is also still plenty for me to manage, namely coordinating across three systems (Voyager, Google and HathiTrust) to make sure we all correct the right pages from the right book. But as we practice in our initial tentative experiments (we have had five to date, and all of them have been successful) we are learning how to cross-reference our communications with each other to make this easier. It is important to note that the success of the process is due to a large cooperative effort that includes staff at Google, HathiTrust, and Cornell. (Here I note a special thanks to Danielle Mericle, who is contributing DMG resources to this effort, and Bronwyn Mohlke, who is contributing her scanning expertise.) Together, we have all begun to chip away at the backlog of HathiTrust tickets reporting images for correction, improving those pages and closing those tickets, one by one.
If you notice pages in HathiTrust that need improvement, please use the feedback link in the footer of the page in the HathiTrust interface. This automatically opens a form that will ask for information helpful to resolving the problem. Submission of the form opens a tracking ticket, and often HathiTrust can resolve these issues with Google directly. When necessary, HathiTrust staff will escalate to the appropriate library partner for the correction process.
Project Euclid launched a new website in early January. Planning, design, and development of the site took place over 18 months, and included discussions with numerous researchers, publishers, and librarians. The new site maintains what people liked about the previous site, its performance and clean appearance, while incorporating many new tools and features.
Project Euclid is a not-for-profit, academically owned and operated initiative that provides electronic hosting, marketing, and sales services to publishers of mathematics and statistics literature. CUL conceived of and began developing Project Euclid in 1999-2000, and the Library ran the system for eight years on its own. Since 2008, Project Euclid has been jointly operated by Cornell University Library and Duke University Press. CUL maintains the technical infrastructure, and DUP handles customer-facing services like marketing and sales. Project Euclid currently hosts some 80 international publications, most of which are journals but with a growing book and conference series component. The initiative is completely self-supporting.
Project Euclid benefits greatly from other CUL activities, and this was especially true of the site redesign. Melissa Wallace, a DSPS web designer, designed the new site and brought to it many of the design considerations learned from CUL’s Discovery and Access efforts over the last several years. Shinwoo Kim and Martin Lessmeister, CUL-IT developers, helped implement new technologies in Euclid that are also being incorporated into the new Library discovery environment, and applied lessons learned from running arXiv.org. Both Martin and Shinwoo also bring years of experience with the system behind Euclid.
What are some of the new features of the redesigned Project Euclid website?
A new search and discovery tool. With this version of the website, Project Euclid introduced faceted searching. While research libraries are increasingly implementing this technology, it is still relatively new to academic content sites. Its use in Project Euclid represents a powerful new tool for navigating over 1.7 million pages of scholarship.
Table of contents (TOC) alerts. Project Euclid has offered RSS alerting services for several years, but users can now register to receive an email with the table of contents of a journal issue when that issue appears on the website. Users manage their TOC alerts through personal MyEuclid accounts, easily created by anyone.
Citation export. On TOC and search results pages, users may select one or more articles and export citation information in BibTeX, RIS, or a printer-ready format. RIS is a bibliographic format that many citation management tools are capable of importing (e.g., EndNote, RefWorks, Zotero, and others).
Mobile interface. The new website implements responsive design, which automatically adapts the site to a variety of mobile devices. This allows users to read and work on Project Euclid more comfortably and effectively as they access content from multiple devices.
Top downloads. We now display a list of top downloaded documents on a number of Project Euclid pages. The lists are calculated in the same manner throughout the site: the top five downloaded articles or chapters over the previous seven days. These lists attempt to give some sense of user download trends and are recalculated every day. Any particular list displayed is appropriate for the viewer’s context within the website. On a journal home page, the top downloads are all from that journal. On a publisher’s page, they could be from any publication of that publisher. On the Project Euclid home page, the top downloads are measured across all content in the system.
“More like this.” When a user views an article or chapter page, Project Euclid presents a list of similar documents, using the viewed item as the basis for evaluating similarity.
Other changes and added features to the new Project Euclid website include: wider implementation of MathJax (a display technology for mathematical expressions); branded publisher landing pages; links to social media; access indicators for all content; extended print-on-demand offerings; and remote login (via Shibboleth), for off-campus access.
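To make the “top downloads” rule described above concrete, here is a minimal sketch in Python. It is only an illustration, not Project Euclid’s actual implementation: the Download record, its field names, and the top_downloads function are hypothetical, and the sketch assumes that “the previous seven days” means the seven full days before the day on which the list is recalculated.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Download:
    """One download event. Hypothetical record shape, for illustration only."""
    doc_id: str      # article or chapter identifier
    journal: str     # publication the document belongs to
    publisher: str   # publisher of that publication
    day: date        # day on which the download happened


def top_downloads(
    events: Iterable[Download],
    today: date,
    journal: Optional[str] = None,
    publisher: Optional[str] = None,
    limit: int = 5,
    window_days: int = 7,
) -> List[str]:
    """Return the most-downloaded document ids for the viewer's context.

    No filter        -> site-wide list (home page)
    journal=...      -> list for one journal's home page
    publisher=...    -> list for one publisher's page
    """
    # Assumption: the window covers the seven full days before `today`.
    start = today - timedelta(days=window_days)
    counts: Counter = Counter()
    for event in events:
        if not (start <= event.day < today):
            continue
        if journal is not None and event.journal != journal:
            continue
        if publisher is not None and event.publisher != publisher:
            continue
        counts[event.doc_id] += 1
    # Sort by count (descending), then by id so that ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ranked[:limit]]
```

Calling top_downloads(events, today) with no filter corresponds to the home page list, while passing journal or publisher narrows the same calculation to that context. Running it once a day would match the daily recalculation described above.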
Project Euclid welcomes your feedback on the new site and suggestions for further improvement. All pages have a “site feedback” link in the lower right.
Faculty from across the College of Arts and Sciences braved frigid temperatures on Feb. 27 to attend the first reception in support of the Grants Program for Digital Collections.
The program, now in its fifth year, has funded more than 20 incredible projects. Funded by the College of Arts and Sciences and coordinated by Cornell University Library, the Grants Program for Digital Collections in Arts and Sciences aims to support collaborative and creative use of resources through the creation of digital content of enduring value to the Cornell community and scholarship.
Here in the Library, we know that digital collections are powerful. They remove barriers of access to unique, previously unavailable material to aid scholars and students alike in research exploration and the joy of unearthing interdisciplinary connections.
Gretchen Ritter, the Harold Tanner Dean of the College of Arts and Sciences, warmly opened the reception at the History of Art Gallery, describing the strong interdisciplinary collaborations that the grants program fosters. And Anne Kenney, Carl A. Kroch University Librarian, noted the importance of access to curated digital collections for students, researchers and faculty worldwide.
Annetta Alexandridis and Cheryl Finley, both previous grant awardees and History of Art faculty members, presented as well. Dr. Alexandridis, who was awarded a 2010 grant to photograph deconstructed plaster casts from the Cornell plaster cast collection, noted that access to those digital images has provided her students with investigative research possibilities that have spurred hands-on opportunities to reconstruct plaster casts. Although technically “copies,” many of the casts represent the most authoritative version now available, the original having been destroyed by war or poor environmental conditions. These materials are frequently on display in the History of Art gallery.
Dr. Finley, whose 2012 grant provided support to digitize images from the Lowentheil Collection of African-American Photographs, described the importance this rare collection has for scholars and students of visual arts everywhere, but particularly students and scholars of African American and American studies. The kinds of images represented in this collection – including portraits of known and unknown sitters, landscapes of the antebellum and postbellum south, brutal images of racial torture and domination, documents of civil rights protest, portraits of black leaders, writers and intellectuals, and images of everyday African American life – reveal volumes about black life and struggle in uncommonly rare photographs. Having these available in digital form will impact learning and teaching worldwide and across a wide range of disciplines.
Grant applications for this year are due on March 15. For more information, including the grant application proposal, please visit http://dcaps.library.cornell.edu/initiatives/asgrants/apply.