
RepoExec, or The CUL Repository Executive Group

The Cornell University Library’s Repository Executive Group, colloquially known as RepoExec, has been meeting since the beginning of the year to explore and address the issues surrounding digital repositories at CUL. A digital repository is a system for managing and storing digital items, potentially including a wide range of content (visual and digital images, research data files, AV, electronic records, code) for a variety of purposes and users. These repositories often include research outputs such as journal articles or research data, e-theses, e-learning objects and teaching materials, and administrative data. CUL supports several digital repositories, including eCommons, DigitalCommons@ILR, DigitalCommons@Law, CULAR, Luna Insight, Kaltura, Greenstone, and Shared Shelf, to archive and/or provide access to a wide range of digital information.

As mentioned in the recent CUL IR white paper, the main impetus behind the formation of this group was to bring together the key players in this program area to strengthen communication and collaboration, and to foster the creation of an integrated and sustainable service framework. RepoExec therefore draws its members from across the repository spectrum, including AULs, unit repository managers, and representatives from CUL IT and functional units such as collections and metadata.

The group’s full preliminary charge can be found here, but a good place to start would be the three goals that were set for us:

  • Reach out to stakeholders at CUL and Cornell at large to determine what is needed from repositories in terms of content, service needs, and sustainability.
  • Develop recommendations and/or scenarios by which CUL can meet those needs, addressing questions of software and architecture, workflow and staffing, and collection development.
  • Work with LibExec to develop an actionable plan to implement the recommendation(s) that best fit the needs of CUL.

Underlying all of those goals was the idea that the repository landscape at Cornell needed to be streamlined, both so that our services would be more useful to our users and so that we could maintain those services in a sustainable fashion.

Right from the start, I’ll say that our first several months didn’t see us addressing the first goal as well as we could have, though we did take strides towards the second and the third. This blog post will talk a bit about what we did, why we did it, and what we’re going to do next.

We determined early on that, in order to pursue the second and third goals above, we needed to have a more complete sense of the current digital repository landscape. So a small sub-committee was formed to put together an inventory of our existing repositories, the (somewhat surprising) results of which can be downloaded here as an Excel file or as a PDF.

We found more than twenty systems currently receiving at least some support within CUL that fit our definition of a digital repository. For each of those systems, we associated descriptive metadata in six categories:

  • General: Administrative unit, current CUL contact, current URL, etc.
  • Infrastructure: Software version in use, storage location, homegrown vs. off-the-shelf, etc.
  • Ingest: Submission policy, intellectual description of content, average frequency of deposit, etc.
  • Access/Discovery: Content discoverability, availability of access descriptions, availability of embargoes, etc.
  • Content: Number of objects, optimized content types, etc.
  • Preservation: Redundancy, parity/bit checks, file versioning, etc.
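For readers who think in code, one row of an inventory like this could be modeled as a record with one field per category. This is only an illustrative sketch: the class name `RepositoryRecord`, the helper `group_by`, the `origin` key, and the placeholder repositories are all assumptions made for the example, not the actual columns or contents of the inventory spreadsheet.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class RepositoryRecord:
    """One hypothetical inventory row, with a dict per metadata category."""
    name: str
    general: dict = field(default_factory=dict)
    infrastructure: dict = field(default_factory=dict)
    ingest: dict = field(default_factory=dict)
    access_discovery: dict = field(default_factory=dict)
    content: dict = field(default_factory=dict)
    preservation: dict = field(default_factory=dict)


def group_by(records, category, key):
    """Group repository names by one metadata value within a category.

    Records that lack the key are collected under "unknown".
    """
    groups = defaultdict(list)
    for record in records:
        value = getattr(record, category).get(key, "unknown")
        groups[value].append(record.name)
    return dict(groups)


# Placeholder data, invented for illustration only.
sample = [
    RepositoryRecord("Repository A", infrastructure={"origin": "homegrown"}),
    RepositoryRecord("Repository B", infrastructure={"origin": "off-the-shelf"}),
    RepositoryRecord("Repository C", infrastructure={"origin": "homegrown"}),
    RepositoryRecord("Repository D"),
]

print(group_by(sample, "infrastructure", "origin"))
```

Grouping on a single field like this is the kind of simple question a flat inventory can already answer, such as which systems are homegrown and which are off-the-shelf.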

If the breadth of the repository landscape is surprising to you, take heart: it was surprising to us, too. That’s a lot of systems, and a lot of apparent redundancy. But that’s a good thing to confirm, especially since the strategic goals of CUL indicate a strong need to streamline such systems. We needed to take what we have in the inventory and move it into the realm of actionable knowledge. Our first attempt to do this involved expanding the metadata inventory into a broader schema of tags and categories that could be applied to repositories to a) identify those repositories that were similar enough to be grouped together for policy purposes, b) identify the gaps and redundancies within those groups that could be addressed by new policy recommendations, and c) help connect people considering new repository projects with the existing options that would best suit their needs.

Unfortunately, after a few weeks of work, we discovered that in order for such a system to apply to the full range of repository options, it would need to be either so limited and general as to be no more useful than the existing inventory, or so large and specific as to be completely unwieldy. However, the benefit of being a new committee is our flexibility: when we see something that’s not working, we can pull resources back and apply them elsewhere.

Going forward, our goals for the rest of the year are to reassess our work and our charge, and to see if we need to modify them to better meet the needs of CUL and its constituents. We have a number of ideas in the works for better engaging with our stakeholders, using the work we’ve done to date to evaluate aspects of the repository landscape in more depth, and starting to make recommendations for where CUL’s repositories need to go next.

It’s been an exciting few months, and we’ve got plenty of work ahead! If you have any questions or feedback, please feel free to contact me, or stop by one of our open meetings. I’ll also be talking about RepoExec at the November R&O Forum, and hopefully more forums in the future.

Jim DelRosso
Chair, RepoExec
Digital Scholarship Fellow (September ’13 – August ’14)
Digital Projects Coordinator, HLM Library

