SHERPA DP: Creating A Persistent Preservation Environment For Institutional Repositories
SHERPA DP is a two-year project funded by the JISC. Its purpose is to create a collaborative, shared preservation environment for the SHERPA project (http://www.sherpa.ac.uk/), framed around the OAIS Reference Model.
The project brings together the SHERPA institutional repository systems with the preservation repository established by the Arts and Humanities Data Service to create an environment that fully addresses all the requirements of the different phases within the life cycle of digital information.
An introductory PowerPoint presentation, Preserving E-Prints: Scaling the Preservation Mountain (130kb, PPT), by Sheila Anderson (AHDS) and Stephen Pinfield (University of Nottingham), is now available.
An interview on the project with Andrew Wilson (SHERPA DP Project Manager) is available on the Digital Preservation Coalition website.
Description
Institutional repositories are a new and high-profile area, often feted as the future for disclosing the research outputs of Higher Education to a wider public. In recognition of this, JISC has funded the establishment of a number of institutional repositories in the UK as part of the FAIR Programme.
Thus far, the initial focus of activity has been on the process of establishing repositories - installing appropriate software and establishing policies and procedures; encouraging deposit of articles and dealing with the associated rights issues; and working to effect the cultural change needed for successful development and population of repositories.
Given the experimental and project-based nature of much of this activity, it is not surprising that less attention has been paid to preservation, and that no repository thus far established would claim to be 'doing preservation'. Of course the SHERPA project has a specific remit to investigate the requirements for preservation and is producing some valuable outputs, but this falls far short of establishing a coherent and long-term preservation environment for the repositories involved in the project.
The recent JISC-funded Feasibility and Requirements Study for Preservation of E-Prints (James et al, 2003) argued that there is a unique window of opportunity to address the preservation requirements of repositories at the beginning of their adoption rather than leaving it until the lack of preservation management becomes an issue and content is no longer accessible.
A key recommendation of the report was the establishment of a repository infrastructure based upon the OAIS Reference Model. It further recommended that this should be "a practical study that includes implementation at one or more repositories and their partners as appropriate to the chosen organisational model" (p.56, James et al, 2003).
Furthermore, the Feasibility and Requirements Study identified the diverse range of skills and expertise required to manage and run a preservation environment based upon the OAIS Reference Model. In particular it noted the scarcity of staff and services with practical digital preservation skills and expertise.
It therefore suggested that a sensible way forward would be to look to disaggregate the functions and activities identified in the OAIS Model, and to seek collaborative arrangements between repositories and specialist services with each taking responsibility for different functions.
This recommendation fits well with that envisaged in the JISC Continuing Access and Digital Preservation Strategy 2002-2005. Beagrie writes:
"Institutional arrangements ... may benefit most from third-party or common services being developed to support preservation planning or remote storage." (p.13)
We therefore propose to create a collaborative, shared preservation environment for the SHERPA project framed around the OAIS Reference Model. The model we are proposing is intended to take advantage of the pre-existing and successful collaboration between the SHERPA repositories and the Arts and Humanities Data Service.
The exact nature of the model to be adopted will be established at the start of the project, but at this early stage it is envisaged that the AHDS will provide a shared preservation store, and undertake preservation planning and preservation functions, whilst the SHERPA repositories will continue their work to raise awareness and promote deposit of content, ingest, storage of content for delivery, and access.
To that end we can use the simplified model presented in the JISC Preservation Strategy to visualise how the model might look. The top layer would remain the responsibility of the institutional repositories, while this project would add the bottom layer, including preservation actions, through the collaboration with the AHDS.
Institutional Repository Layer (SHERPA Repositories)

Preservation Service Layer (Arts and Humanities Data Service)
Further specification of this model will use the OAIS Reference Model to define a functional model, including assigning rights and responsibilities for the different functions identified in OAIS, and to create the protocols and processes necessary for the implementation of a successful preservation environment.
The project is especially interested in exploring the use of open source software and standards to implement the preservation environment, including METS and grid technologies.
Within this model each party will be required to provide an agreed level of functionality in order to ensure successful coordination and interoperability between the parties. Repositories are likely to provide the following functionality:
- Support for publishing metadata to be harvested
- One or more methods for transferring data (e-print content) across the network
- Alerting mechanisms for updated/additional content
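As a rough illustration of the alerting requirement above, the sketch below selects the records a preservation service would need to fetch since its last visit. It is a minimal sketch under stated assumptions, not the project's design: the `EprintRecord` fields and the `changed_since` function are hypothetical names, and a real repository would expose this through whatever harvesting interface its software provides.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class EprintRecord:
    """Hypothetical, simplified view of one repository record."""
    identifier: str
    title: str
    last_modified: datetime


def changed_since(records, since):
    """Return records modified after `since`, oldest change first."""
    changed = [r for r in records if r.last_modified > since]
    return sorted(changed, key=lambda r: r.last_modified)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        EprintRecord("eprint:1", "First paper", datetime(2005, 1, 10)),
        EprintRecord("eprint:2", "Second paper", datetime(2005, 3, 2)),
    ]
    for record in changed_since(sample, datetime(2005, 2, 1)):
        print(record.identifier, record.last_modified.isoformat())
```

Comparing on a modification date keeps the repository side simple: the preservation service only has to remember when it last harvested.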
The Preservation Service is likely to provide (or provide in collaboration with other preservation services e.g. the DCC) the following functionality:
- Support for harvesting metadata
- Support for harvesting data
- File format conversion tools
- File integrity checking tools
- Preservation metadata extraction tools
- File format obsolescence checking
- Alerting and migration service
- One or more methods for transferring data and metadata back into institutional repositories
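Of the functions listed above, file integrity checking is the easiest to illustrate. The sketch below assumes that a SHA-256 checksum is recorded when a file is first received and recomputed later to detect corruption; the choice of algorithm and the function names `sha256_of` and `verify_fixity` are assumptions made for this example, not tools specified by the project.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, expected_hex):
    """Return True if the file still matches the checksum recorded earlier."""
    return sha256_of(path) == expected_hex.lower()
```

A preservation store might run such a check on a schedule and raise an alert when `verify_fixity` returns False, so that a good copy can be restored.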
The challenge will be to do this successfully with the different repository software solutions chosen by SHERPA partners and taking into account the individual policies and approaches with regard to content and metadata. To this end the project will be investigating a variety of approaches for interoperating between the institutional repositories and the preservation repository services, and will test and evaluate each during the first phase of the project.
A key part of the project will be to add functionality to, and extend the storage layer of, the Eprints and DSpace repository software applications to enable the necessary preservation actions to take place. Following the testing and evaluation process the chosen solutions will be implemented. It may well be the case that different solutions are chosen for different institutions and for different repository software.
Further Details
The collaborative model proposed will take advantage of the skills and expertise developed by the SHERPA development partners, which include the preservation expertise of the Arts and Humanities Data Service (AHDS). By extending this collaboration into a full preservation service the project removes from each individual institutional repository the burden of adding a preservation layer to its repository, and the need to employ scarce preservation management skills and expertise. The project will investigate the business case for this model and seek to establish an economic cost model that could be used to ensure its long-term sustainability.
Establishment of the preservation environment will include investigation of the technical challenges, metadata requirements, and administrative and workflow processes, and will encompass these within the OAIS Reference Model. This will provide a rich set of reports for others to use, and a practical implementation of a preservation environment for SHERPA. The model and working processes that the project will develop and implement are intended to be transferable to other repositories and services, and would be available for other institutional repositories to join in the future.
In summary the Project will:
- Use the OAIS Reference Model to develop a persistent preservation environment for the SHERPA consortium, assigning rights and responsibilities and establishing protocols and workflow processes that will ensure the long-term preservation of the repository content.
- Explore the use of METS as the framework for packaging and transferring metadata held within the institutional repositories, including the preservation metadata created by the preservation service.
- Establish a coordinated set of protocols and software to be implemented as a working preservation service for a group of institutional repositories.
- Explore the use of open source software and tools to add functionality to and extend the storage layer of repository software applications.
- Draw together the experience gained into a Digital Preservation User Guide that will complement 'The Preservation Management of Digital Material Handbook' created by Maggie Jones and Neil Beagrie, and act as a practical user guide to implementing this type of preservation environment.
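To give a sense of what using METS as a packaging framework could involve, the sketch below builds a very small METS document for a single e-print file: one descriptive metadata section holding a Dublin Core title, one file entry carrying a checksum, and the structural map that METS requires. It is an illustrative sketch only. The function name `build_mets`, the choice of Dublin Core and the identifiers `dmd1` and `file1` are assumptions, and the project's actual profile may differ.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Register prefixes so the serialised XML uses readable namespace names.
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)
ET.register_namespace("dc", DC_NS)


def q(namespace, tag):
    """Build a namespace-qualified tag name in ElementTree's {ns}tag form."""
    return f"{{{namespace}}}{tag}"


def build_mets(title, file_url, mime_type, sha256_hex):
    """Return a minimal METS document (as a string) describing one file."""
    root = ET.Element(q(METS_NS, "mets"))

    # Descriptive metadata: a single Dublin Core title wrapped in mdWrap.
    dmd = ET.SubElement(root, q(METS_NS, "dmdSec"), {"ID": "dmd1"})
    wrap = ET.SubElement(dmd, q(METS_NS, "mdWrap"), {"MDTYPE": "DC"})
    xml_data = ET.SubElement(wrap, q(METS_NS, "xmlData"))
    ET.SubElement(xml_data, q(DC_NS, "title")).text = title

    # File section: where the content lives and its recorded checksum.
    file_sec = ET.SubElement(root, q(METS_NS, "fileSec"))
    group = ET.SubElement(file_sec, q(METS_NS, "fileGrp"))
    file_el = ET.SubElement(group, q(METS_NS, "file"), {
        "ID": "file1",
        "MIMETYPE": mime_type,
        "CHECKSUM": sha256_hex,
        "CHECKSUMTYPE": "SHA-256",
    })
    ET.SubElement(file_el, q(METS_NS, "FLocat"), {
        "LOCTYPE": "URL",
        q(XLINK_NS, "href"): file_url,
    })

    # Structural map: links the file to its descriptive metadata.
    struct = ET.SubElement(root, q(METS_NS, "structMap"))
    div = ET.SubElement(struct, q(METS_NS, "div"), {"DMDID": "dmd1"})
    ET.SubElement(div, q(METS_NS, "fptr"), {"FILEID": "file1"})

    return ET.tostring(root, encoding="unicode")
```

Because the checksum travels inside the package, the receiving service can verify the file on arrival before adding its own preservation metadata.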
The proposal will achieve several key objectives outlined in the JISC Preservation Strategy and in the JISC Circular 4/04 Call:
- Implement a preservation environment for several major institutional e-print repositories to ensure long-term preservation of their content
- Demonstrate a collaborative model using the OAIS Reference Model that brings together local repositories with national services
- Investigate the use of METS as a key element of the preservation environment, both as a metadata framework and as a transfer mechanism for data and metadata
- Investigate the use of open source and grid technologies as tools for the preservation process
Project Partners
The lead organisation is the Arts and Humanities Data Service (a SHERPA Development Partner and part of King's College London), with the University of Nottingham (the lead institution for the SHERPA Project) as the named project partner.
The SHERPA project is funded by JISC and CURL under the FAIR Programme and aims to investigate issues to do with the future of scholarly communication and publishing. In particular, it is initiating the development of openly accessible institutional digital repositories of research output in a number of research universities.
The project is investigating the IPR, quality control and other key management issues associated with making the research literature freely available to the research community. Preservation activities include investigating the requirements for the long-term preservation of e-prints, including metadata requirements and economic models. This latter work is conducted by the Arts and Humanities Data Service.
The Arts and Humanities Data Service is a UK national service funded by the JISC and AHRB to collect, preserve and promote the electronic resources which result from research and teaching in the arts and humanities. By preserving collections the AHDS encourages further research and educational use of its collections. The identification and promotion of shared standards is critical to the AHDS's work.
Preserving and exchanging digital information relies upon the widespread adoption of shared standards, and the AHDS endeavours to use open standards and software wherever possible, including in the establishment of its own digital repository. The AHDS seeks to work in fruitful partnerships in order to enhance the production and preservation of high-quality digital resources of whatever type, and to use its skills and expertise in preservation for the benefit of the HE and FE communities.
In addition to the funding already supplied to the current SHERPA Project, the Consortium of University Research Libraries (CURL) Board has agreed to contribute a further sum of £25,000 to this project to fund participation from their members who are either development or associate partners in the SHERPA Project. The AHDS and the University of Nottingham will work with the SHERPA Management Board and the CURL Board to establish criteria for the selection of a further three partners to work with the project. SHERPA project partners will then be invited to submit a bid to the SHERPA Management Board to join this project.
The selection criteria will require that both DSpace and Eprints repositories are represented, so that the project tackles the preservation issues of the two most commonly used repository software applications. The criteria will also require that partners are well advanced in the establishment of their repository infrastructure, and that there is a broad spread of SHERPA partners, including the associate partners. The successful bidders will be required to sign a partner agreement committing them to contribute as required to the project and to abide by the requirements laid down by the JISC.
Project Governance
We propose that the current SHERPA Management Board be asked to act as the Management Board for this project to ensure the closest coordination between the original SHERPA project and this project. The Project will comply with JISC requirements and will report to the JISC Programme Manager as required.