Research data cataloguing at Southampton using Microsoft SharePoint and EPrints
Motivated by the JISC RDM Programme many UK research institutions are implementing research data repositories. A variety of repository platforms, from original solutions such as DataFlow at Oxford, as well as established digital repositories such as DSpace and EPrints, have been adopted. The University of Southampton is unusual in pursuing two repository options.
DataPool has been developing a data cataloguing facility based on Microsoft SharePoint, which provides a number of IT services at the university, and on EPrints repository software. An earlier presentation considered the development of SharePoint and EPrints as emerging research data repositories in the context of high-level ‘architectural’ and practical ‘engineering’ challenges. A new report describes further progress along both repository routes, notably collaboration with the JISC Research Data @Essex project on ReCollect, a standards-based data deposit application for EPrints.
The ReCollect plugin, or app, provides an EPrints repository with an expanded metadata profile for describing research data based on DataCite, INSPIRE and DDI standards. The metadata profile was designed by Research Data @Essex, and packaged as an EPrints app in collaboration with Patrick McSweeney at the University of Southampton. The resulting app first appeared on 27 March 2013 in the EPrints Bazaar, which advertises and distributes applications that can be installed in an EPrints repository with one click.
The list of data types that can be deposited in EPrints was expanded to include Dataset and Experiment, to support the submission of research data, in 2007. Selection of a data type presents the depositor with a series of pages and fields that are designed to be appropriate for the description of that type. The order in which these pages and fields are presented defines the deposit ‘workflow’ for the data type, and is typically customised to specific repository implementations by institutions.
If workflow for different data types is provided directly by EPrints, and can be customised to local repository needs, what is the need for an app that implements data deposit workflow? First, it can simplify and speed up customisation if the desired workflow can be implemented from an app. Second, and more importantly for research data workflow, it can lead to greater standards compliance, consistency and collaboration between repositories.
Repositories can customise the data deposit workflow provided by ReCollect from the profile designed by Research Data @Essex without affecting standards compliance. The new report compares an example of customising the workflow for the ePrints Soton institutional repository with the ReCollect original.
A community of potential users for ReCollect, including the EPrints repositories at Glasgow University and Leeds University, has been established through webinar-based conference calls.
Further modifications to the Southampton workflow are likely, including the facility to automate minting and embedding of British Library DataCite DOIs, designed for data citation, for each data record.
In the case of SharePoint, user interface forms for creating data records have been piloted and tested. The approach is distinctive in creating two linked forms, one to describe a project, the other to record a dataset, rather than a single workflow as in the case of EPrints. This development of SharePoint for description and storage of research data is part of a longer-term extension and integration of services provided on the platform at Southampton.
So in the first instance research data cataloguing at Southampton uses EPrints, extending the existing institutional repository by installing the Essex ReCollect data app. The service went live on ePrints Soton in April 2013.
What is needed now, however, is practice and experience with real data collections. In this respect many questions about the use of data repositories remain open. These early implementations are likely to change significantly as that process evolves.
For more on this DataPool case study see the full report.