Dec 20 2012

Connecting research data roadmaps and business cases: the IDMB example for the University of Southampton

Steve Hitchcock

As promised in the last post, this is the sausage in the roll, or the wafer-thin ham in the sandwich: the alternative to the ubiquitous benefits-evidence slides presented by each project represented at the JISC MRD workshop in Bristol. This presentation connects the development of roadmaps with the business case and policy for making progress with research data management (RDM) at an institutional level.

This was presented by Steve Hitchcock, but draws heavily on a report from the Institutional Data Management Blueprint (IDMB) Project, which began the work on research data management (RDM) at the University of Southampton now being taken on by DataPool. Mark Brown, Oz Parchment and Wendy White, co-authors of that report, are therefore the true authors of this presentation. Comment and interpretation are mine.


This version provides the notes for each slide used to inform the commentary for the presentation. It might be worth opening the Slideshare site (adverts notwithstanding) to switch between the slide notes below and the graphic slides – clicking on View on Slideshare in the embedded view will open these in a separate browser window.

Slide 2 Taking the IDMB example with others, connecting roadmaps with the business case and policy seems like a logical sequence, but in practice this is not always the case. At Southampton we have a roadmap and an official institutional research data policy, but the business case is still to be approved. Other institutions appear to have begun with a policy. Here we will focus on the roadmap and business case rather than policy.

Slide 3 If the IDMB project elaborated the roadmap, DataPool represents progress along the first part (18 months) of the first phase (3 years) of the plan, and is beginning to fill in components of the map, as can be seen by the links in this slide.

Slide 4 For reference, this is a recent poster designed to show graphically the full scope of the DataPool Project. It shows the characteristic tripartite approach of this and comparable JISC institutional RDM projects: policy, training, and technical infrastructure (data repository and storage services).

Slide 5 This middle phase of the Southampton RDM roadmap looks like it may have been the trickiest part of the map to elaborate. It’s not imminent and depends on outcomes from the first stage; on the other hand, it’s not so far away that we don’t need to be aware of it and making plans for it. As seen in this extract, it essentially describes refinements of many of the expected developments from stage 1.

Slide 6 If looking ahead is trickier than framing immediate work, this final phase looking up to 10 years ahead might have been hardest to describe. It is, however, more aspirational in tone and less inclined to deal with specifics, and seems more appropriate for adopting that approach.

Slide 7 A recent and interesting comparison with the Southampton RDM roadmap is that from Edinburgh University. Edinburgh has a target completion date of early 2014, a startlingly short roadmap compared with a 10Y example. The two are not directly comparable, of course. The Edinburgh case looks to be a well specified, well structured and comprehensive first phase and can be commended for that. Whether it is achievable within the time and resources specified we cannot judge yet. The illustration reproduced here is a helpful representation of the plan – at least, it is once you’ve read the plan.

Slide 8 This extract connects the first progress report of the DataPool Project, by then-PI Mark Brown, with the roadmap and policy. It makes the clear point that research funder requirements (EPSRC, RCUK) had an important influence on adoption of the policy at an executive level, even if some discussion at this JISC MRD Benefits Meeting was around whether supporting compliance with such requirements can usefully be presented to researchers as a ‘benefit’.

Slide 9 Other JISC MRD projects that have roadmaps have similarly emphasised the importance of EPSRC requirements on the production of the roadmap.

Slide 10 Now we move on to the second part of the talk, the business case. The data.bris project from Bristol University was presenting in the same session at this event, so we will spare the detail here, but this extract from a recent blog post by the project illustrates some of the imponderables, Donald Rumsfeld-style, of forming a business case for RDM.

Slide 11 We are heading towards the critical part of this presentation, the financial numbers. First, some context. This case covers just the technical infrastructure – IT services – not the wider factors outlined by data.bris. This business model has been updated and presented at the University of Southampton and, as we have already indicated, is currently undergoing further revision with a view to official acceptance. The assumption stated here is not based on the university’s current research data policy, which requires a record of all data produced in the course of research at the institution rather than full data deposit. The university can be said, therefore, to have stopped short, so far, of accepting the business case for supporting the costs of the policy. The data on usage of storage services and projected usage are the basis for the financials that follow.

Slide 12 In the style of the financial services industry, given there are a number of uncertain factors to accommodate in projections of the growth of storage requirements, this chart attempts to draw upper and lower bounded curves to underpin the calculations.
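As a rough illustration of what upper and lower bounded curves mean in practice, the sketch below projects storage demand under two compound growth rates. All of the figures (starting volume, growth rates, time horizon) are hypothetical placeholders chosen for the example; they are not taken from the IDMB report, and the report's own method may well differ.

```python
def project_storage(initial_tb, annual_growth, years):
    """Return projected storage in TB for year 0 through `years`,
    assuming simple compound growth at a fixed annual rate."""
    return [initial_tb * (1 + annual_growth) ** y for y in range(years + 1)]


# Hypothetical inputs for illustration only (not IDMB figures).
INITIAL_TB = 100.0     # assumed storage in use today
LOWER_GROWTH = 0.20    # assumed pessimistic-case annual growth (20%)
UPPER_GROWTH = 0.50    # assumed optimistic-case annual growth (50%)
YEARS = 10             # matches the 10-year span of the roadmap

lower = project_storage(INITIAL_TB, LOWER_GROWTH, YEARS)
upper = project_storage(INITIAL_TB, UPPER_GROWTH, YEARS)

# Print the band between the two curves for each year.
for year, (low_tb, high_tb) in enumerate(zip(lower, upper)):
    print(f"Y{year}: {low_tb:10.1f} TB to {high_tb:10.1f} TB")
```

The gap between the two curves widens every year, which is why a long-range plan built on such projections has to be revisited as real usage data arrives.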

Slide 13 This illustration also comes directly from the IDMB report. Allowing that the metadata should ideally attach to both active and archive layers, the cost factors introduced here are access bandwidth, latency and storage technology. The basic choices considered are between more expensive and faster access disk storage, and slower tape stores.
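To make the disk versus tape trade-off concrete, here is a minimal sketch that splits a total volume into an active layer and an archive layer and prices each layer separately. The unit costs, the total volume and the active fraction are all hypothetical, and the assumption that the active layer always sits on disk while only the archive layer varies is mine, not something stated in the IDMB report.

```python
def annual_storage_cost(total_tb, active_fraction,
                        active_cost_per_tb, archive_cost_per_tb):
    """Annual cost of a two-layer store: an active layer and an archive layer,
    each priced at its own cost per TB per year."""
    active_tb = total_tb * active_fraction
    archive_tb = total_tb - active_tb
    return active_tb * active_cost_per_tb + archive_tb * archive_cost_per_tb


# Hypothetical inputs for illustration only (not IDMB figures).
TOTAL_TB = 1000.0        # assumed total data held
ACTIVE_FRACTION = 0.25   # assumed share of data in active use
DISK_COST = 400.0        # assumed cost in pounds per TB per year on disk
TAPE_COST = 100.0        # assumed cost in pounds per TB per year on tape

# Scenario A: both layers on disk.
all_disk = annual_storage_cost(TOTAL_TB, ACTIVE_FRACTION, DISK_COST, DISK_COST)
# Scenario B: active layer on disk, archive layer on tape.
disk_plus_tape = annual_storage_cost(TOTAL_TB, ACTIVE_FRACTION, DISK_COST, TAPE_COST)

print(f"All disk:       £{all_disk:,.0f} per year")
print(f"Disk plus tape: £{disk_plus_tape:,.0f} per year")
```

With these made-up inputs the saving comes entirely from the archive layer, so the result is sensitive to how much data is judged to need fast access.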

Slide 14 Now we get to the actual financial numbers resulting from this analysis. The number that stands out is Y3 in the disk-based scenario, which not only rises above £1M for the first time but gets closer to £2.4M. Subsequent annual costs shown here remain above £1M for this scenario. The slower tape-based costs are always lower.

Slide 15 Having identified the numbers, the critical decision is how to pay for it. This was an important issue for the second DataPool Steering Group meeting recently. A full free-at-point-of-use service may be the simplest if most expensive option for the institution, but it has been strongly argued that RDM must be viewed as a direct cost of research, and funded accordingly. The dilemma for institutions is how much to invest in infrastructure directly, compared with leaving projects to raise additional costs for data management and risking research bids becoming less competitive than those from institutions with more generous direct support.

Slide 16 In summary, roadmaps are useful for focussing discussion on research data management at an institutional level, and for engaging other stakeholders across all disciplines. Given that a roadmap should be based on prior consultations with those stakeholders, it follows that subsequent interaction with the roadmap should lead to further consultation. The roadmap must therefore be used as a living document. Southampton has not yet finalised its business case for supporting RDM, but it has established a process through engaging with the roadmap in the first instance.


Dec 20 2012

DataPool benefits-evidence table

Steve Hitchcock

JISC, funder of DataPool, of other projects in research data management, and many more projects on widening use of digital technology in education, tends to focus on areas close to practical exploitation. On the R&D spectrum, it is typically towards the development end. For project managers, therefore, there is an emphasis on procedures and tools to increase the impact of practical outcomes – evaluation, sustainability, exit strategies, technology transfer, etc.

Another planning tool being adopted in the Managing Research Data Programme (MRD) 2011-13, of which DataPool is a part, is benefits-evidence analysis. As this description suggests, the idea is to elaborate prospective benefits of a project, and then identify the evidence that will demonstrate whether or not the benefit has been realised. It is as much about informing the process of getting to the results, and identifying which results are important and achievable, as the results themselves.

Hence, JISC MRD projects were invited to Bristol for a 2-day programme workshop at the end of November to present their benefits-evidence slides. If this sounds a little repetitive, it is, but it is not uninteresting, especially as in preparing for the workshop all projects had essentially to engage in the same analysis, and were therefore armed not just with their own slide but ready to comment on others.

For project managers used to working towards outputs (products or services arising from the project) and outcomes (effects of the outputs on users in the target community), benefits are another factor. Hence, the JISC MRD programme has recruited a team of evidence gatherers, to work with and assist projects to hone and refine the benefits they are working towards and the consequent evidence measures. “Those are more outputs than benefits” I was advised, fairly, during open discussion on some ‘benefits’ in my slide. But then I had seeded the slide with points to discuss rather than a definitive list, and unwittingly extended the project’s previously discussed benefits.

So after the workshop I was grateful for the advice of Laura Molloy, evidence gatherer for DataPool, on aligning our pre- and post-workshop benefits lists.

After all that effort it would be remiss not to reveal the benefits-evidence table that emerged from the process. For the record, here are the benefits DataPool will seek to demonstrate in its final months into early 2013.

DataPool: Benefits-Evidence

Benefit 1: Improved RDM skills across the target community, including researchers and professional support staff
Evidence:
  • Qual reporting on effectiveness of training events.
  • Feedback from training courses and deskside consultations, DMP and email help services.
  • More staff running RDM support services, increased service offer.

Benefit 2: Greater visibility and use of institution’s research data / research outputs through sharing, collaboration, reuse
Evidence:
  • Qual case study describing improved dataset exposure.
  • Qual evidence of DMP engagement, including early indications of access routes.
  • * Quant indication of increase in dataset downloads.
  • No. of datasets stored in data repository.
  • Accesses of open datasets vs closed datasets vs shared datasets.

Benefit 3: Sustained institutional support for RDM / sustainability for RDM infrastructure at institution
Evidence:
  • No. of training opportunities introduced.
  • Scope of: deskside consultations, DMP support service.
  • Results from case studies – engagement with existing data facilities.
  • Assessment of added value for institution of using institutional storage over other options – report.

Benefit 4: Improved use/uptake of RDM infrastructure
Evidence:
  • Quant account of ‘bid preparation consultations’, inc. qual narrative of referrals to data policy and DMP help.
  • Case study on working with data policy – feedback on uptake of policy.
  • Quant tracking of higher attendance at training.
  • Accesses to RDM guidance documents.
  • No. of deskside consultations.
  • * Quant indication of improved uptake of institutional storage and deposit options.
  • No. of large data projects switching to institutional data service.

Benefit 5: Time / costs saved by improved RDM infrastructure
Evidence:
  • Identifying early cost-benefits – combined case studies report, inc large data projects, open data, imaging, disciplinary efficiencies.
  • Assessment of added value for institution of using institutional storage over other options – report (see 3).

* This evidence not expected to be available during DataPool Project, following launch of RDM repository service by project end, but will be collected in ongoing work at Southampton University on institutional RDM. Table by Steve Hitchcock for DataPool, in collaboration with Wendy White, Dorothy Byatt. We gratefully acknowledge the feedback and suggestions from Laura Molloy, JISC evidence gatherer.

The University of Southampton has a 10 year roadmap for research data, of which DataPool represents the first stretch of road, so there is a commitment to go further, but the clearer the steer from DataPool the faster the progress afterwards.

As a little light relief from projects’ benefits-evidence slides, a presentation on the Southampton roadmap and business plan was given at the Bristol workshop. That will be covered in a separate post.

How will you know which benefits have been achieved as the project moves forward? This post is tagged with the label ‘benefits’. All updates reporting evidence from the table above will use this tag. Tags can be found in the column immediately to the right of this one, and up, from this point in the post.

This is how other JISC MRD projects are tackling these challenges and what benefits-evidence is being targeted:


Dec 7 2012

DataPool Steering Group, second meeting

Steve Hitchcock

Monday 12 November marked the start of a busy week for DataPool, being the date of the project’s second Steering Group meeting and leading towards a presentation at the 9th meeting of the DCC Research Data Management Forum. In other words, the project was to address two of its key audiences, and had to prepare appropriate documentation for the purpose. We are pleased to share the documentation, starting here with that presented to the Steering Group ahead of its meeting, complementing the record of the first Steering Group meeting.

Collected documents for 2nd Steering Group meeting

Agenda, Steering Group meeting, 12 November 2012
Minutes of previous Steering Group meeting, 31 May 2012
Progress Report by Wendy White, DataPool PI (corrected 20 November 2012)

Introduction to the Progress Report. At the last Steering Group there was a clear emphasis on the importance of supporting cultural change and identifying institutional benefits to improving research data management practice. Recent policy developments from funders have aligned parameters for the accessibility of research data to strengthening requirements for research publications. There is a focus on benefits-led activity, working with funders and other external bodies on developing an integrated approach to improving research data management practice. The mid-phase of the project has been informed by this context as we have made progress on the key strands of the project:

  • Developing and rolling out service and training models to work with researchers
  • Planning an evidence-based programme of support for professional services staff providing these services
  • Multidisciplinary engagements
  • Investigating requirements for data storage and archiving
  • Testing the SharePoint and EPrints data catalogue components

PGR Thesis Model: mapping support from start to award, a work-in-progress, particularly with regard to the role of data in the examiners’ process

Note, two documents provided to the Steering Group were from ongoing work and were for current information rather than this record. These were a draft training needs questionnaire aimed at research support staff, and an update report on a 3D data survey at the University of Southampton.

Among the many issues discussed at the meeting, one noteworthy topic was funding models to support a storage strategy, i.e. once the costs have been mapped, does the funding come from grant funding bid applications or from institutional support infrastructure funds? We are particularly grateful to our external (i.e. outside Southampton) steering group members for the additional perspectives they bring, in this case for the valuable insights on the storage funding issue from research councils and data archives.

Members of the steering group present at the meeting (University of Southampton unless otherwise indicated): Wendy White (Chair, DataPool PI and Head of Scholarly Communication), Philip Nelson (Pro-VC Research), Mark Brown (University Librarian), Helen Snaith (National Oceanography Centre Southampton), Mylene Ployart (Associate Director, Research and Innovation Services), Louise Corti (Associate Director, UK Data Archive), Oz Parchment (iSolutions), Les Carr (Electronics and Computer Science), Simon Cox (Engineering Sciences), Graeme Earl (Humanities), Jeremy Frey (Chemistry), Dorothy Byatt, Steve Hitchcock (DataPool Project Managers). Apologies from: Adam Wheeler (Provost and DVC), Graham Pryor (Associate Director, Digital Curation Centre), Sally Rumsey (Digital Collections Development Manager at The Bodleian Libraries, University of Oxford).


Jul 13 2012

DataPool Steering Group, first meeting

Steve Hitchcock

Before the UK took a break for the Diamond Jubilee weekend, DataPool had an important diary date of its own at the end of May, the first meeting of the project steering group, effectively ending phase 1 of the project.

The steering group includes senior managers and academics from the University of Southampton, and experts in running research data repositories elsewhere. This post collects and links to the documents and evidence that were circulated prior to the meeting or presented at it, and which informed discussion. We conclude by summarising and highlighting some of the main steers and outcomes of the meeting that will direct the project going forward to phase 2.

Collected documents for the Steering Group meeting

DataPool service model at the University of Southampton

A forthcoming report will give more detail on the SharePoint and EPrints developments. Similarly, another post here will consider progress with the institutional RDM policy and the accompanying guidance information.

Main steers and outcomes

So what did we learn from the meeting? Among such an eminent gathering and across a wide-ranging discussion it would be hard to represent all views in this short report. Important issues raised will be pursued in the project. To give an indication of some of those directions, here are just three of the more immediate actions identified by the project managers:

  1. There was endorsement for the concise policy guidance notes and iterative approach to engagement and evaluating progress. First guidance notes are now available, and the collection will be extended.
  2. The case study-based training guide was received enthusiastically and regarded as an ‘approach that could evolve incrementally’. Further case studies based on this model will be identified, and used for postgraduate training in more areas.
  3. More detailed disciplinary/multidisciplinary cost modelling case studies are needed to build evidence to support bids for significant institutional investment.

Overall, the meeting expressed a view that the project is working along the right lines, and it was interesting to note from our external advisers that in many cases we are dealing with similar issues to those faced by others.

We are grateful to the members of the steering group for their commitment and contributions. With their encouragement and direction DataPool is able to tackle the challenges ahead with conviction.

Members of the steering group present at the meeting (University of Southampton unless otherwise indicated): Mark Brown (Chair and University Librarian), Philip Nelson (Pro-VC Research), Adam Wheeler (Provost and DVC), Pete Hancock (iSolutions, Director), Helen Snaith (National Oceanography Centre Southampton), Mylene Ployart (Associate Director, Research and Innovation Services), Graham Pryor (Associate Director, Digital Curation Centre), Sally Rumsey (Digital Collections Development Manager at The Bodleian Libraries, University of Oxford), Louise Corti (Associate Director, UK Data Archive), Les Carr (Electronics and Computer Science), Simon Cox (Engineering Sciences), Graeme Earl (Humanities), Wendy White (Head of Scholarly Communication), Dorothy Byatt, Steve Hitchcock (DataPool Project Managers). Apologies from Jeremy Frey (Chemistry).