Dec 20 2012

Connecting research data roadmaps and business cases: the IDMB example for the University of Southampton

Steve Hitchcock

The sausage in the roll or the wafer-thin ham in the sandwich: as promised in the last post, this is the alternative to the ubiquitous benefits-evidence slides presented by each project represented at the JISC MRD workshop in Bristol. This presentation connects the development of roadmaps with the business case and policy for making progress with research data management (RDM) at an institutional level.

This was presented by Steve Hitchcock, but draws heavily on a report from the Institutional Data Management Blueprint (IDMB) Project, which began the work on research data management (RDM) at the University of Southampton now being taken on by DataPool. Mark Brown, Oz Parchment and Wendy White, co-authors of that report, are therefore the true authors of this presentation. Comment and interpretation are mine.

This version provides the notes for each slide used to inform the commentary for the presentation. It might be worth opening the Slideshare site (adverts notwithstanding) to switch between the slide notes below and the graphic slides – clicking on View on Slideshare in the embedded view will open these in a separate browser window.

Slide 2 Taking the IDMB example with others, connecting roadmaps with the business case and policy seems like a logical sequence, but in practice this is not always the case. At Southampton we have a roadmap and an official institutional research data policy, but the business case is still to be approved. Other institutions appear to have begun with a policy. Here we will focus on the roadmap and business case rather than policy.

Slide 3 If the IDMB project elaborated the roadmap, DataPool represents progress along the first part (18 months) of the first phase (3 years) of the plan, and is beginning to fill in components of the map, as can be seen by the links in this slide.

Slide 4 For reference, this is a recent poster designed to show graphically the full scope of the DataPool Project. It shows the characteristic tripartite approach of this and comparable JISC institutional RDM projects: policy, training, and technical infrastructure (data repository and storage services).

Slide 5 This middle phase of the Southampton RDM roadmap looks like it may have been the trickiest part of the map to elaborate. It’s not imminent and depends on outcomes from the first stage; on the other hand, it’s not so far away that we can avoid being aware of it and making plans for it. As seen in this extract, it essentially describes refinements of many of the expected developments from stage 1.

Slide 6 If looking ahead is trickier than framing immediate work, this final phase looking up to 10 years ahead might have been hardest to describe. It is, however, more aspirational in tone and less inclined to deal with specifics, an approach that seems appropriate for a phase so far ahead.

Slide 7 A recent and interesting comparison with the Southampton RDM roadmap is that from Edinburgh University. Edinburgh has a target completion date of early 2014, a startlingly short roadmap compared with a 10-year example. The two are not directly comparable, of course. The Edinburgh case looks to be a well-specified, well-structured and comprehensive first phase and can be commended for that. Whether it is achievable within the time and resources specified we cannot judge yet. The illustration reproduced here is a helpful representation of the plan – at least, it is once you’ve read the plan.

Slide 8 This extract connects the first progress report of the DataPool Project, by then-PI Mark Brown, with the roadmap and policy. It makes the clear point that research funder requirements (EPSRC, RCUK) had an important influence on adoption of the policy at an executive level, even if some discussion at this JISC MRD Benefits Meeting was around whether supporting compliance with such requirements can usefully be presented to researchers as a ‘benefit’.

Slide 9 Other JISC MRD projects that have roadmaps have similarly emphasised the influence of EPSRC requirements on the production of the roadmap.

Slide 10 Now we move on to the second part of the talk, the business case. The data.bris project from Bristol University was presenting in the same session at this event, so we will spare the detail here, but this extract from a recent blog post by the project illustrates some of the imponderables, Donald Rumsfeld-style, of forming a business case for RDM.

Slide 11 We are heading towards the critical part of this presentation, the financial numbers. First some context. This case covers just the technical infrastructure – IT services – not the wider factors outlined by data.bris. This business model has been updated and presented at the University of Southampton and, as we have already indicated, is currently undergoing further revision with a view to official acceptance. The assumption stated here is not based on the university’s current research data policy, which requires a record of all data produced in the course of research at the institution rather than full data deposit. The university can’t be said, therefore, to have stopped short, so far, of accepting the business case for supporting the costs of the policy. The data on usage of storage services and projected usage are the basis for the financials that follow.

Slide 12 In the style of the financial services industry, given there are a number of uncertain factors to accommodate in projections of the growth of storage requirements, this chart attempts to draw upper and lower bounded curves to underpin the calculations.
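As a rough illustration of how upper and lower bounds of this kind might be drawn, the sketch below projects storage demand under two compound growth rates. The starting volume, the growth rates and the time span are made-up placeholders, not figures from the IDMB report, and compound growth is itself an assumption about the shape of the curves.

```python
def project_storage(start_tb, annual_growth, years):
    """Return projected storage in TB for years 0..years under compound growth."""
    return [start_tb * (1 + annual_growth) ** y for y in range(years + 1)]


# All values below are hypothetical placeholders, NOT taken from the IDMB report.
START_TB = 100.0     # assumed storage in use at year 0
LOWER_GROWTH = 0.20  # assumed pessimistic-demand growth rate (20% a year)
UPPER_GROWTH = 0.50  # assumed high-demand growth rate (50% a year)
YEARS = 5            # assumed projection horizon

lower = project_storage(START_TB, LOWER_GROWTH, YEARS)
upper = project_storage(START_TB, UPPER_GROWTH, YEARS)

# Print the band between the two curves, year by year.
for year, (low_tb, high_tb) in enumerate(zip(lower, upper)):
    print(f"Y{year}: {low_tb:8.1f} TB to {high_tb:8.1f} TB")
```

The gap between the two curves widens each year, which is why costs calculated from them are better read as a range than as a single forecast.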

Slide 13 This illustration also comes directly from the IDMB report. Allowing that the metadata should ideally attach to both active and archive layers, the cost factors introduced here are access bandwidth and latency, and storage technology. The basic choices considered are between more expensive, faster-access disk storage and slower tape stores.

Slide 14 Now we get to the actual financial numbers resulting from this analysis. The number that stands out is Y3 in the disk-based scenario, which not only rises above £1M for the first time but gets closer to £2.4M. Subsequent annual costs shown here remain above £1M for this scenario. The slower tape-based costs are always lower.

Slide 15 With the numbers identified, the critical decision is how to pay for the service. This was an important issue for the second DataPool Steering Group meeting recently. A full free-at-point-of-use service may be the simplest, if most expensive, option for the institution, but it has been strongly argued that RDM must be viewed as a direct cost of research, and funded accordingly. The dilemma for institutions is how much to invest in infrastructure directly, compared with leaving projects to raise additional costs for data management and risking research bids becoming less competitive than those from institutions with more generous direct support.

Slide 16 In summary, roadmaps are useful for focussing discussion on research data management at an institutional level, and for engaging other stakeholders across all disciplines. Given that a roadmap should be based on prior consultations with those stakeholders, it follows that subsequent interaction with the roadmap should lead to further consultation. The roadmap must therefore be used as a living document. Southampton has not yet finalised its business case for supporting RDM, but it has established a process through engaging with the roadmap in the first instance.

Dec 20 2012

DataPool benefits-evidence table

Steve Hitchcock

JISC, funder of DataPool, of other projects in research data management, and many more projects on widening use of digital technology in education, tends to focus on areas close to practical exploitation. On the R&D spectrum, it is typically towards the development end. For project managers, therefore, there is an emphasis on procedures and tools to increase the impact of practical outcomes – evaluation, sustainability, exit strategies, technology transfer, etc.

Another planning tool being adopted in the Managing Research Data Programme (MRD) 2011-13, of which DataPool is a part, is benefits-evidence analysis. As this description suggests, the idea is to elaborate prospective benefits of a project, and then identify the evidence that will demonstrate whether or not the benefit has been realised. It is as much about informing the process of getting to the results, and identifying which results are important and achievable, as the results themselves.

Hence, JISC MRD projects were invited to Bristol for a 2-day programme workshop at the end of November to present their benefits-evidence slides. If this sounds a little repetitive, it is, but not uninteresting, especially as in preparing for the workshop all projects had essentially to engage in the same analysis, and were therefore armed not just with their own slide but ready to comment on others.

For project managers used to working towards outputs (products or services arising from the project) and outcomes (effects of the outputs on users in the target community), benefits are another factor. Hence, the JISC MRD programme has recruited a team of evidence gatherers, to work with and assist projects to hone and refine the benefits they are working towards and the consequent evidence measures. “Those are more outputs than benefits” I was advised, fairly, during open discussion on some ‘benefits’ in my slide. But then I had seeded the slide with points to discuss rather than a definitive list, and unwittingly extended the project’s previously discussed benefits.

So after the workshop I was grateful for the advice of Laura Molloy, evidence gatherer for DataPool, on aligning our pre- and post-workshop benefits lists.

After all that effort it would be remiss not to reveal the benefits-evidence table that emerged from the process. For the record, here are the benefits DataPool will seek to demonstrate in its final months into early 2013.

DataPool: Benefits-Evidence

Benefit 1: Improved RDM skills across the target community, including researchers and professional support staff
Evidence:
- Qual reporting on effectiveness of training events.
- Feedback from training courses and deskside consultations, DMP and email help services.
- More staff running RDM support services, increased service offer.

Benefit 2: Greater visibility and use of institution’s research data / research outputs through sharing, collaboration, reuse
Evidence:
- Qual case study describing improved dataset exposure.
- Qual evidence of DMP engagement, including early indications of access routes.
- Quant indication of increase in dataset downloads. *
- No. of datasets stored in data repository.
- Accesses of open datasets vs closed datasets vs shared datasets.

Benefit 3: Sustained institutional support for RDM / sustainability for RDM infrastructure at institution
Evidence:
- No. of training opportunities introduced.
- Scope of: deskside consultations, DMP support service.
- Results from case studies – engagement with existing data facilities.
- Assessment of added value for institution of using institutional storage over other options – report.

Benefit 4: Improved use/uptake of RDM infrastructure
Evidence:
- Quant account of ‘bid preparation consultations’, inc. qual narrative of referrals to data policy and DMP help.
- Case study on working with data policy – feedback on uptake of policy.
- Quant tracking of higher attendance at training.
- Accesses to RDM guidance documents.
- No. of deskside consultations.
- Quant indication of improved uptake of institutional storage and deposit options. *
- No. of large data projects switching to institutional data service.

Benefit 5: Time / costs saved by improved RDM infrastructure
Evidence:
- Identifying early cost-benefits – combined case studies report, inc large data projects, open data, imaging, disciplinary efficiencies.
- Assessment of added value for institution of using institutional storage over other options – report (see 3).

* This evidence is not expected to be available during the DataPool Project, as it follows the launch of the RDM repository service by project end, but it will be collected in ongoing work at Southampton University on institutional RDM.

Table by Steve Hitchcock for DataPool, in collaboration with Wendy White and Dorothy Byatt. We gratefully acknowledge the feedback and suggestions from Laura Molloy, JISC evidence gatherer.

The University of Southampton has a 10-year roadmap for research data, of which DataPool represents the first stretch of road, so there is a commitment to go further, but the clearer the steer from DataPool, the faster the progress afterwards.

As a little light relief from projects’ benefits-evidence slides, a presentation on the Southampton roadmap and business plan was given at the Bristol workshop. That will be covered in a separate post.

How will you know which benefits have been achieved as the project moves forward? This post is tagged with the label ‘benefits’. All updates reporting evidence from the table above will use this tag. Tags can be found in the column immediately to the right of this one, and up, from this point in the post.

This is how other JISC MRD projects are tackling these challenges and what benefits-evidence is being targeted: