Mar 22 2012

Data management plans (DMPs): the day has arrived

Steve Hitchcock

Changed Days at Paddington Basin, Colin Smith, Geograph Project

Updated 28 March 2012

The day has arrived for data management plans (DMPs). It’s tomorrow (Friday 23rd March 2012) when Research Data Management Planning Projects from the JISCMRD (2011-13) programme convene a workshop in Paddington, London, to present their findings and results. But has the day for DMPs arrived in a bigger sense? Are DMPs pivotal to research data management? I suspect so, and at the meeting I will be looking for evidence to support the assertion, or not.

DMPs are the link between the conception and proposal of research projects, and the later production of data from those projects. These plans can be extensive and demanding to produce, but as a result the information they contain should be invaluable to data repositories. This is not the type of information the researcher is likely to provide again at the point of depositing data in an institutional data repository.

DMPs represent carefully planned information on the project and predict the existence of data, in some cases precisely. This creates a link with emerging research data policy, which requires an open record of data produced in the course of funded research and the effective management and storage of that data. DMPs have a role to play in monitoring and ensuring the completeness of the records.

This approach raises a series of questions about DMPs. What is the scope of a DMP, and who defines this? This is most likely to be the research funder, but might be institutions in other, non-funded cases. In what form will the DMP be completed? Presumably online. Where will online DMPs be hosted? The Digital Curation Centre, not a research funder, hosts the DMP Online tool. Should institutions create and/or host DMP tools? To what extent will it be possible to (pre-) populate data repository records from DMPs? How comfortable are funders, institutions and researchers about sharing and publishing information from DMPs? The answers will involve specifying where DMPs fit the researcher’s workflow, ensuring there is no duplication of effort, and allowing DMPs to be driven by the needs of research and researchers, not by systems requirements or other special pleading.

These are some of the issues that will be in my mind when listening to the DMP project presentations tomorrow, and which I shall report on afterwards.

Update. Following the JISC DMP meeting in Paddington, the Meeting (Disciplinary) Challenges in Research Data Management Planning Workshop, and further feedback from key presenters, some of the questions I posed can begin to be answered. For me the key presentations in this context were by Kerry Miller and Adrian Richardson from the DCC, who updated the meeting on future plans for the DMP Online tool, and by David Shotton, a zoology researcher at Oxford University who has been compiling a more researcher-friendly set of 20 DMP questions.

First, we were shown how DMP Online v3.0 now includes selectable templates for, for example, different funding council requirements, so we can see how customisation of DMP input forms is beginning to take shape. We can also see from the plans for DMP Online that one possibility being considered for the tool is the "Ability to host locally within institutions". This was my choice in the selection exercise, but it appears others did not rank this feature so highly. My recollection is that this group exercise was somewhat curtailed, and the tied rankings for many features suggest the returns were not high, so I hope the development team will not feel bound by the ranking of these results.

Now to David Shotton’s analysis of a short, customised set of DMP questions. If we are to host locally and customise DMP tools, we need to be careful we do not get away from the core requirements of the forms, which are not simply to suit individuals or institutions. They still have to be grounded in formal funder and research requirements. So having framed his 20 DMP questions, David looked into comparing and aligning his questions with known sets of DMP questions, including from DMP Online and the US equivalent DMP Tool, and others. To make this alignment David created a downloadable spreadsheet containing the aligned DMP questions, which can be found in a link towards the end of the blog post. An analysis of this comparison is provided in a follow-up post, also linked.

That’s enough for this update. It’s time to look at David Shotton’s analysis and at the plans for DMP Online in more detail. I expect to return to this. I can’t yet answer my later questions on whether researchers and funders will be happy with the approaches being considered here, nor how soon this work might come to fruition by integrating DMP tools into data repository-based research workflows. Even to suggest this phrase leaves me open to the accusation of over-egging this particular pudding. What I can say is that answers to my first questions, on customisation and hosting, have begun to be revealed and they are highly encouraging.