Engaging with research data producers
Institutional research data projects such as DataPool and others may focus on concrete outputs such as repositories or policies for research data. While these will be positive steps, ultimately these projects may be judged on the extent to which they can engage with research data producers and use that engagement to inform the development of the outputs in the longer term. Here I simply want to connect two recent, and quite different, works that may help shape our thinking on engagement. This is a big topic that we will inevitably return to.
The two works I refer to above are:
- A report Introducing research data, which includes a number of research data case studies at the University of Southampton, and is used as a guide for research students. This was conceived and developed by colleagues in DataPool.
- A section (‘What about the data?’) from a blog post Latest progress for RD@Essex.
In the Southampton report the case studies are preceded by sections on data categorisation (where data comes from, forms of data, and how it might be represented electronically) and the data lifecycle, which are used as templates for the case studies. The Essex work has a short data classification, compiled using an approach similar to the DAF (Data Asset Framework), a well-used tool that can help with user engagement. There are almost certainly other similar examples.
The two approaches clearly differ in scope and scale, but they have some points of alignment, and both begin to give us some insight into research data workflows. A workflow is the set of sequential processes by which data, in this case, and any intermediate forms and versions, are produced, stored and used. Most structured work involves a workflow of some kind, but it is typically of less interest when the workflow is well established. Interest in workflow grows when the workflow is in flux, which we believe it may be in many cases for research data.
How does this help engagement with data producers? First we need to identify and contact as many research data producers as we can within our institutions. Then we have to demonstrate an understanding of how they produce and use data, and structure the engagement to get some insights into how we can help them manage their data more effectively given the emerging requirements, policies and practices that will affect research data. At the heart of this will be modelling workflows that produce research data.
These classifications, categorisations and case studies can help us model research data workflows, which we can then use in turn as a framework to guide further engagement to find out more – about how workflows might be changing, how to optimise data management through all workflow stages, and how new institutional research data services can assist.
We want to differentiate ‘engaging’ from ‘advocacy’. Where the extended infrastructure required for research data repositories (e.g. storage, deposit interfaces) is not yet in place, we have a chance to engage with potential users to find out what services they want rather than trying to sell them a set of services we already have.
How are other research data projects engaging with data producers? We believe we have a strong basis for further engagement based on the Southampton report, but we are keen to learn from other examples such as RD@Essex.