Mar 25 2013

Institutional alignments for progressing research data management

Steve Hitchcock

Can visualisation of alignments – of people and ideas across an institution – reveal and predict progress towards research data management (RDM)?

DataPool has been seeking to institute formal RDM practices at the University of Southampton on three fronts – policy, technical infrastructure, and training – as we have noted before. In addition, the university has a longer-term roadmap looking years beyond the point reached in DataPool.

One aspect of this work we haven’t addressed is the alignments that have been instrumental in making progress on these three fronts. It follows that if we can visualise these alignments then not only does this chart progress but it may reveal new alignments that need to be forged looking forward, and where there may be gaps in existing alignments there could be lessons for future progress. Since in terms of these alignments the University of Southampton may be distinctive but not unique, this analysis might extend to other institutional RDM projects. That is the idea, at least, behind the latest DataPool poster presentation, shown below, prepared for the final JISC MRD Programme Workshop (25-26 March 2013, Aston Business School, Birmingham).

Within DataPool we have established formal and informal networks of people that connect with and cross existing institutional forums. For example, the project has close and regular contact with an advisory group of disciplinary experts, has established a network of faculty contacts, and has been working with the multidisciplinary strands of the University Strategic Research Groups (USRGs) and with senior managers and teams in IT support (iSolutions) and Research and Innovation Services (RIS). At the apex, we have a high-level steering group that spans all of these areas and, in addition, includes senior institutional managers (Provost, Pro-VC) as well as leaders from external data management organisations. A series of case studies provides insights into the current data practices and needs of those researchers who are data creators and users.

Returning to the three fronts of our investigations, we have reached either natural and expected conclusions ready to be taken forward beyond DataPool, or in some cases incomplete and possibly unexpected conclusions. Below we reveal and assess the alignments that have driven progress on these three fronts:

Policy. Approved by Senate, the University’s ‘primary academic authority’, following recommendations from the Research and Enterprise Advisory Group (REAG), and officially published within the University Calendar. This alignment did not happen by chance, but began to be formed by the library team through the IDMB project and was taken forward within DataPool. Supporting documentation and guidance for the policy is provided on the University Library web site. The policy is effective from publication, but with a ‘low-profile’ launch and follow-up it has by design not had widespread impact on researchers to date.

Data infrastructure. Research data apps for EPrints repositories, with selected apps installed on ePrints Soton, the institutional repository, which is now better structured for data deposit. Progress made with initial interfaces in SharePoint, the university’s multi-service IT support platform, to describe data projects and facilitate data deposit; some user testing has taken place, but this work currently remains incomplete. On storage infrastructure it has not been possible to cost extensions to the existing institutional storage provision, a limitation in extending data services to large and regular data producers, who by definition are the most active data researchers. One late development has been to add support for minting and embedding DataCite DOIs for data citation in data repositories at Southampton.

Training and support. Principally extended towards PhD and early career researchers, and in-service support teams in the library. Plans to embed RDM training within the university’s extended support operations across all training areas, Gradbook and Staffbook. One highlight in this area is the uptake of support for data management planning (DMP), particularly at the stage of submitting research project proposals for funding.

In these examples we can see alignments spanning governance-IT-services-users.

From the brief descriptions of these fronts it can be seen that the existing alignments have brought us forward, but to go further we have to return to those alignments and reinforce the actions taken so far: to widen awareness, impact and uptake of policy; to provide adequate and usable RDM infrastructure for data producers; to develop and integrate training support within the primary delivery channels.

Almost all of these outcomes and the need for more follow-through can be traced to the alignments. However, the elusive element common across these alignments is the researcher and data producer, despite being a perennial target. Data initiatives, whether from institutions or wider bodies such as research funders, start out with the researcher in mind, but can lose momentum if the researcher appears not to engage. That may be because the benefits identified do not align with the interests of the researcher, or it may be because at a practical level the support and resources provided are insufficient. Thus the extended alignments required for full RDM do not materialise. Worse, those involved in the existing alignments can be prematurely discouraged, lacking the incentives and confidence to promote the real innovation they have delivered, which in turn affects investment decisions and service development.

Where the researcher is engaged the results can be quite different, as seen in the DataCite example, motivated and developed by researchers, and in DMP uptake, where researchers clearly begin to recognise both the emergence of good practice in digital data research and the need for compliance with emerging policy.

These alignments are a crucial but largely unnoticed aspect of DataPool, and no doubt of other similar #jiscmrd projects at other institutions as well. If this analysis is correct then for institutional-scale projects alignments can both reveal and predict progress.

Feb 6 2013

Love research data management: training event 14 February

Steve Hitchcock

Developing DataPool’s declared approach to student training for research data management at the University of Southampton, notably for PhD and Early Career Researchers, an introductory session aimed at students in the WebScience Doctoral Training Centre (DTC) will be held on Thursday 14th February. The following notice for this event, with joining information, has been issued by the DTC office.

Research Data – To Infinity and beyond … : Managing your research data for the future

Does your research data have life beyond your current project? Are there things that you can do now to make it easier to store, archive and share your data for future re-use?

Come along on Thursday 14 February at 13.00-14.00 to Rm 3073, building 32 where we will be raising awareness of “good practice” principles in the management of research data that will help enable future sharing.

The session will include a talk from Mark Scott, PhD Researcher Faculty of Engineering and the Environment, who produced the “Introducing Research Data” guide, and Dorothy Byatt, co-Project Manager, DataPool project plus Patrick McSweeney demonstrating the ‘hot-off-the-press’ ePrints Research Data App. There will be opportunity for discussion and questions, during and after the event.

Lunch provided – RSVP Claire Wyatt by 12 February 2013. Please accept this invitation to reserve a place at this seminar.

Hashtag #webscirdm

Jan 29 2013

DataPool expands on student RDM training approaches at IDCC13, Amsterdam

Steve Hitchcock

Two presentations from the University of Southampton at the 8th International Digital Curation Conference (#IDCC13) set out its approach to providing training for research data management for postgraduates. Taking a broad approach, DataPool gave a poster on working with PhD and Early Career Researchers, described as featuring “examples of essential building blocks coming out of researcher-focussed work”. In the main conference a team from the Faculty of Engineering and the Environment presented a paper jointly with DataPool on a booklet “Introducing Research Data” they have produced, and included some startling findings from initial training sessions with students. Below Mark Scott from that team introduces the booklet, and we reproduce the live Twitter record of the presentation, which highlights the main points from the talk identified by the Twitter reporters, particularly those findings on student responses.

Introducing Research Data booklet – cover sheet

We recently presented some of our postgraduate training material at the IDCC conference in Amsterdam. With so much data out there, and much of today’s research relying on large scale data sets, it is important to educate researchers about their data – and its value – early.

Our approach was two-fold: a lecture to introduce research data management to first year postgraduates, and a booklet introducing the area. The talk concentrated mainly on the booklet we produced.

The booklet had three sections: an introduction to types of research data, some case studies showing real-world examples of the types of data in use, and some best practices. For the case studies, we looked at five researchers’ work from medicine, materials engineering, aerodynamics, chemistry, and archaeology, and tried to show the similarities and differences between the data types they produce using the categories from the first section.

The concepts in the booklet have been presented twice as a training lecture in the Faculty of Engineering and the Environment, and the material has also been used in the WebScience Doctoral Training Centre. The feedback from students suggests that being made to think about these issues is necessary and useful, and engaging them at this stage helps cultivate good practices.

Mark Scott

Mark Scott et al, #idcc13 slide 13, Feedback From Lectures

Below is the Twitter record of Mark’s talk on 16 January 2013, from #idcc13, in the chronological sequence of posting.

Meik Poschen @MeikPoschen Next up is Mark Scott, University of Southampton on ‘Research Data Management Education for Future Curators’ #idcc13

Marieke Guy @mariekeguy #idcc13 Mark Scott from Uni of Southampton – post graduate training, created a magazine style booklet for all first year students

Archive Training @archivetraining Southampton University give magazine style RDM booklet to 1st year PG students. #IDCC13

@MeikPoschen Booklet to introduce RD to first year students with 3 sections: 1) five ways to think about RD, 2) case studies, 3) DM best practice #idcc13

Jez Cope @jezcope Good, thorough description of the Southampton approach to RDM education from Mark Scott. #idcc13

@mariekeguy #idcc13 Uni of Southampton – 5 ways to think about data: creation, forms of research, electronic rep, size/structure, data lifcycle

Gail Steinhart @gailst Southampton’s RDM guide for first year post-graduate students: (PDF) #idcc13

Full link added:

@jezcope I want to see Southampton’s RDM booklet, which includes ways to think about data, case studies and best practices. #idcc13

SMacdee @SMacdee #idcc13 – mark scott (U of Soton) RDM education booklet for Postgrads: 5 ways to think about research data; case studies; best practices

@MeikPoschen Booklet: 2) case studies giving an overview on various disciplinary examples, covering Genetics, Materials Engineering, Archaeology #idcc13

Mariette van Selm @mvanselm +1 RT @jezcope: I want to see Southampton’s RDM booklet, which includes ways to think about data, case studies and best practices. #idcc13

Odile Hologne @Holo_08 Five Ways to Think About Research Data Mark Scott course for students #idcc13

Corrected link:

@mvanselm Mark Scott (Uni of Southampton) on RDM education: “When you start scaring students, they start paying attention” #idcc13 :)

@mariekeguy #idcc13 Southampton noted that students happier with RDM lecture when delivered later in year – had real data experience at that stage

@jezcope Feedback for Soton’s RDM lecture was better when it was delivered later in the year. PGRs need to have some data experience first? #idcc13

@archivetraining Southampton’s feedback: RDM lecture was more positive when given in month 7 – when data collection was underway. The power of fear! #IDCC13

@MeikPoschen Booklet now part of University of Southampton’s wider (10 year) training etc. scheme #idcc13

@archivetraining Lots of RDM guides for lots of different audiences being shown-off at #IDCC13. Here’s ours [in German – English coming]

For a view of wider coverage and activities with a research data training theme at #IDCC13, we return to our colleagues at @archivetraining:

“The impression I took from this bundle of presentations (mostly funded by the excellent JISC Managing Research Data programme) was that projects doing data management training or support have to effectively design a campaign strategy, as one would for an election. Digital curation is akin to a valence issue – we all like sharable, long-term secure data – but how we get there needs to be thought about.” More

Or for more views on #IDCC13 as a whole, see this collection of post-conference blog posts.

Jan 28 2013

Positive Poster’ing for IDCC 2013

Dorothy Byatt

Creating the poster for the International Digital Curation Conference (#IDCC13) was different to the ones we have done thus far. Although very much linked to the DataPool project, the choice of content was restricted only by the need to be of interest to the theme of the conference – “Infrastructure, Intelligence, Innovation: driving the Data Science agenda”. Our choice was to focus on our collaborative work with PhD and Early Career Researchers, that is, helping to embed and enable good research data management practices in the institution.

Gareth Beale and Hembo Pagi have been investigating 3D and 2D raster imaging being used in the University. We look forward to their report. A group of researchers came to a working lunch, led by iSolutions and the DataPool team, to look at progress on a SharePoint data deposit option and provided valuable feedback. Another development that will be of great assistance to those looking to capture a snapshot of life and society is that of a Twitter archiver using ePrints, currently in beta development. One snapshot will be of #IDCC13 tweets. Yet another collaboration was with Mark Scott on his ‘Introducing research data’ guide and on a data sharing system for the Heterogeneous Data Centre (HDC). More details of his work and the paper he presented will follow in our linked second IDCC blog. So there was our content, examples of essential building blocks coming out of researcher-focussed work.

And that just left the design …!

Jan 21 2013

Mapping training needs for the support team

Dorothy Byatt

In a large University many of the roles required to support Researchers in their work are spread across a variety of services. As each service has its own focus and strengths, each is able to assist at the point in the research process most appropriate to its skills. What happens when there is a need to develop new skills, update existing ones or increase awareness of new requirements that impact all these services? How do you identify who needs to know what, or who needs to know about what other areas are doing? While it may at first seem quite straightforward to identify the major areas where training might be required, where does this fit in with the research lifecycle and who therefore will need to have a greater awareness or expertise in specific areas? What skills do the researchers need and at what point? Does this impact on what training is required? These are the questions that I have been taking time to consider with regard to research data management and how through training we can enhance our network of support.

wordle of training needs

There seem to be two key aspects to training the support teams. The first is to look at what they already know, what areas they are responsible for and what they are currently doing to help the researcher. The second is to look at what support the researcher needs, at what point in the research lifecycle, and who will provide it. Having looked at this, it seemed important to be able to map these aspects in some way. There has been a lot of recent work looking at the needs of the Researcher, resulting in the Research Development Framework and the Information Literacy Lens based on it, which very much show the researcher at the centre of the process, as they should be. It seemed sensible to take this approach in the initial scoping of the training needs and see where the skills fit in the research lifecycle.

Taking the research lifecycle steps listed in Brewerton (2012, p.104) and using some of the aspects of knowledge understanding identified in the Information Literacy Lens, work began on trying to map the advice, guidance and support the researcher might need. This proved useful in breaking down the various types of questions that might arise and scoping for the future. It was also helpful in thinking through the wider implications of data management on policy and procedures, how data fits in with other professional development and the need for integrated support.

The DataPool Training Matrix v1 is the result of this work. There are empty boxes where you might expect content because it was covered elsewhere and we didn’t want to repeat it. There are probably still skills that could be included, but we have learned what we needed to for our current purposes. It is still, to a lesser extent, a work in progress, and we would welcome comments.

Brewerton, A 2012 Re-Skilling for Research: Investigating the needs of researchers and how library staff can best support them. New Review of Academic Librarianship 18(1):96-110