Tips for preparing a Data Management Plan, based on the BiodivERsA guide
News date: 27-10-2020

BiodivERsA is a network of organisations focused on research on biodiversity and ecosystems in European countries and territories, to promote their conservation and sustainable management. Among other actions, this network has published a Guidance document on data management, open data, and the production of Data Management Plans in the framework of scientific research.
The document has been developed in the context of Horizon 2020, with the aim of guiding project teams funded through joint transnational research calls in the drafting and development of their Data Management Plan, with a focus on making their data and publications as open as possible.
The report begins with an introduction on the importance of scientific data and its management, the principles of open science, open data and FAIR principles. It then analyses the concepts and needs of data management in the context of this type of internationally funded project, and ends by highlighting a number of interesting tools and resources.
The importance of scientific data and its management
Organising data and making it accessible is becoming increasingly important in the world of science, with the aim of improving traceability and encouraging data sharing. This not only improves the transparency of studies, but also encourages the reuse of data in new research that generates knowledge for the benefit of society.
The guide refers to a survey carried out by CrowdFlower. According to this study, and contrary to what might be expected, data scientists do not spend most of their time building algorithms, exploring data or carrying out predictive analysis. Instead, most of their time goes into cleaning and organising the data. An improvement in this area would therefore mean a significant gain in efficiency, resource optimisation and cost reduction.
The authors of the guide stress that science today requires more systematic open access to scientific data and, to this end, they analyse concepts such as data sharing, open access or FAIR principles that scientific data must comply with in order to be shared to its full potential.
Keys and benefits of developing a Data Management Plan (DMP)
A Data Management Plan (DMP) is a document that describes the management cycle of the data that will be collected and processed in the course of a research project. The guide draws on another report to highlight the main benefits of creating and developing a DMP:
- Increased efficiency during the project
- Allows data to be collected and stored in a more structured way
- Prevents or minimises the risk of data loss
- Allows data to be shared and reused with guarantees
- Increases the verifiability of the research
- Increases the longevity of the project by making data available even after the project ends
Structure of a Data Management Plan
DMPs are unique: their content, composition and structure can vary greatly as they depend on the project and the data generated. However, to ensure that all aspects are covered, the report proposes a generic structure for a DMP that can be modified or adapted according to the needs of each project. This structure is organized in nine sections, with a series of questions to facilitate its drafting. Some examples of these questions are included below:
- Data Managers. Who manages the data? Does the research team include a data expert?
- Data Identification & Description. What is the purpose of the research? What data is used and in what format? How often is it collected?
- Data Organization & Exchange. How is the data managed? Where is it stored? Who has access to it?
- Data Storage & Back-up. What is the strategy for data back-up and storage? How frequently do you do your backups?
- Data Sharing, Standards & Metadata. Are you using a file format that is standard to your field? What tools are required to read the data? Is supporting documentation being generated?
- Data Restrictions. How open will the data be? Is there a plan to protect or anonymise the data if necessary?
- Data Publishing & Licensing. Where and how will the data be released, and under what licenses?
- Data Archiving. How will the data be managed when the project ends to ensure its long-term availability? Will it be published with a Digital Object Identifier (DOI)?
- Costs. What are the estimated costs of managing the data and how have these costs been accounted for?
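As a rough illustration, the nine-section structure above can be treated as a checklist that flags which sections of a draft DMP are still unanswered. The sketch below is only one possible way to do this: the section names come from the list above, while the `DataManagementPlan` class, its methods and the example answers are hypothetical and do not appear in the BiodivERsA guide.

```python
from dataclasses import dataclass, field

# Section names follow the nine-part structure listed above. Everything else
# (class name, method names, sample answers) is an assumption for illustration.
DMP_SECTIONS = [
    "Data Managers",
    "Data Identification & Description",
    "Data Organization & Exchange",
    "Data Storage & Back-up",
    "Data Sharing, Standards & Metadata",
    "Data Restrictions",
    "Data Publishing & Licensing",
    "Data Archiving",
    "Costs",
]


@dataclass
class DataManagementPlan:
    project: str
    # Maps a section name to the free-text answer drafted for it.
    answers: dict = field(default_factory=dict)

    def answer(self, section, text):
        """Record the answer for one section, rejecting unknown section names."""
        if section not in DMP_SECTIONS:
            raise ValueError(f"Unknown DMP section: {section}")
        self.answers[section] = text.strip()

    def missing_sections(self):
        """Return the sections that still have no non-empty answer, in order."""
        return [s for s in DMP_SECTIONS if not self.answers.get(s)]


# Hypothetical draft with only two sections filled in so far.
plan = DataManagementPlan(project="Example project")
plan.answer("Data Managers", "One data steward within the research team.")
plan.answer("Costs", "Storage and curation budgeted in the proposal.")
print(plan.missing_sections())
```

A team could run a check like this before submitting a proposal to confirm that no section has been left blank; since the guide notes that the structure can be adapted to each project, the list of sections would be adjusted accordingly.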
The report also makes a number of general and practical recommendations which apply to all types of projects and their management plans, such as that free and easily accessible Open Science tools should be used as far as possible or that data generated by the project should be posted on a single website.
Tools and resources
The report ends with a series of tools and resources, such as data repositories or the most important biodiversity data standards.
Data repositories
Funded research projects must store and make available their project data to other users through the main national and international archives and storage services.
The report divides the repositories into two main groups: general repositories, which are open to all research fields, and specific repositories, which focus on particular subjects. The following table shows some examples of general research repositories.
You can access these repositories through the following links:
- Data Archiving and Networked Services
- Datahub
- Dataverse
- Dryad Digital Repository
- EUDAT
- Figshare
- Mendeley data
- OpenAire
- Registry of Research Data Repositories
- Zenodo
The report also shows examples of specific repositories in the area of biodiversity such as Arctic Biodiversity Data Service or Dynamic Ecological Information Management System.
Standards and licences
The use of norms and standards by the biodiversity research community greatly enhances the interoperability of published datasets. Some of the main ones can be found on this website.
Licensing, on the other hand, boosts visibility, especially when using attribution licenses. In this regard, the report provides a list of resources to better understand and address licensing and other publication considerations.
In conclusion, drawing up a data management plan is of the utmost importance in the scientific field, as is subsequently storing the data in the appropriate repositories. Both steps promote the reuse of this information and, with it, the development of new research that advances human knowledge. The report offers the keys needed to develop such a plan step by step, in a practical and simple way.