FAIR Principles: Good Practices for Scientific Data Management and Stewardship

News date: 23-10-2017

open science, scientific data, FAIR principles

Our lives are surrounded by data and we are immersed in its culture: open data, big data, linked data… The capacity to generate, store and process data keeps growing, driven by the widespread use of technological applications: for example, data created by millions of people who use digital services for personal or professional reasons, data generated by the expanding “Internet of Things”, or data from scientific research.

In this context, the scientific community was already fully committed to Open Science, which encourages data obtained from experimentation to be made publicly available automatically, especially data produced with public funds. What it still needed was a set of good practices for publishing scientific data that were clearly specified, widely shared and widely applied.

The management of academic publications in scientific journals had been well specified for some time, but the same could not be said of the formal publication of scientific data. This gap matters because data are now beginning to be considered the main product of scientific research, and their publication and reuse are necessary to guarantee validity and reproducibility and to lead to new discoveries.

On March 15, 2016, the article “The FAIR Guiding Principles for scientific data management and stewardship” was published in Scientific Data, a journal of the Nature group. The FAIR Principles offer a set of precise and measurable qualities that a data publication should have so that the data are Findable, Accessible, Interoperable and Reusable, as detailed below:

 

FINDABLE: Data and metadata can be found by the community after publication, using search tools.

F1. (Meta)data are assigned a globally unique and persistent identifier.

F2. Data are described with rich metadata.

F3. Metadata clearly and explicitly include the identifier of the data they describe.

F4. (Meta)data are registered or indexed in a searchable resource.
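As a rough illustration of the Findable qualities listed above, the following Python sketch registers a metadata record under a unique identifier and looks it up through a small search index. The names DatasetRecord and SearchIndex, the field names and the identifier "doi:10.1234/example" are all hypothetical, chosen only for this example; a real repository would rely on an identifier registry and a proper search service.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DatasetRecord:
    """Hypothetical metadata record; the field names are illustrative only."""
    identifier: str  # globally unique, persistent identifier (for instance a DOI)
    title: str
    keywords: List[str] = field(default_factory=list)


class SearchIndex:
    """Toy in-memory stand-in for a searchable registry of metadata."""

    def __init__(self) -> None:
        self._records: Dict[str, DatasetRecord] = {}

    def register(self, record: DatasetRecord) -> None:
        # Refuse duplicates so that each identifier points to exactly one record.
        if record.identifier in self._records:
            raise ValueError(f"identifier already registered: {record.identifier}")
        self._records[record.identifier] = record

    def search(self, term: str) -> List[str]:
        """Return the identifiers of records whose title or keywords contain the term."""
        term = term.lower()
        return [
            rec.identifier
            for rec in self._records.values()
            if term in rec.title.lower()
            or any(term in keyword.lower() for keyword in rec.keywords)
        ]
```

A search returns identifiers rather than the records themselves, which mirrors the idea that the identifier is the stable handle through which data and metadata are later retrieved.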

ACCESSIBLE: (Meta)data are accessible, so other researchers can download them using the identifiers assigned to them.

A1. (Meta)data are retrievable by their identifiers using a standardized communications protocol.

A1.1. The protocol is open, free and universally implementable.

A1.2. The protocol allows for an authentication and authorization procedure, where necessary.

A2. Metadata remain accessible, even when the data are no longer available.
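One possible way to picture retrieval by identifier over a standardized, open protocol is resolving a DOI over HTTPS. The sketch below assumes that the identifier is a DOI and that the registration agency behind it supports content negotiation for the CSL JSON media type, which may not hold for every DOI. The function names build_metadata_request and fetch_metadata are our own, not part of any library.

```python
import json
import urllib.request

DOI_RESOLVER = "https://doi.org/"
# Media type commonly offered through DOI content negotiation (assumed to be supported).
CSL_JSON = "application/vnd.citationstyles.csl+json"


def build_metadata_request(doi: str) -> urllib.request.Request:
    """Build an HTTPS request asking the DOI resolver for machine-readable metadata."""
    return urllib.request.Request(DOI_RESOLVER + doi, headers={"Accept": CSL_JSON})


def fetch_metadata(doi: str, timeout: float = 10.0) -> dict:
    """Send the request and parse the JSON response (requires network access)."""
    with urllib.request.urlopen(build_metadata_request(doi), timeout=timeout) as response:
        return json.load(response)
```

For example, fetch_metadata("10.1038/sdata.2016.18") would ask for the metadata of what is, to our knowledge, the DOI of the FAIR article itself. Where access is restricted, an authorization header could be added to the same request, in the spirit of A1.2.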

INTEROPERABLE: Both the data and the metadata should be described following the rules of the community, using open standards, in order to allow for their exchange and reuse.

I1. (Meta)data use a formal, accessible, shared and broadly applicable language for knowledge representation.

I2. (Meta)data use vocabularies that follow the FAIR principles.

I3. (Meta)data include qualified references to other (meta)data.

REUSABLE: (Meta)data can be reused by other researchers, since their origin and conditions of reuse are clear.

R1. (Meta)data have a plurality of accurate and relevant attributes.

R1.1. (Meta)data are released with a clear and accessible data usage license.

R1.2. (Meta)data are associated with information on their provenance.

R1.3. (Meta)data meet domain-relevant community standards.
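To make the Interoperable and Reusable qualities a little more concrete, here is a minimal sketch of a metadata record written as JSON-LD with the schema.org vocabulary, together with a small check that license and provenance information are present. The choice of schema.org, the list of required fields, the helper missing_reuse_fields and every value in the record (including the placeholder identifiers) are assumptions made for illustration; each community would apply its own standards.

```python
import json

# Fields treated here as the minimum for reuse (an illustrative choice, not a standard).
REQUIRED_FOR_REUSE = ("license", "creator")


def missing_reuse_fields(record: dict) -> list:
    """Return the required fields that are absent or empty in the record."""
    return [key for key in REQUIRED_FOR_REUSE if not record.get(key)]


# Hypothetical record; identifiers and names are placeholders.
record = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "identifier": "https://doi.org/10.1234/example",
    "name": "Ocean temperature series",
    "license": "https://creativecommons.org/licenses/by/4.0/",  # usage license
    "creator": {"@type": "Organization", "name": "Example Lab"},  # provenance
    "isBasedOn": "https://doi.org/10.1234/raw-example",  # reference to other data
}

# Serialize to JSON-LD text that other tools can exchange and parse.
serialized = json.dumps(record, indent=2)
```

A record that passes this check still says nothing about whether the license or the provenance statement is accurate; the code only verifies that the information has been declared.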

Open Research Data

The FAIR Principles deliberately avoid contentious matters such as the choice of technology or the approach used in the implementation. Thanks to this level of abstraction, several research funding organizations and policy makers have already accepted them.

The interest in applying these principles is reflected in their incorporation into the projects of the European Union’s Horizon 2020 Program for Research and Innovation.

Initially, during the 2014-2015 period, a pilot called “Open Research Data” was conducted in 7 selected work areas; it included data management plans based on the FAIR principles. The number of areas then grew until the pilot covered all subject areas of the Horizon 2020 Program, where all projects are now “Open Research Data” by default.

If you want to learn more about these principles and how to apply them to digital repositories, we recommend the webinar (in English) “FAIR Data in Trustworthy Data Repositories Webinar (DANS/EUDAT/OpenAIRE)”.

To wrap up, we must emphasize the key principle that underlies the FAIR principles: “as open as possible, as closed as necessary” (open by default). We hope that this will be the case and that the Open Data community will continue to grow in several directions thanks to specific initiatives such as the FAIR principles.