4 examples of harmonisation of datasets

News date: 10-06-2021

4 examples of harmonisation projects

In any project related to data, it is common to have different sources of information. Data is key for companies and public administrations, in decision making or as a basis for the implementation of projects, services or products. But if these data sources display information in a heterogeneous way, it is difficult to operate.

In the world of open data, each administration covers a different scope, be it territorial (municipal, provincial, regional or national) or jurisdictional (for example, each ministry deals with data from a specific area: ecological transition, health, mobility, etc.). To carry out projects that cover several areas, we need interoperable data. Otherwise, data cannot be exchanged or integrated within and between organisations.

Why is data harmonisation important?

Today, public administrations manage large amounts of data in different formats, with different management methods. It is common to host multiple copies in many different repositories. These data are often disseminated in portals across Europe without any harmonisation in terms of content and presentation. This explains the low level of re-use of existing information on citizens and businesses. Harmonisation of information allows for consistent and coherent data in a way that is compatible and comparable, unifying formats, definitions and structures.
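As a rough illustration of what unifying formats, definitions and structures can mean in practice, the following Python sketch maps records from two sources onto one common schema. The source names (`city_a`, `city_b`), the field names, the date formats and the sample record are all invented for this example and do not come from any of the projects described here.

```python
from datetime import datetime

# Hypothetical field mappings: each source uses its own column names.
FIELD_MAPS = {
    "city_a": {"nombre": "name", "fecha_alta": "registered", "cp": "postcode"},
    "city_b": {"Name": "name", "RegistrationDate": "registered", "ZIP": "postcode"},
}

# Hypothetical date formats used by each source.
DATE_FORMATS = {"city_a": "%d/%m/%Y", "city_b": "%Y-%m-%d"}


def harmonise(record, source):
    """Rename fields to the common schema and normalise dates and postcodes."""
    mapping = FIELD_MAPS[source]
    out = {mapping[key]: value for key, value in record.items() if key in mapping}
    # Dates are rewritten as ISO 8601 (YYYY-MM-DD) whatever the source format.
    parsed = datetime.strptime(out["registered"], DATE_FORMATS[source])
    out["registered"] = parsed.date().isoformat()
    # Postcodes become five-character strings, restoring lost leading zeros.
    out["postcode"] = str(out["postcode"]).zfill(5)
    return out


a = harmonise({"nombre": "Mercado Central", "fecha_alta": "03/02/2020", "cp": 8001}, "city_a")
b = harmonise({"Name": "Mercado Central", "RegistrationDate": "2020-02-03", "ZIP": "08001"}, "city_b")
print(a == b)  # True: both sources now describe the entity identically
```

Without an agreed schema, every reuser has to write and maintain mappings like these for each source, which is the cost that harmonised publication avoids.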

This shaping of data can be done individually for each project, but it entails a high cost in terms of time and resources. It is therefore necessary to promote standards that allow us to have already harmonised data. Below are several examples of initiatives that advocate the search for common requirements, which are included in this visual:

 

4 examples of dataset harmonisation projects: Ministry of Transport, Mobility and Urban Agenda, UniversiData, Spanish Federation of Municipalities and Provinces, and Asedie.

Ministry of Transport, Mobility and Urban Agenda

The Ministry of Transport, Mobility and Urban Agenda is working on a National Access Point (PAN, by its Spanish acronym) that collects unified data on different modes of transport. The portal is being created to comply with Commission Delegated Regulation (EU) 2017/1926, which obliges authorities, operators, managers and providers of transport services to provide information on multimodal journeys in the EU, based on a series of specifications that ensure its availability and reliability. Among other issues, the regulation indicates that the content and structure of the relevant travel and traffic data need to be adequately described using appropriate metadata.

The creation of this Single Access Point was published in the Official State Gazette (BOE) on 22 February. The text indicates that the minimum universal traffic information related to road safety will be made public, whenever possible and free of charge, with a special focus on real-time services.

At the moment, the PAN has data from the DGT, the Basque Government, the Generalitat de Catalunya, the Madrid City Council and the company TomTom.

Spanish Federation of Municipalities and Provinces

The Spanish Federation of Municipalities and Provinces (FEMP, by its Spanish acronym) has an open data group that has developed two guides to help municipalities implement open data initiatives. One of them proposes 40 datasets that every administration should open to facilitate the reuse of public sector information. This guide seeks uniformity not only in the categories of data published, but also in the way they are published. A fact sheet has been created for each proposed dataset with information on update frequency, formats and recommended display form.
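A fact sheet of this kind can also be read by a program. The sketch below is only an assumption about how one might be modelled: the `FactSheet` class, its field names and the sample values are hypothetical and are not taken from the FEMP guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FactSheet:
    """Hypothetical model of the recommendations for one dataset."""
    dataset: str
    update_frequency: str
    formats: tuple
    display: str


def missing_formats(sheet, published_formats):
    """Return the recommended formats that a publisher does not yet offer."""
    published = {fmt.lower() for fmt in published_formats}
    return [fmt for fmt in sheet.formats if fmt.lower() not in published]


# Invented example values, for illustration only.
sheet = FactSheet("Street map", "annual", ("CSV", "JSON"), "map")
print(missing_formats(sheet, ["csv", "XLS"]))  # ['JSON']
```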

FEMP's future plans include reviewing the datasets published so far to assess whether to add or remove datasets and to include new practical examples.

Also in the field of cities, the Ciudades Abiertas project, carried out with the collaboration of Red.es, is putting the harmonisation of a limited subset of datasets into practice. The city councils participating in the project (A Coruña, Madrid, Santiago de Compostela and Zaragoza) have agreed to open 27 harmonised datasets. Common vocabularies have so far been developed for 16 of them, and work continues on the others.

ASEDIE and its Top 3

In 2019, the Multisectoral Information Association (ASEDIE) launched an initiative for all Autonomous Communities to fully open three sets of data: the databases of cooperatives, associations and foundations. It also proposed that they all follow unified criteria to facilitate reuse, such as including the NIF (tax identification number) of each entity.

The results have been very positive. To date, 15 Autonomous Communities have opened at least two of the three databases. The database of associations has been opened by all 17 Autonomous Communities.

In 2020, ASEDIE proposed a new Top 3 and started to promote the opening of new databases: commercial establishments, industrial estates and SAT registers. However, because not all Autonomous Communities have a register of commercial establishments (it is not a regional competence), this dataset has been replaced by the Register of Energy Efficiency Certificates.
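To show why a shared identifier such as the NIF makes reuse easier, the following Python sketch combines registers from two regions into a single index keyed by that identifier. The helper names, the register layout and the sample values are invented placeholders, and the normalisation step does not validate real NIFs.

```python
def normalise_nif(value):
    """Drop separators and upper-case so equivalent spellings compare equal."""
    return "".join(ch for ch in value if ch.isalnum()).upper()


def index_by_nif(*registers):
    """Merge any number of registers into one dictionary keyed by NIF."""
    index = {}
    for register in registers:
        for entry in register:
            index[normalise_nif(entry["nif"])] = entry
    return index


# Invented sample records from two hypothetical regional registers.
region_a = [{"nif": "G-00000001", "name": "Asociación Ejemplo Uno"}]
region_b = [{"nif": "g00000002", "name": "Asociación Ejemplo Dos"}]

index = index_by_nif(region_a, region_b)
print(sorted(index))  # ['G00000001', 'G00000002']
print(index[normalise_nif("G 00000002")]["name"])  # Asociación Ejemplo Dos
```

If a register omits the identifier, entities have to be matched by name instead, which is far more error-prone.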

UniversiData

UniversiData is a collaborative project to promote open data linked to higher education in Spain in a harmonised way. To date, five universities have joined the project: Universidad Autónoma de Madrid, Universidad Complutense de Madrid, Universidad Rey Juan Carlos, Universidad de Valladolid and Universidad Carlos III de Madrid (UC3M).

Within the framework of the project, the "Common Core" specification has been developed to answer two questions that universities ask themselves when opening their data: which datasets should I publish, and how should I publish them (that is, with which fields, granularity, formats, encodings, frequency, etc.)? The Common Core has been drawn up in accordance with the Law on Transparency, Access to Public Information and Good Governance. Two university transparency rankings were also considered in its development (that of the Fundación Compromiso y Transparencia and that of Dyntra), as well as the document "Towards an Open University: Recommendations for the S.U.E." by the Conference of Rectors of Spanish Universities (CRUE).

All these initiatives show how data harmonisation can improve the usefulness of data. If we have unified data, its reuse will be easier, as the time and cost of its analysis and management will be reduced.


Content prepared by the datos.gob.es team.