Blog

Open data portals are seeing significant growth in the number of datasets published in the transport and mobility category. For example, the EU's open data portal already has almost 48,000 datasets in the transport category, and Spain's own portal, datos.gob.es, has around 2,000 if we include those in the public sector category. One of the main reasons for the growth in the publication of transport-related data is the existence of three directives that aim to maximise the re-use of datasets in the area. The PSI directive on the re-use of public sector information, in combination with the INSPIRE directive on spatial information infrastructure and the ITS directive on the implementation of intelligent transport systems, together with other legislative developments, makes it increasingly difficult to justify keeping transport and mobility data closed.

In Spain, Law 37/2007, as amended in November 2021, extends the obligation to publish open data to commercial companies belonging to the institutional public sector that act as airlines. This goes a step further than the more frequent obligations with regard to data on public passenger transport services by rail and road.

In addition, open data is at the heart of smart, connected and environmentally friendly mobility strategies, both in the case of the Spanish "es.movilidad" strategy and in the case of the sustainable mobility strategy proposed by the European Commission. In both cases, open data has been introduced as one of the key innovation vectors in the digital transformation of the sector to contribute to the achievement of the objectives of improving the quality of life of citizens and protecting the environment.

However, much less is said about the importance and necessity of open data during the research phase, which then leads to the innovations we all enjoy. In this stage, researchers work to better understand the transport and mobility dynamics of which we are all a part, and open data plays a fundamental role in that work. Without it, neither relevant innovations nor well-informed public policies would be possible. We will therefore review two very relevant initiatives in which coordinated multinational efforts are being made in the field of mobility and transport research.

The information and monitoring system for transport research and innovation

At the European level, the EU also strongly supports research and innovation in transport, aware that it needs to adapt to global realities such as climate change and digitalisation. The Strategic Transport Research and Innovation Agenda (STRIA) describes what the EU is doing to accelerate the research and innovation needed to radically change transport by supporting priorities such as electrification, connected and automated transport or smart mobility.

In this sense, the Transport Research and Innovation Monitoring and Information System (TRIMIS) is the tool maintained by the European Commission to provide open access information on research and innovation (R&I) in transport and was launched with the mission to support the formulation of public policies in the field of transport and mobility.

TRIMIS maintains an up-to-date dashboard to visualise data on transport research and innovation and provides an overview and detailed data on the funding and organisations involved in this research. The information can be filtered by the seven STRIA priorities and also includes data on the innovation capacity of the transport sector.

If we look at the geographical distribution of research funds provided by TRIMIS, we see that Spain appears in fifth place, far behind Germany and France. The transport systems in which the greatest effort is being made are road and air transport, beneficiaries of more than half of the total effort.

 
Graph showing the geographical distribution of research funds provided by TRIMIS. The top positions are occupied by: Germany, France, Italy, United Kingdom, Spain, Netherlands and Belgium.

However, in the strategic area of Smart Mobility and Services (SMO), which is evaluated in terms of its contribution to the overall sustainability of the energy and transport system, Spain is leading the research effort at the same level as Germany. It should also be noted that the effort being made in Spain in terms of multimodal transport is higher than in other countries.

Graph showing the distribution of Smart Mobility and Services (SMO) funding. Germany is in first place, closely followed by Spain. This is followed by Italy, France, the United Kingdom, Belgium and the Netherlands.

One example of the research effort being carried out in Spain is the pilot dataset published by the General Directorate of Traffic to implement semantic capabilities on safety-related traffic incident information. It covers the Spanish state road network, except for the Basque Country and Catalonia, and uses an ontology for representing traffic incidents developed by the University of Valencia.

The area of intelligent mobility systems and services aims to contribute to the decarbonisation of the European transport sector. Its main priorities include the development of systems that connect urban and rural mobility services and promote modal shift, sustainable land use, travel demand sufficiency and active and light travel modes; the development of mobility data management solutions and public digital infrastructure with fair access; and the implementation of intermodality, interoperability and sectoral coupling.

The 100 mobility questions initiative

The 100 Questions Initiative, launched by The GovLab in collaboration with Schmidt Futures, aims to identify the world's 100 most important questions in a number of domains critical to the future of humanity, such as gender, migration or air quality.

One of these domains is dedicated precisely to transport and urban mobility and aims to identify questions where data and data science have great potential to provide answers that will help drive major advances in knowledge and innovation on the most important public dilemmas and the most serious problems that need to be solved.

In accordance with the methodology used, the initiative completed the fourth stage on 28 July, in which the general public voted to decide on the final 10 questions to be addressed. The initial 48 questions were proposed by a group of mobility experts and data scientists and are designed to be data-driven and planned to have a transformative impact on urban mobility policies if they can be solved.

In the next stage, the GovLab working group will identify which datasets could provide answers to the selected questions, some as complex as "where do commuters want to go but really can't and what are the reasons why they can't reach their destination easily?" or "how can we incentivise people to make trips by sustainable modes, such as walking, cycling and/or public transport, rather than personal motor vehicles?"

Other questions relate to the difficulties encountered by reusers and have been frequently highlighted in research articles such as "Open Transport Data for maximising reuse in multimodal route": "How can transport/mobility data collected with devices such as smartphones be shared and made available to researchers, urban planners and policy makers?"

In some cases it is foreseeable that the datasets needed to answer the questions may not be available or may belong to private companies, so an attempt will also be made to define what new datasets should be generated to help fill the gaps identified. The ultimate goal is to provide a clear definition of the data requirements to answer the questions and to facilitate the formation of data collaborations that will contribute to progress towards these answers.

Ultimately, changes in the way we use transport and lifestyles, such as the use of smartphones, mobile web applications and social media, together with the trend towards renting rather than owning a particular mode of transport, have opened up new avenues towards sustainable mobility and enormous possibilities in the analysis and research of the data captured by these applications.

Global initiatives to coordinate research efforts are therefore essential as cities need solid knowledge bases to draw on for effective policy decisions on urban development, clean transport, equal access to economic opportunities and quality of life in urban centres. We must not forget that all this knowledge is also key to proper prioritisation so that we can make the best use of the scarce public resources that are usually available to meet the challenges.


Content written by Jose Luis Marín, Senior Consultant in Data, Strategy, Innovation & Digitalization.

The contents and views reflected in this publication are the sole responsibility of the author.

Documentation

This report, published by the European Data Portal, explores so-called Citizen Generated Data (CGD), that is, data generated by citizens. This type of data is scarce in European open data portals, mainly because public administrations rarely publish and manage CGD.

The document analyses various open data portals, with the main objective of providing an overview of the CGD that could be part of these portals and of how public administrations can include it. During the analysis, a framework is established for the description, reference and characterisation of CGD.

Finally, based on the conclusions of the previous analysis, the document offers a series of recommendations and guidelines for data publishers. The objective is to increase and improve the presence of CGDs in the publication of open data, involving citizens in the design of policy, processes and governance. 

This report is available at the following link: "Data.europa.eu and citizen-generated data" 

 

 

News

The Data Spaces Business Alliance (DSBA) was created in September 2021 as a collaboration between four major organisations with much to contribute to the data economy: the Big Data Value Association (BDVA), FIWARE, Gaia-X and the International Data Spaces Association (IDSA). Its goal is to drive the adoption of data spaces across Europe by leveraging synergies.

How does the DSBA work?

The DSBA brings together diverse actors to realise a data-driven future, where public and private organisations can share data and thus unlock its full value, ensuring sovereignty, interoperability, security and reliability. To achieve this goal, DSBA offers support to organisations, as well as tools, resources and expertise. For example, it is working on the development of a common framework of technology agnostic blocks that are reusable across different domains to ensure the interoperability of different data spaces.

The four founding organisations, BDVA, FIWARE, Gaia-X and IDSA, have a number of international networks of national or regional hubs, with more than 90 initiatives in 34 countries. These initiatives, although very heterogeneous in focus, legal form, level of maturity, etc., have commonalities and great potential to collaborate, complement each other and create impact. Moreover, by operating at local, regional and/or national level, these initiatives provide regular feedback to European associations on the different regional policies, cultures and entrepreneurial ecosystems within the EU.

In addition, DSBA's application has been successful in the European Commission's call for the creation of a Support Centre, which will promote and coordinate actions related to sectoral data spaces. This centre will make available technologies, processes, standards and tools to support the deployment of common data spaces, thus enabling the re-use of data across sectors.

The DSBA hubs

The DSBA hubs refer to the global network combining the existing BDVA, FIWARE, Gaia-X and IDSA initiatives, as shown in the figure below.

Map showing the different organisations that are part of the DSBA

The main characteristics of each of these groups are as follows:

BDVA i-Spaces

BDVA i-Spaces are cross-sector and cross-organisational data incubators and innovation hubs, aimed at accelerating data-driven innovation and artificial intelligence in the public and private sectors. They provide secure experimentation environments, bringing together all the technical and non-technical aspects necessary for organisations, especially SMEs, to rapidly test, pilot and exploit their services, products and applications.

i-Spaces offer access to data sources, data management tools and artificial intelligence technologies, among others. They host closed and open data from corporate and public sources, such as language resources, geospatial data, health data, economic statistics, transport data, weather data, etc. The i-spaces have their own Big Data infrastructure with ad hoc processing power, online storage and state-of-the-art accelerators, all within European borders.

To become an i-Space, organisations must go through an assessment process, using a system of 5 categories, which are ranked according to gold, silver and bronze levels. These hubs must renew their labels every two years, and these certifications allow them to join a pan-European federation to foster cross-border data innovation through the EUHubs4Data project.

FIWARE iHubs

FIWARE is an open software community promoted by the ICT industry, which - with the support of the European Commission - provides tools and an innovation ecosystem for entrepreneurs to create new Smart applications and services. FIWARE iHubs are innovation hubs focused on creating communities and collaborative environments that drive the advancement of digital businesses in this area. These centres provide private companies, public administrations, academic institutions and developers with access to knowledge and a worldwide network of suppliers and integrators of this technology, which has also been endorsed by international standardisation bodies.

There are 5 types of iHubs:

  • iHub School: An environment focused on learning FIWARE, from a business and technical perspective, taking advantage of practical use cases.
  • iHub Lab: Laboratory where you can run tests and pilots, as well as obtain FIWARE certifications.
  • iHub Business Mentor: Space to learn how to build a viable business model.
  • iHub Community Creator: Physical meeting point for the local community to bring together all stakeholders, acting as a gateway to the local and global FIWARE ecosystem.

Gaia-X Hubs

The Gaia-X Hubs are the national contact points for the Gaia-X initiative. It should be noted that they are not as such part of Gaia-X AISBL (the European non-profit association), but act as independent think tanks, which cooperate with the association in project deployment, communication tasks, and generation of business requirements for the definition of the architecture of the initiative (as the hubs are close to the industrial projects in each country).

Through them, specific data spaces are developed based on national needs, as well as the identification of funding opportunities to implement Gaia-X services and technology. They also seek to interact with other regions to build transnational data spaces, facilitating the exchange of information and the scaling up of national use cases internationally. To this end, the AISBL provides access to a collaborative platform, as well as support to the respective hubs in the distribution and communication of the use cases.

IDSA Hubs

The IDSA Hubs enable the exchange of knowledge around the reference architecture (known as the IDS-RAM) at country level. By bringing together research organisations, innovation promotion organisations, non-profit organisations, and companies that use IDS concepts and standards in the region, they seek to foster their adoption, and thus promote a sovereign data economy with greater capillarity.

These centres are driven in each country by a university, research organisation, or non-profit entity, working with IDSA to raise awareness of data sovereignty, transfer knowledge, recruit new members, and disseminate IDS-RAM-based use cases. To this end, they develop activities ranging from training sessions to meetings with decision-makers from different public administrations. They also promote and coordinate research and development projects with international organisations and companies, as well as with governments and other public entities.

Conclusion

As we said at the beginning, there is a great potential for synergies between these groups, which should be explored, discussed and articulated in concrete actions and projects. We are facing a promising opportunity to join forces and make further progress in the development and expansion of data spaces, in order to generate a significant impact on the Data Economy.

To stimulate the initial debate, the Data Spaces Business Alliance has prepared the document "Data Spaces Business Alliance Hubs: potential for synergies and impact", which explores the situation described above.

News

The European Directive 2019/1024 on open data and re-use of public sector information emphasises, among many other aspects, the importance of publishing data in real time. In fact, the document talks about dynamic data, which it defines as "documents in digital format, subject to frequent or real-time updates due to their volatility or rapid obsolescence". According to the Directive, public bodies must make this data available for re-use by citizens immediately after collection, through appropriate APIs and, where possible, as a bulk download.

To explore this further, the European Data Portal, data.europa.eu, has published the report "Real-time data 2022: Approaches to integrating real-time data sources in data.europa.eu", which analyses the potential of real-time data. It draws on the results of a webinar held by data.europa.eu on 5 April 2022, a recording of which is available on its website.

In addition to detailing the conclusions of the event, the report briefly summarises the information and technologies presented there that are useful for real-time data sharing.

The importance of real-time data

The report begins by explaining what real-time data are: data that are frequently updated and delivered immediately after collection, as mentioned above. These data can be of a very heterogeneous nature. The following table gives some examples:

Real-time data examples:

  • Stationary measurements: e.g. time series.
  • Tracking data: e.g. tracking of parcels or cars.
  • Data measured along trajectories: e.g. floating car data.
  • Images: e.g. video streams from cameras, radar data.

Source: Report "Real-time data 2022: Approaches to integrating real-time data sources in data.europa.eu", data.europa.eu (2022)

This type of data is widely used to shape applications that report traffic, energy prices, weather forecasts or flows of people in certain spaces. You can find out more about the value of real-time data in this other article.

Real-time data sharing standards

Interoperability is one of the most important factors to consider when selecting the most suitable technology for real-time data exchange. A common language is needed, that is, common data formats and data access interfaces that allow real-time data to flow. Two standards that are already widely used in the Internet of Things (IoT) field and that can help in this regard are:

SensorThings API (STA)

SensorThings API, from the Open Geospatial Consortium, emerged in 2016 and has been considered a best practice for data sharing in compliance with the INSPIRE Directive.

This standard provides an open and unified framework for encoding and providing access to sensor-generated data streams. It is based on REST and JSON specifications and follows the principles of the OData (OASIS Open Data Protocol) standard.

STA provides common functionalities for creating, reading, updating and deleting sensor resources. It enables the formulation of complex queries tailored to the underlying data model, allowing more direct access to the specific data the user needs. Query options include filtering by time period, observed parameters or resource properties to reduce the volume of data downloaded. It also allows sorting the content of a result by user-specified criteria and provides direct integration with the MQTT standard, which is explained below.
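As a rough illustration of the query options just described, the following Python sketch builds a request URL that filters observations by time period, sorts them and limits the volume returned. The service root `https://example.org/sta/v1.1`, the datastream identifier and the helper name `build_observations_url` are assumptions made for this example; only the `$filter`, `$orderby`, `$top` and `$select` options come from the standard's OData-style query syntax.

```python
from urllib.parse import urlencode, quote

# Hypothetical SensorThings API service root; replace with a real endpoint.
BASE_URL = "https://example.org/sta/v1.1"


def build_observations_url(datastream_id, start, end, top=100):
    """Build a URL for the observations of one datastream in a time window."""
    params = {
        # Keep only observations whose phenomenonTime falls in [start, end).
        "$filter": f"phenomenonTime ge {start} and phenomenonTime lt {end}",
        # Sort the result: newest observations first.
        "$orderby": "phenomenonTime desc",
        # Limit the page size to reduce the volume of data downloaded.
        "$top": str(top),
        # Return only the properties that are needed.
        "$select": "phenomenonTime,result",
    }
    # Percent-encode spaces and colons, but leave "$" and "," readable.
    query = urlencode(params, quote_via=quote, safe="$,")
    return f"{BASE_URL}/Datastreams({datastream_id})/Observations?{query}"


url = build_observations_url(42, "2022-04-05T00:00:00Z", "2022-04-06T00:00:00Z")
print(url)
```

The resulting URL could then be fetched with any HTTP client; the sketch stops at URL construction so that it does not depend on a live service.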

Message Queuing Telemetry Transport (MQTT)

MQTT was invented by Dr. Andy Stanford-Clark of IBM and Arlen Nipper of Arcom (now Eurotech) in 1999. Like OData, whose principles STA follows, it is an OASIS standard.

The MQTT protocol allows the exchange of messages according to the publish/subscribe principle. The central element of MQTT is the use of brokers, which take incoming messages from publishers and distribute them to all users who have a subscription for that type of data. In this type of environment, data is organised by topics, which are freely defined and allow messages to be grouped into thematic channels to which users subscribe.
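The publish/subscribe principle can be sketched with a toy, in-memory broker in Python. This is not an MQTT implementation: there is no network, no quality-of-service level and no retained messages, and the class `ToyBroker`, the function `topic_matches` and the topic names are hypothetical. It only illustrates how a broker routes messages from publishers to subscribers by topic, including the `+` (single-level) and `#` (multi-level) wildcards that MQTT topic filters support.

```python
from collections import defaultdict


def topic_matches(topic_filter, topic):
    """Return True if an MQTT-style topic filter matches a concrete topic.

    '+' matches exactly one level; '#' matches all remaining levels.
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


class ToyBroker:
    """In-memory stand-in for a broker: it only routes messages by topic."""

    def __init__(self):
        # Maps each topic filter to the callbacks subscribed to it.
        self._subscriptions = defaultdict(list)

    def subscribe(self, topic_filter, callback):
        self._subscriptions[topic_filter].append(callback)

    def publish(self, topic, payload):
        # Deliver the message to every subscriber whose filter matches.
        for topic_filter, callbacks in self._subscriptions.items():
            if topic_matches(topic_filter, topic):
                for callback in callbacks:
                    callback(topic, payload)


broker = ToyBroker()
received = []
broker.subscribe("city/+/temperature", lambda t, p: received.append((t, p)))
broker.publish("city/madrid/temperature", "21.5")  # delivered
broker.publish("city/madrid/humidity", "40")       # no matching subscription
print(received)  # [('city/madrid/temperature', '21.5')]
```

Publishers and subscribers never reference each other, only the topic, which is what lets data providers add new consumers without changing how they publish.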

The advantages of this system include reduced latency, simplicity and agility, which facilitates its implementation and use in constrained environments (e.g. with limited bandwidth or connectivity).

In the case of the European portal, users can already find real-time datasets based on MQTT. However, there is not yet a common approach to providing metadata on brokers and the topics they offer, and work is still ongoing.

Other conclusions of the report

As mentioned at the beginning, the webinar on 5 April also served to gather participants' views on the use of real-time data, current challenges in data availability and needs for future improvements. These views are also reflected in this report.

Among the most valued categories of real-time data, users highlighted traffic information and weather data. Data on air pollution, allergens, flood monitoring and stock market information were also mentioned. In this respect, users requested more data, and more detailed data, especially in the fields of mobility and energy, in order to be able to compare commodity prices. Users also highlighted some drawbacks in locating real-time data on the European portal, including the heterogeneity of the information, which requires the use of common standards and formats across countries.

Finally, the report provides a set of recommendations on how to improve the ability to locate real-time data sources through data.europa.eu. To this end, a series of short and medium-term actions have been established, including the collection of use cases, support for data providers and the development of best practices to unify metadata.

You can read the full report here.

Blog

Nowadays we can find a great deal of legislative information on the web. Countries, regions and municipalities make their regulatory and legal texts public through various spaces and official bulletins. This information can be of great use in driving improvements in the sector: from facilitating the location of legal information to the development of chatbots capable of resolving citizens' legal queries.

However, locating, accessing and reusing these documents is often complex, due to differences in legal systems, languages and the different technical systems used to store and manage the data.

To address this challenge, the European Union has a standard for identifying and describing legislation called the European Legislation Identifier (ELI).

What is the European Legislation Identifier?

The ELI emerged in 2012 through Council Conclusions (2012/C 325/02) in which the European Union invited Member States to adopt a standard for the identification and description of legal documents. This initiative has been further developed and enriched by new conclusions published in 2017 (2017/C 441/05) and 2019 (2019/C 360/01).

The ELI, which is based on a voluntary agreement between EU countries, aims to facilitate access, sharing and interconnection of legal information published in national, European and global systems. This facilitates their availability as open datasets, fostering their re-use.

Specifically, the ELI makes it possible to:

  • Identify legislative documents, such as regulations or legal resources, by means of a unique identifier (URI) that is understandable by both humans and machines.
  • Define the characteristics of each document through automatically processable metadata. To this end, it uses vocabularies defined by means of ontologies agreed and recommended for each field.

Thanks to this, a series of advantages are achieved:

  • It provides higher quality and reliability.
  • It increases efficiency in information flows, reducing time and saving costs.
  • It optimises and speeds up access to legislation from different legal systems by providing information in a uniform manner.
  • It improves the interoperability of legal systems, facilitating cooperation between countries.
  • It facilitates the re-use of legal data as a basis for new value-added services and products that improve the efficiency of the sector.
  • It boosts transparency and accountability of Member States.

Implementation of the ELI in Spain

The ELI is a flexible system that must be adapted to the peculiarities of each territory. In the case of the Spanish legal system, there are various legal and technical aspects that condition its implementation.

One of the main conditioning factors is the plurality of issuers, with regulations at national, regional and local level, each of which has its own means of official publication. In addition, each body publishes documents in the formats it considers appropriate (pdf, html, xml, etc.) and with different metadata. To this must be added linguistic plurality, whereby each bulletin is published in the official languages concerned.

It was therefore agreed that the implementation of the ELI would be carried out in a coordinated manner by all administrations, within the framework of the Sectoral Commission for e-Government (CSAE), in two phases:

  • Due to the complexity of local regulations, in the first phase, it was decided to address only the technical specification applicable to the State and the Autonomous Communities, by agreement of the CSAE of 13 March 2018.
  • In February 2022, a new version was drafted to include local regulations in its application.

With this new specification, the common guidelines for the implementation of the ELI in the Spanish context are established, but respecting the particularities of each body. In other words, it only includes the minimum elements necessary to guarantee the interoperability of the legal information published at all levels of administration, but each body is still allowed to maintain its own official journals, databases, internal processes, etc.

With regard to the temporal scope, bodies have to apply these specifications in the following way:

  • State regulations: apply to those published from 29/12/1978, as well as those published before if they have a consolidated version.
  • Autonomous Community legislation: applies to legislation published on or after 29/12/1978.
  • Local regulations: each entity may apply its own criteria.
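The temporal scope rules above can be expressed as a small decision function. The following Python sketch is only an illustration of the three cases as stated in the text; the function name `eli_applies` and the level labels (`"state"`, `"autonomous_community"`, `"local"`) are hypothetical and do not come from the specification.

```python
from datetime import date

# Cut-off date given in the text (29/12/1978).
CUTOFF = date(1978, 12, 29)


def eli_applies(level, published, has_consolidated_version=False):
    """Apply the temporal scope rules to a single regulation.

    Returns True or False for state and Autonomous Community regulations,
    and None for local regulations, where each entity sets its own criteria.
    """
    if level == "state":
        # From the cut-off date, or earlier if a consolidated version exists.
        return published >= CUTOFF or has_consolidated_version
    if level == "autonomous_community":
        # Only legislation published on or after the cut-off date.
        return published >= CUTOFF
    if level == "local":
        # No common rule: the decision is left to each local entity.
        return None
    raise ValueError(f"unknown administrative level: {level!r}")


print(eli_applies("state", date(1970, 1, 1), has_consolidated_version=True))
print(eli_applies("autonomous_community", date(1970, 1, 1)))
print(eli_applies("local", date(2020, 1, 1)))
```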

How to implement the ELI?

The website https://www.elidata.es/ offers technical resources for the application of the identifier. It explains the contextual model and provides different templates to facilitate its implementation.

It also offers the list of common minimum metadata, among other resources.

In addition, to facilitate national coordination and the sharing of experiences, information on the implementation carried out by the different administrations can also be found on the website.

The ELI is already applied, for example, in the Official State Gazette (BOE). From its website it is possible to access all the regulations in the BOE identified with ELI, distinguishing between state and autonomous community regulations. If we take as a reference a regulation such as Royal Decree-Law 24/2021, which transposed several European directives (including the one on open data and reuse of public sector information), we can see that it includes an ELI permalink.
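To give an idea of how such an identifier can be handled by machines, the sketch below builds and parses an ELI-style URI in Python. It assumes a URI pattern of the form `/eli/{jurisdiction}/{type}/{year}/{month}/{day}/{number}`; this pattern, the class `EliReference`, the function `parse_eli` and the component values used for Royal Decree-Law 24/2021 (`es`, `rdl`, 2 November 2021) are assumptions for illustration and should be checked against the technical specification and the permalink shown by the BOE.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EliReference:
    """Components of an ELI-style identifier (assumed pattern, see lead-in)."""

    jurisdiction: str  # e.g. "es" for state regulations (assumption)
    doc_type: str      # e.g. "rdl" for a Royal Decree-Law (assumption)
    year: int
    month: int
    day: int
    number: str

    def uri(self, base="https://www.boe.es"):
        """Build the URI from its components, zero-padding the date parts."""
        return (
            f"{base}/eli/{self.jurisdiction}/{self.doc_type}/"
            f"{self.year:04d}/{self.month:02d}/{self.day:02d}/{self.number}"
        )


def parse_eli(uri):
    """Split an ELI-style URI back into its first six components."""
    path = uri.split("/eli/", 1)[1]
    parts = path.strip("/").split("/")
    jurisdiction, doc_type, year, month, day, number = parts[:6]
    return EliReference(
        jurisdiction, doc_type, int(year), int(month), int(day), number
    )


# Illustrative values for Royal Decree-Law 24/2021 (to be verified).
reference = EliReference("es", "rdl", 2021, 11, 2, "24")
print(reference.uri())
print(parse_eli(reference.uri()) == reference)
```

Because the identifier encodes the issuer, type, date and number in a fixed order, the same few lines can round-trip between a human-readable link and structured metadata.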

In short, we are faced with a very useful common mechanism to facilitate the interoperability of legal information, which can promote its reuse not only at a national level, but also at a European level, favouring the creation of the European Union's area of freedom, security and justice.


Content prepared by the datos.gob.es team.

News

Since 2014, the European Commission has been monitoring Member States' digital progress through the annual Digital Economy and Society Index (DESI). To do so, it analyses four digital performance indicators: human capital, connectivity, digital technology integration and digital public services.

In this year's edition, Spain is in seventh position, improving two places compared to 2021. It has gone from a score of 57.4% to 60.8%, which represents a growth of almost 6% (the EU average has grown by 3% in the same period). This puts Spain ahead of countries such as Germany, France and Italy. At the head of the EU-27 we find Finland, Denmark and the Netherlands.
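The "almost 6%" figure is a relative growth rate, not a difference in percentage points. A minimal Python check, using only the two scores quoted above, makes the distinction explicit:

```python
score_2021 = 57.4  # Spain's DESI score in 2021, as quoted in the text
score_2022 = 60.8  # Spain's DESI score in 2022, as quoted in the text

# Absolute change, in percentage points.
point_change = score_2022 - score_2021

# Relative growth with respect to the 2021 score, in percent.
relative_growth = point_change / score_2021 * 100

print(f"{point_change:.1f} points, {relative_growth:.1f}% relative growth")
# 3.4 points, 5.9% relative growth
```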

Graph showing the position of the different countries in the ranking. The top positions are occupied by: Finland, Denmark, Netherlands, Sweden, Ireland, Malta and Spain.

It should be noted that the DESI 2022 index is based mainly on data from 2021. Overall, during the COVID-19 pandemic, Member States have made progress in their digitisation efforts, thanks in part to the opportunity provided by the resources allocated by Europe through the NextGenerationEU recovery plan. However, there are still general challenges, related to digital skills gaps, the digital transformation of SMEs and the deployment of advanced 5G networks.

Digital progress in Spain

Spain is above the EU average in all four categories analysed:

Graph showing how Spain ranks above the EU average in all four indicators

  • Human capital. Spain improves two positions with respect to 2021 and ranks tenth. It stands out mainly in basic digital skills, while it is only below the EU average in the proportion of information and communication technology (ICT) specialists and graduates. The report highlights that several of the measures outlined in the National Recovery and Resilience Plan aim to boost the acquisition of digital skills, especially for SME employees.
  • Connectivity. Spain is one of the EU leaders in terms of connectivity, where it ranks third for the second year in a row. Our country performs particularly well in very high capacity fixed network coverage (94% compared to 70% of the European average), although it still has room for improvement in 5G coverage. In this regard, strategic reforms and investments are being carried out under the National Recovery and Resilience Plan in order to achieve the Digital Decade connectivity targets and reduce the digital divide between urban and rural areas.

  • Digital technology integration. This is the area where most progress has been made, with an improvement of five positions. Spain is currently in eleventh place. It stands out especially in the percentage of SMEs with a basic level of digital intensity and which use social networks, online sales media and electronic information exchange systems. In the use of artificial intelligence, we are at the European average. On the other hand, technologies such as cloud and Big Data analysis are still not widespread. To improve these capabilities, professionals with digital skills are needed, something that the SME Digitalisation Plan 2021-2025 will help to boost.

  • Digital public services. Spain, which has traditionally been a pioneer in this field, is in fifth place, two places higher than in 2021. One of the areas where it performs best is open data, where it is in third place, well above the European average (95% vs. 81%). In addition, the report highlights how our country is proactively developing new services to respond to the needs of citizens in areas such as health, digital identification, cybersecurity, mobile applications and the integration of AI in the sector. Some examples of projects in which Spain is participating are Genome of Europe and the European Self-Sovereign Identity Framework (ESSIF).

If you would like to go deeper into the analysis of the results of Spain and the other European countries in the DESI index, you can download the reports by country on this website. In addition, the Spanish e-Government portal provides users with various useful materials, divided by year.

Women in Digital (WiD) Scoreboard

Together with the DESI index, the EU has also published the 2022 edition of the "Women in Digital (WiD) Scoreboard", a report that assesses the digital development of women and their inclusion in areas such as employment and digital entrepreneurship.

In this ranking, Spain is in eighth position, also exceeding the European Union average (64.2% compared to 54.9%). Spanish women stand out especially in Internet use skills, where they rank fourth among European women.

All these data show how Spain continues to make progress in digital matters. Although there are still areas for improvement, investment from Spain's Recovery and Resilience Plan is expected to continue to drive progress, mainly in areas such as the digitisation of businesses, strengthening the digital skills of the population, improving digital connectivity and the digitisation of public administrations. All of this without neglecting support for digital-related research and development (R&D).

Blog

Since the publication of Directive (EU) 2019/1024 on open data and re-use of public sector information, the European Commission has been undertaking a number of actions to develop the concept of high-value data, which this directive introduced as an important novelty in June 2019.

We recall that high-value datasets are defined in this directive as "documents whose re-use is associated with considerable benefits for society, the environment and the economy, in particular because of their suitability for the creation of value-added services, applications and new, decent and quality jobs". The Directive further proposes a first list of six thematic categories of high-value datasets: geospatial, earth observation and environment, meteorology, statistics, corporate and company ownership, and mobility.

In the last three years, numerous initiatives have been launched with the aim of furthering the release of this type of dataset and moving towards realising the economic and social benefits derived from its re-use. Studies have been launched, such as the "Impact Assessment study on the list of High Value Datasets" by the Commission's DG CONNECT, which presents different options identified for policy-level interventions linked to high-value datasets in the six thematic areas. Another example is the report "High-value datasets: understanding the perspective of data providers", published by the official European data portal, which aims to understand the perspective of data providers and contains interesting conclusions, such as that the providers' perspective alone is not sufficient to understand where the "high value" actually lies.

A public consultation was also launched in 2022 to gather public opinion on the Commission's draft High-Value Data Act. This draft act already contains a list of specific high-value datasets and provisions for their publication and re-use, which will represent a very significant advance on the objectives of the directive itself. At the end of June, the draft act was also presented to the Committee on Open Data and Re-use of Public Sector Information, composed of representatives from EU countries, and further progress is expected in September 2022.

For their part, Member States are also carrying out their own work in parallel, as in the case of Spain, which has already started by dedicating the 2019 Aporta Meeting to the promotion of high-value data.

However, the EU's focus on high-value data as a driver of the economy is not unique in the world and there are other initiatives with different degrees of progress and impact that have similar objectives.

Datasets of national interest in Australia

In the case of Australia, a pioneer in this regard, the first National Action Plan of the Australian Open Government Partnership 2016-2018 already contained among its objectives the implementation of actions to develop and publish a framework for high-value datasets and to design how best to facilitate the sharing and use of these datasets through the legislative consultation process.

In 2018, the Productivity Commission recommended recognising a new type of data asset, national interest datasets, defined as datasets that would generate significant benefits for society and would be a special subset of high-value data. At the time, the Australian government committed to appointing a National Data Commissioner to implement, oversee and regulate a simpler and more efficient data sharing and publishing framework.

However, in 2019 the end-of-term self-assessment report for Australia's first national open government action plan 2016-18 already acknowledged the delay in the initiative. Work was resumed by the National Data Commissioner, who, building on previous work, continues to conceptualise a framework for identifying high-value data, although no documentation has been released to the general public at this stage.

Aligning open data in Canada

The Canadian Open Government Working Group (COGWG) already started in its 2016-2018 action plan to work on its commitment to align datasets across the country and specifically on the development of a list of priority high-value datasets for collaborative publication across jurisdictions. The plan recognised that publishing common types of data across Canadian jurisdictions would help foster innovation and provide significant socio-economic impact.

In 2018, Canada's Open Government Working Group released an initial list of 17 high-value datasets to be prioritised for publication by federal, provincial, territorial and municipal governments across Canada. This list is part of a report providing common criteria to help identify high-value datasets and is based on work done to unify criteria across levels of government, stakeholder surveys and international standards.

The National Open Government Action Plan 2018-2020 includes a commitment to carry out a pilot project to standardise across jurisdictions five high-value datasets from the list identified in the previous plan.

Although the results have not been openly published, the plan's evaluation system acknowledges the delay in meeting this objective as preliminary standards could only be completed for 4 of the 5 high-value datasets. These standards are available through an intranet system to all Canadian public servants (federal, provincial, territorial and municipal), academics and students, as well as to all Canadians by invitation. However, none of the work has been made public nor is it known what datasets they are working on.

India begins work on identifying datasets

More recently, in 2022, the Indian government published a background note on data accessibility and usage policy in India, announcing the development of new policies to improve data access, quality and usage, in line with the technological needs of the next decade.

As with other initiatives in other regions of the world, it recognises the lack of common criteria for consistently identifying and maintaining high-value datasets. It therefore envisages developing a data policy framework that makes data from multiple sources (public and private) accessible through G2G, G2B, B2G and B2B channels.

The objective is also similar to that of other initiatives: on the one hand, to make public services more efficient and, on the other, to enable a new generation of start-ups to drive digital innovation and growth in the Indian economy.

The approaches being followed in different regions of the world to identify and release high-value datasets are very similar and include public consultations, the formation of expert committees, pilot projects and the definition of assessment frameworks. However, we see that development is much slower than expected and that some initiatives, such as those started in Canada or Australia before the EU itself, have not yet been finalised and therefore their impact is not yet known.

For the time being, it seems that the work initiated by the EU is more advanced and, more importantly, more transparent, as the results are being published openly. Let us hope that the initiative does not lose momentum as seems to have happened in Australia or Canada and that we will soon be able to enjoy high-value datasets available for re-use and discuss the impact they have had on society and the European economy.


Content written by Jose Luis Marín, Senior Consultant in Data, Strategy, Innovation & Digitalization.

The contents and views reflected in this publication are the sole responsibility of the author.

News

The Tourism Data Space event took place on 9 June, organised by Gaia-X, the European private sector initiative for the creation of an open, federated and interoperable data infrastructure to drive the Data Economy while respecting digital sovereignty. During the event, which was held online, international experts from the public and private sector discussed "How can data spaces contribute to the development of tourism in Europe through citizen-centric offerings?". The event was a success with more than 250 attendees from 21 countries.

The tourism sector has a strong economic weight in Europe, although it has been affected by the COVID-19 pandemic and the drop in international tourist arrivals, which exceeded 70% worldwide. In this context, Gaia-X and data spaces are positioned as a great opportunity for companies in the sector. Gaia-X aims to make European data available to improve the ability to attract tourists by creating more personalised offers, products and services, resulting in an enhanced experience tailored to customers' needs. It was with this premise in mind that the event kicked off, focusing on the requirements and need for a secure, decentralised and citizen-oriented European tourism data space.

The opening speech of the event was given by Carme Artigas, Secretary of State for Digitalisation and Artificial Intelligence of the Spanish Ministry of Economy and Digital Transformation, who highlighted the importance of the tourism sector: "Finally, we are giving the tourism the importance it deserves also in the data economy. At the EU level, the tourism sector directly contributes to almost 4% of GDP with 2.3 million businesses, majority of which are SMEs". This sector also employs 22.4% of the service sector workforce, as Francesco Bonfiglio, Director General of Gaia-X AISBL, commented: "This market is worth billions of euros, and is one of the areas with the greatest impact if we decide to invest in a common European data space".

Artigas also stressed that "Before the end of the year we will have a new digital space for tourism at European level, and this is great news", always respecting the basic principles of data sovereignty, privacy, security and interoperability.

Yvo Volman, Chief Data Officer at DG-CNECT (European Commission), explained that in order to achieve the set objectives, empowerment and data sharing also across sectors is essential. This is the only way to establish better services and promote sustainability. The importance of education was also stressed by Natalia Bayona, Director of Innovation, Education and Investment at the World Tourism Organisation (UNWTO): "Tourism is the main employer of women and young people. However, 50% of people working in tourism have only secondary skills. If we want to develop a high-level economic sector, we have to develop education". In her speech, she also focused on the need for a public-private relationship, with projects such as Gaia-X as a spearhead to drive innovation.

This was followed by several presentations focused on providing an overview of the landscape of the Gaia-X Tourism data space in Europe, with experts from different countries. From Spain, Ana Moniche, Senior Analyst at Turismo Andaluz and NECSTouR, and Cristina Núñez, Director of NECSTouR, spoke about European regional practices for competitive and sustainable tourism, highlighting how European data sharing is fundamental to developing strategies based on quality information. Data sharing also offers companies with fewer resources the possibility of accessing large amounts of data that they would not be able to obtain by their own means.

Dolores Ordóñez, Director of AnySolution and Vice President of the Spanish Gaia-X Hub, also spoke in this section. In her speech, she highlighted the need for collaboration between companies of different sizes and sectors, especially in four major areas: tourism, health, industry 4.0 and mobility. In the section dedicated to the pillars of tourism data spaces, among other speakers, Alberto Palomo, CDO of the Government of Spain, pointed out the importance of generating scalability in data sharing, as well as the creation of a common framework that shapes governance mechanisms that are useful and accepted by industry players. He also warned that the paradigm we are facing is that of an "innovative decentralised infrastructure", something that all participants must be clear about, because of the cultural change it implies.

The event closed with three thematic sessions, designed to create an atmosphere of cross-border collaboration and help create a sustainable data infrastructure for the tourism industry. These sessions focused on smart destinations, the tourism value chain and its technological enablers. More information about the event can be found in the video teaser. This event is part of a series of meetings organised by Gaia-X around data spaces. Two previous events focused on mobility and health. Gaia-X will continue to hold such activities in the coming months, as can be seen in its calendar. In addition, it has launched a magazine and a podcast series to keep readers up to date with the latest trends related to data spaces.

Blog

The Commission's drive to promote data spaces within the framework of a European Strategy is based on a firm commitment to a legal framework that provides regulatory coherence throughout the Union. In particular, the aim is to establish a solid regulation that offers legal certainty to a model based on respect for rights and freedoms. Thus, initially, two initiatives have been promoted: on the one hand, to establish the regulatory bases of the governance model, already definitively adopted by Regulation (EU) 2022/868 of 30 May, and, on the other hand, to establish harmonised rules on the access and fair use of data throughout the Union.

However, while recognising the importance of the design of this general legal architecture, the effective opening and exchange of data requires a more concrete approach that takes into account the specificities of each sectoral area and, in particular, the difficulties and challenges to be faced. Therefore, taking into account the general regulatory framework referred to above, the Commission has presented the first regulatory initiative for one of these areas, health data. The proposal is currently under public consultation and negotiation in the Council of the EU and in the European Parliament, and is part of the project to create a European Health Data Space.

In particular, beyond facilitating the development of cross-border e-services, the proposal aims to address a triple objective:

• Establish a uniform legal framework to facilitate the development, marketing and use of electronic health record systems by establishing a compulsory self-certification scheme for certain systems, which in any case provides for some exceptions, e.g. general purpose software used in healthcare environments.

• Facilitate patients' electronic access to their own data in the framework of healthcare provision (primary use of health data). In this respect, the proposal seeks to strengthen consistency across Member States in protecting health data irrespective of where the healthcare provision takes place or the type of entity carrying it out.

• Encourage the re-use of such data for other secondary purposes. To this end, a specific governance model is envisaged with a specific body at the head, the so-called European Health Data Space Board, and the deployment of duly coordinated state administrative structures, the health data access bodies.

We will look at this last point in more detail below.

The promotion of secondary uses

With regard to the re-use of data for purposes other than health care, the proposed regulation starts from the following observation: although health data are already being collected and processed using electronic means, in many cases access to them is not facilitated for other purposes of general interest. For this reason, the proposal intends to establish a broad regulation that facilitates secondary uses of health data. Examples include the production of statistics, training and research activities, technological innovation (including the training of algorithms) and personalised medicine.

However, some secondary uses are expressly declared incompatible, so that access to health data will be denied for them, such as:

• The adoption of decisions detrimental to natural persons, meaning not only those that produce legal effects but also those that significantly affect them. In this respect, changes relating to insurance contracts, such as an increase in the amounts to be paid, are specifically highlighted.

• The carrying out of advertising or marketing activities aimed at healthcare professionals, organisations in the sector or natural persons.

• Making data available to third parties that are not covered by the data permission granted.

• The development of harmful products and services, including in particular illicit drugs, alcoholic beverages, tobacco products or goods or services that contravene public order or morality.

With regard to the parties obliged to share data, in principle the proposed regulation extends to those who collect and process data with public funding, who must make them available to the competent bodies for access to health data in order to facilitate their re-use. However, given their importance in some States, the regulation also extends its scope of application to private parties providing health services - except in the case of micro-enterprises - and also to professional associations. Specifically, this regulation would affect "any natural or legal person, which is an entity or a body in the health or care sector, or performing research in relation to these sectors, as well as Union institutions, bodies, offices and agencies who has the right or obligation, in accordance with this Regulation, applicable Union law or national legislation implementing Union law, or in the case of non-personal data, through control of the technical design of a product and related services, the ability to make available, including to register, provide, restrict access or exchange certain data".

Purpose and conditions of access to health data

The proposed Regulation is based on a broad concept of health data, which includes the following categories: 

Data to be considered in the framework of the European Health Data Space: data provided by patients; data related to health effects (social data, environmental data, etc.); data generated by digital applications; data provided by health systems; data resulting from previous treatments (inferred through tests, automated, etc.). Source: Proposal for a Regulation (EU) on the European Health Data Space.

The regulation is based on a general rule: access to anonymised data as a measure to reduce privacy risks, although a specific regime is also envisaged for personal data. In this case, the request must include an adequate justification and the data will only be provided in pseudonymised form.

As regards the form of access, given the particular sensitivity of health data, the proposal requires that they be made available through a secure processing environment that complies with the technical and security standards included in the proposal. In particular, except for non-personal data, the proposal does not allow the data to be transmitted directly to the person who will re-use them. Furthermore, it provides for processing to take place in secure environments under the control of the access authorities.

Access authorities for health data

From the perspective of the governance model underpinning the proposal, States should have at least one health data access body to provide electronic access to health data for secondary purposes. In the case of multiple bodies due to requirements arising from their political-administrative organisation, one of them will have a coordinating role. Beyond the organisational freedom of the States to choose one or another organisational formula, it is essential that the independence of the coordinating body be guaranteed, without prejudice to the mechanisms of financial or judicial control.

As already indicated, the main purpose of this measure is to ensure a uniform and consistent application of the regulatory framework for access to health data for secondary purposes across the European Union, in particular as regards the protection of personal data in this sector. In this respect, it is proposed that these bodies should be given the powers to verify compliance with these rules and, in particular, to impose sanctions and other measures such as temporary or definitive exclusion from the European Health Data Space of those who do not comply with their obligations.

The harmonisation sought by the proposed Regulation is also envisaged in the establishment of a standardised process for the issuing of permissions to re-use data for secondary purposes. In particular, in cases where anonymised access to the data is not enough, reasons should be given as to why pseudonymised access is necessary. In the latter case, the request must specify the legal basis for requesting access to the data from the perspective of personal data protection law, the secondary purposes for which the data are intended to be re-used, as well as a description of the data and tools necessary for their processing.
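
As a purely illustrative aid, the checklist described in the previous paragraph can be written down as a small data structure. This is a minimal sketch in Python, not part of the proposal: the names `DataPermitRequest`, `AccessLevel` and `missing_fields` are hypothetical, and the check only covers the items the text lists for pseudonymised access (the paragraph does not spell out requirements for anonymised requests, so none are assumed here).

```python
from dataclasses import dataclass, field
from enum import Enum


class AccessLevel(Enum):
    ANONYMISED = "anonymised"        # general rule described in the text
    PSEUDONYMISED = "pseudonymised"  # exception that must be justified


@dataclass
class DataPermitRequest:
    """Hypothetical model of a request to re-use health data (illustration only)."""
    access_level: AccessLevel
    secondary_purposes: list[str] = field(default_factory=list)
    data_description: str = ""
    processing_tools: list[str] = field(default_factory=list)
    legal_basis: str = ""              # basis under personal data protection law
    pseudonymisation_reason: str = ""  # why anonymised access is not enough


def missing_fields(request: DataPermitRequest) -> list[str]:
    """Return the items still missing from a pseudonymised access request."""
    missing: list[str] = []
    if request.access_level is not AccessLevel.PSEUDONYMISED:
        # Assumption: no requirements are modelled for anonymised requests.
        return missing
    if not request.pseudonymisation_reason.strip():
        missing.append("pseudonymisation_reason")
    if not request.legal_basis.strip():
        missing.append("legal_basis")
    if not request.secondary_purposes:
        missing.append("secondary_purposes")
    if not request.data_description.strip():
        missing.append("data_description")
    if not request.processing_tools:
        missing.append("processing_tools")
    return missing
```

For example, a pseudonymised request that states its purposes, data and tools but omits the justification and the legal basis would be reported as incomplete on exactly those two items.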

Finally, the proposed regulation includes active disclosure obligations addressed to these bodies about the available datasets. This is an essential measure, since the existence of a catalogue of datasets at European level - based on the interconnection of national datasets - would be extremely useful for promoting not only research and innovation but also decision-making at regulatory and political level. Specifically, for each set of available data, the nature of the data, its source and the conditions for making it available will have to be indicated.

In short, this is certainly an innovative initiative to address the regulatory diversity existing across Member States, although it is still at an early stage of processing. A participation procedure is currently open, allowing comments on the initial draft to be submitted until 28 July 2022 through a simple procedure accessible via this link.


Content prepared by Julián Valero, Professor at the University of Murcia and Coordinator of the Research Group "Innovation, Law and Technology" (iDerTec).

The contents and views expressed in this publication are the sole responsibility of the author.

Documentation

This report, published by the European Data Portal (EDP), aims to help open data users harness the potential of the data generated by the Copernicus program.

The Copernicus project generates a large amount of high-value satellite Earth observation data, and helping users exploit it is in line with the European Data Portal's objective of increasing the accessibility and value of open data.

The report addresses the following questions: What can I do with Copernicus data? How can I access the data? What tools do I need to use the data? To answer them, it draws on the information found in the European Data Portal and in specialized catalogues, and examines practical examples of applications using Copernicus data.

This report is available at this link: "Copernicus data for the open data community"
