News

The last days of the year are always a good time to look back and assess the progress made. If a few weeks ago we took stock of what happened in the Aporta initiative, now it is time to compile the news related to data sharing, open data and the technologies linked to them.

Six months ago, we already made a first collection of milestones in the sector. On this occasion, we will summarise some of the innovations, improvements and achievements of the last half of the year.

Regulating and driving artificial intelligence

Artificial intelligence (AI) continues to be one of the fields where new advances are being made every day. This is a relatively new and booming sector in need of regulation. Therefore, last July, the European Union published the Artificial Intelligence Regulation, a standard that will shape the European and global regulatory environment. Aligned with Europe, Spain had already presented its new Artificial Intelligence Strategy 2024 a few months earlier, with the aim of establishing a framework to accelerate the development and expansion of AI in Spain.

On the other hand, in October, Spain took over the co-presidency of the Open Government Partnership. Its roadmap includes promoting innovative ideas, taking advantage of the opportunities offered by open data and artificial intelligence. As part of the position, Spain will host the next OGP World Summit in Vitoria.

Innovative new data-driven tools

Data drives a host of disruptive technological tools that can generate benefits for all citizens. Some of those launched by public bodies in recent months include:

  • The Ministry of Transport and Sustainable Mobility has started to use Big Data technology to analyse road traffic and improve investments and road safety.
  • The Principality of Asturias has announced a plan to use Artificial Intelligence to end traffic jams during the summer, through the development of a digital twin.
  • The Government of Aragon presented a new tourism intelligence system, which uses Big Data and AI to improve decision-making in the sector.
  • The Region of Murcia has launched “Murcia Business Insight”, a business intelligence application that allows dynamic analysis of data on the region's companies: turnover, employment, location, sector of activity, etc.
  • The Granada City Council has used Artificial Intelligence to improve sewerage. The aim is to achieve "more efficient" maintenance planning and execution, with on-site data.
  • The Segovia City Council and Visa have signed a collaboration agreement to develop an online tool with real, aggregated and anonymous data on the spending patterns of foreign Visa cardholders in the capital. This initiative will provide relevant information to help tailor strategies to promote international tourism.

Researchers and students from various centers have also reported advances resulting from working with data:

  • Researchers from the Center for Genomic Regulation (CRG) in Barcelona, the University of the Basque Country (UPV/EHU), the Donostia International Physics Center (DIPC) and the Fundación Biofísica Bizkaia have trained an algorithm to detect tissue alterations in the early stages and improve cancer diagnosis.
  • Researchers from the Spanish National Research Council (CSIC) and KIDO Dynamics have launched a project to extract metadata from mobile antennas to understand the flow of people in natural landscapes. The objective is to identify and monitor the impact of tourism.
  • A student at the University of Valladolid (UVa) has designed a project to improve the management and analysis of forest ecosystems in Spain at the local level, by converting municipal boundaries into a linked open data format. The results are available for re-use.

Advances in data spaces

The Ministry for Digital Transformation and the Civil Service, and specifically the Secretariat of State for Digitalisation and Artificial Intelligence, continues to make progress in the implementation of data spaces through various actions:

  • A Plan for the Promotion of Sectoral Data Spaces has been presented to promote secure data sharing.
  • The development of Data Spaces for Intelligent Urban Infrastructures (EDINT) has been launched. This project, which will be carried out through the Spanish Federation of Municipalities and Provinces (FEMP), contemplates the creation of a multi-sectoral data space that will bring together all the information collected by local entities.
  • In the field of digitalisation, aid has been launched for the digital transformation of strategic productive sectors through the development of technological products and services for data spaces.

Functionalities that bring data closer to reusers

The open data platforms of the various agencies have also introduced new developments, such as new datasets, functionalities, strategies or reports:

  • The Ministry for Ecological Transition and the Demographic Challenge has launched a new application for viewing the National Air Quality Index (AQI) in real time. It includes health recommendations for the general population and the sensitive population.
  • The Andalusian Government has published a "Guide for the design of Public Policy Pilot Studies". It proposes a methodology for designing pilot studies and a system for collecting evidence for decision-making.
  • The Government of Catalonia has initiated steps to implement a new data governance model that will improve relations with citizens and companies.
  • The Madrid City Council is implementing a new 3D cartography and thermal map. In the Blog IDEE (Spatial Data Infrastructure of Spain) they explain how this 3D model of the capital was created using various data capture technologies.
  • The Canary Islands Statistics Institute (ISTAC) has published 6,527 thematic maps with labor indicators on the Canary Islands in its open data catalog.
  • The Barcelona Open Data Initiative and the Democratic Union of Pensioners and Retirees of Spain, with support from the Ministry of Social Rights, Consumption and Agenda 2030, presented the first website of the Data Observatory x Seniors. Its aim is to facilitate the analysis of healthy ageing in Spain and strategic decision-making. The initiative also launched a challenge to identify 50 datasets related to healthy ageing, a project supported by the Barcelona Provincial Council.
  • The Centre for Technological Development and Innovation (CDTI) has presented a dashboard in beta phase with open data in exploitable format.

In addition, work continues to promote the opening up of data from various institutions:

  • Asedie and the King Juan Carlos University (Madrid) have launched the Open Data Reuse Observatory to promote the reuse of open data. It already has the commitment of the Madrid City Council and they are looking for more institutions to join their Manifesto.
  • The Cabildo of Tenerife and the University of La Laguna have developed a Sustainable Mobility Strategy in the Macizo de Anaga Biosphere Reserve. The aim is to obtain real-time data in order to take measures adapted to demand.

Data competitions and events to encourage the use of open data

Summer was the time chosen by various public bodies to launch competitions for products and/or services based on open data. This is the case of:

  • The Community of Madrid held DATAMAD 2024 at the Universidad Rey Juan Carlos de Madrid. The event included a workshop on how to reuse open data and a datathon.
  • More than 200 students registered for the I Malackathon, organised by the University of Malaga, a competition that awarded projects that used open data to propose solutions for water resource management.
  • The Junta de Castilla y León held the VIII Open Data Competition, whose winners were announced in November.
  • The II UniversiData Datathon was also launched. 16 finalists have been selected. The winners will be announced on 13 February 2025.
  • The Cabildo of Tenerife also organised its I Open Data Competition: Ideas for reuse. They are currently evaluating the applications received. They will later launch their 2nd Open Data Competition: APP development.
  • The Government of Euskadi held its V Open Data Competition. The finalists in both the Applications and Ideas categories are now known.

These months have also seen multiple events, many of which can be viewed online.

Other examples of events that were held but are not available online include the III Congress & XIV Conference of R Users, the Novagob 2024 Public Innovation Congress, DATAGRI 2024 and the Data Governance for Local Entities Conference, among others.

These are just a few examples of the activity carried out during the last six months in the Spanish data ecosystem. We encourage you to share other experiences you know of in the comments or via our email address dinamizacion@datos.gob.es.

News

The Ministry for Digital Transformation and the Civil Service has presented an ambitious Plan for the Promotion of Sectoral Data Spaces. Its objective is to foster innovation and improve competitiveness and added value in all economic sectors, promoting the deployment of data spaces where data can be securely shared. Thanks to them, companies, and the economy in general, will be able to benefit from the full potential of a European single market for data.

The Plan has a budget of 500 million euros from the Recovery, Transformation and Resilience Plan and will be deployed through 6 axes and 11 initiatives, with a planned duration until 2026.

Data spaces

Data sharing in data spaces offers enormous benefits to all the participating companies, both individually and collectively. These benefits include improved efficiency, cost reduction, increased competitiveness, innovation in business models and better adaptation to regulations. They cannot be achieved by companies in isolation; they require the sharing of data among all the actors involved.

Some examples of these benefits, sector by sector, would be:

  • Tourism: capacity planning, marketing optimisation, improved tourism experience.
  • Environment: product traceability, carbon footprint measurement.
  • Energy: energy consumption optimisation, demand prediction and production adjustment.
  • Media: copyright protection, fake news detection, content personalisation.
  • Health: collaborative research, epidemic monitoring, sharing of patient information.
  • Sustainable mobility: route optimisation, multimodal transport, supply-demand alignment.
  • Agri-food: farm productivity improvement, water resource optimisation, collective purchasing.
  • Manufacturing: supply chain optimisation, predictive maintenance, collaborative project planning.

Figure 1. Impact of data spaces on various sectors.

Some specific initiatives include: 

  • The AgriDataSpace project ensures food quality and safety through full traceability of products.
  • The Mobility Data Space project improves urban planning and transportation efficiency by integrating mobility data.

Benefits of the Plan for the Promotion of Sectoral Data Spaces

The Plan will offer more than €287 million in grants for the creation and maintenance of data spaces, the development of high-value use cases and the reduction of costs for participating companies when consuming, sharing or providing data. It will also offer up to €44 million in grants for the technology industry, to facilitate the adaptation of its digital products and services to the needs of data spaces and of the entities that participate in them by sharing data, making Spanish industry more competitive in data technologies.

Finally, with a budget of up to €169 million, several unique projects of public interest will be developed to act as enablers of a data-centred digital transformation across all economic sectors. These enablers will help accelerate the deployment of use cases and data spaces, and will encourage companies to actively share data and obtain the expected benefits. To this end, a network of common infrastructures and data space demonstrators will be developed, a National Reference Centre for data spaces will be set up, and the non-open public datasets held by public administrations that are of high interest to businesses will be made available to the economic sectors.

Learn more about the Plan and its measures:

  • Plan
  • Press dossier

The set of initiatives to be developed by the Plan is summarized in the following table:

Axis 1 (total budget: €160 million, 32% of the total):
  • #01 Demonstrators and use cases: €110 million
  • #02 Use cases for the tourism sector: €50 million

Axis 2 (total budget: €127 million, 25% of the total):
  • #03 Data Space Kit: €127 million

Axis 3 (total budget: €44 million, 9% of the total):
  • #04 Technological products and services for data spaces: €44 million

Axis 4 (total budget: €20 million, 4% of the total):
  • #05 Public data demand management: €20 million

Axis 5 (total budget: €139 million, 28% of the total):
  • #01 Demonstrators and use cases: €40 million
  • #06 Tourism Data Space Platform: €35 million
  • #07 New Language Economy Data Space: €12 million
  • #08 Smart Urban Infrastructures Data Space: €13 million
  • #09 Regional Development Data Space: €39 million

Axis 6 (total budget: €5 million, 1% of the total):
  • #10 Communication and awareness: €5 million

  • #11 Reference Centre for Sectoral Data Spaces: €5 million (1% of the total)

Figure 2. Summary table with the initiatives included in the Plan for the Promotion of Sectoral Data Spaces.

Discover the grants that are currently active, and the planned schedule to benefit from them:

  • 2nd Call for Demonstrators and Use Cases: €65 million. Call for proposals in December 2024.
  • Products and services: €44 million. Call for proposals in December 2024.
  • Data Spaces Kit: €127 million. In progress; expected in January 2025.

More information about data spaces here.


News

Tourism is one of Spain's economic engines. In 2022 it accounted for 11.6% of Gross Domestic Product (GDP), exceeding €155 billion, according to the Instituto Nacional de Estadística (INE). This figure grew to €188 billion and 12.8% of GDP in 2023, according to Exceltur, an association of companies in the sector. In addition, Spain is a very popular destination for foreign visitors, ranking second in the world and still growing: in 2024 it is expected to welcome a record 95 million international visitors.

In this context, the Secretariat of State for Tourism (SETUR), in line with European policies, is developing actions aimed at creating new technological tools for the Network of Smart Tourist Destinations, through SEGITTUR (Sociedad Mercantil Estatal para la Gestión de la Innovación y las Tecnologías Turísticas), the body in charge of promoting innovation (R&D&I) in this industry. It does this by working with both the public and private sectors, promoting:

  • Sustainable and more competitive management models.
  • The management and creation of smart destinations.
  • The export of Spanish technology to the rest of the world.

These are all activities where data - and the knowledge that can be extracted from it - play a major role. In this post, we will review some of the actions SEGITTUR is carrying out to promote data sharing and openness, as well as its reuse. The aim is to assist not only in decision-making, but also in the development of innovative products and services that will continue to position our country at the forefront of world tourism.

Dataestur, an open data portal

Dataestur is a web space that brings together open data on national tourism in a single environment. Users can find figures from a variety of public and private information sources.

The data are structured in six categories:

  • General: international tourist arrivals, tourism expenditure, resident tourism survey, world tourism barometer, broadband coverage data, etc.
  • Economy: tourism revenues, contribution to GDP, tourism employment (job seekers, unemployment and contracts), etc.
  • Transport: air passengers, scheduled air capacity, passenger traffic by ports, rail and road, etc.
  • Accommodation: hotel occupancy, accommodation prices and profitability indicators for the hotel sector, etc.
  • Sustainability: air quality, nature protection, climate values, water quality in bathing areas, etc.
  • Knowledge: active listening reports, visitor behaviour and perception, scientific tourism journals, etc.

The data is available for download via API.
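For developers, a minimal sketch of programmatic access could look like the snippet below. The endpoint path, resource name and parameters are illustrative placeholders, not Dataestur's actual API specification, which should be checked on the portal itself.

```python
# Minimal sketch of consuming an open data API such as Dataestur's.
# The base URL, resource name and parameters are hypothetical placeholders.
import requests

BASE_URL = "https://www.dataestur.es/api/example"  # hypothetical endpoint


def fetch_dataset(resource: str, fmt: str = "json") -> list[dict]:
    """Download an open dataset and return its records as a list of dicts."""
    response = requests.get(f"{BASE_URL}/{resource}", params={"format": fmt}, timeout=30)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()


if __name__ == "__main__":
    records = fetch_dataset("international-tourist-arrivals")
    print(f"Downloaded {len(records)} records")
```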

Dataestur is part of a more ambitious project in which data analysis is the basis for improving tourist knowledge, through actions with a wide scope, such as those we will see below.

Developing an Intelligent Destination Platform (IDP)

As part of the milestones set by the Next Generation funds, and within the Digital Transformation Plan for Tourist Destinations, the Secretariat of State for Tourism, through SEGITTUR, is developing an Intelligent Destination Platform (IDP). It is a platform-node that brings together the supply of tourism services and facilitates the interoperability of public and private operators. Thanks to this platform, it will be possible to provide services that integrate and link data from both public and private sources.

Some of the challenges of the Spanish tourism ecosystem to which the IDP responds are:

  • Encourage the integration and development of the tourism ecosystem (academia, entrepreneurs, businesses, etc.) around data intelligence, ensuring technological alignment, interoperability and a common language.
  • Promote the use of the data economy to improve the generation, aggregation and sharing of knowledge in the Spanish tourism sector, driving its digital transformation.
  • Contribute to the proper management of tourist flows and tourist hotspots in the public space, improving the response to citizens' problems and offering real-time information for tourism management.
  • Generate a notable impact on tourists, residents and companies, as well as other agents, enhancing the "sustainable tourism country" brand throughout the travel cycle (before, during and after).
  • Establish a reference framework to agree on targets and metrics that drive sustainability and carbon footprint reduction in the tourism industry, promoting sustainable practices and the integration of clean technologies.


Figure 1. Objectives of the Intelligent Destination Platform (IDP).

New use cases and methodologies to implement them

To further harmonise data management, up to 25 use cases have been defined that enable different industry verticals to work in a coordinated manner. These verticals include areas such as wine tourism, thermal tourism, beach management, data provider hotels, impact indicators, cruises, sports tourism, etc.

To implement these use cases, a five-step methodology is followed that seeks to align industry practices with a more structured approach to data (a minimal sketch follows the list):

  1. Identify the public problems to be solved.
  2. Identify what data need to be available in order to solve them.
  3. Model these data to define a common nomenclature, definitions and relationships.
  4. Define what technology needs to be deployed to capture or generate such data.
  5. Analyse what intervention capacities, both public and private, are needed to solve the problem.
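As a purely illustrative exercise, these five steps can be captured in a simple data structure so that each use case is documented in a consistent way. The field names and the sample "beach management" entry below are hypothetical and do not correspond to SEGITTUR's actual working documents.

```python
# Illustrative structure following the five-step methodology described above.
# Field names and the sample use case are hypothetical.
from dataclasses import dataclass, field


@dataclass
class UseCase:
    public_problem: str                                        # 1. problem to be solved
    required_data: list[str]                                   # 2. data needed to solve it
    data_model: dict[str, str] = field(default_factory=dict)   # 3. common nomenclature and relationships
    technology: list[str] = field(default_factory=list)        # 4. technology to capture or generate the data
    interventions: list[str] = field(default_factory=list)     # 5. public/private intervention capacities


beach_management = UseCase(
    public_problem="Overcrowding on urban beaches in high season",
    required_data=["hourly visitor counts", "parking occupancy", "weather forecast"],
    data_model={"VisitorCount": "observations per beach and hour"},
    technology=["people-counting sensors", "open meteorological APIs"],
    interventions=["dynamic signage", "shuttle bus reinforcement"],
)
```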

Boosting interoperability through a common ontology and data space

As a result of the definition of these 25 use cases, an ontology of tourism has been created, which SEGITTUR hopes will serve as a global reference. The ontology is intended to have a significant impact on the tourism sector, offering a series of benefits (a conceptual sketch follows this list):

  • Interoperability: The ontology is essential to establish a homogeneous data structure and enable global interoperability, facilitating information integration and data exchange between platforms and countries. By providing a common language, definitions and a unified conceptual structure, data can be comparable and usable anywhere in the world. Tourism destinations and the business community can communicate more effectively and agilely, fostering closer collaboration.
  • Digital transformation: By fostering the development of advanced technologies, such as artificial intelligence, tourism companies, the innovation ecosystem or academia can analyse large volumes of data more efficiently. This is mainly due to the quality of the information available and the systems' better understanding of the context in which they operate.
  • Tourism competitiveness: In line with the previous point, the implementation of this ontology helps to eliminate inequalities in the use and application of technology within the sector. By facilitating access to advanced digital tools, both public institutions and private companies can make more informed and strategic decisions. This not only raises the quality of the services offered, but also boosts the productivity and competitiveness of the Spanish tourism sector in an increasingly demanding global market.
  • Tourist experience: Thanks to the ontology, it is possible to offer recommendations tailored to the individual preferences of each traveller. This is achieved through more accurate profiling based on demographic and behavioural characteristics, as well as specific motivations related to different types of tourism. By personalising offers and services, customer satisfaction before, during and after the trip is improved, and greater loyalty to tourist destinations is fostered.
  • Governance: The ontology model is designed to evolve and adapt as new use cases emerge in response to changing market demands. SEGITTUR is actively working to establish a governance model that promotes effective collaboration between public and private institutions, as well as with the technology sector.
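To make the interoperability benefit more tangible, the sketch below shows how data published against a shared vocabulary can be expressed as a graph using the rdflib library. The namespace, class and property names (Destination, hasOccupancyRate) are invented for illustration; the actual ontology defines its own vocabulary.

```python
# Conceptual sketch: two publishers using the same ontology terms produce
# graphs that can be merged and queried together. Namespace and terms are
# hypothetical, not those of the real tourism ontology.
from rdflib import Graph, Literal, Namespace, RDF, XSD

TUR = Namespace("https://example.org/tourism-ontology#")  # hypothetical namespace

g = Graph()
g.bind("tur", TUR)

destination = TUR["Destination/Valencia"]
g.add((destination, RDF.type, TUR.Destination))
g.add((destination, TUR.hasOccupancyRate, Literal(0.82, datatype=XSD.decimal)))

# Because every publisher uses the same classes and properties, this graph can
# be combined with data from any other destination without extra mapping work.
print(g.serialize(format="turtle"))
```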

In addition, to solve complex problems that require the sharing of data from different sources, the Open Innovation Platform (PIA) has been created, a data space that facilitates collaboration between the different actors in the tourism ecosystem, both public and private. This platform enables secure and efficient data sharing, empowering data-driven decision making. The PIA promotes a collaborative environment where open and private data is shared to create joint solutions to address specific industry challenges, such as sustainability, personalisation of the tourism experience or environmental impact management.

Building consensus

SEGITTUR is also carrying out various initiatives to achieve the necessary consensus on the collection, management and analysis of tourism-related data, through collaboration between public and private actors. To this end, the Ente Promotor de la Plataforma Inteligente de Destinos was created in 2021; it plays a fundamental role in bringing together different actors to coordinate efforts and agree on broad lines and guidelines in the field of tourism data.

In summary, Spain is making progress in the collection, management and analysis of tourism data through coordination between public and private actors, using advanced methodologies and tools such as the creation of ontologies, use cases and collaborative platforms such as PIA that ensure efficient and consensual management of the sector.

All this is not only modernising the Spanish tourism sector, but also laying the foundations for a smarter, more connected and efficient future. With its focus on interoperability, digital transformation and personalisation of experiences, Spain is positioned as a leader in tourism innovation, ready to face the technological challenges of tomorrow.

News

The Ministry for Digital Transformation and the Civil Service has launched a grant for the development of Data Spaces for Intelligent Urban Infrastructures (EDINT). This project envisages the creation of a multi-sectoral data space that will bring together all the information collected by local authorities. The project will be carried out through the Spanish Federation of Municipalities and Provinces (FEMP) and will receive a subsidy of 13 million euros, as stated in the Official State Gazette published on Wednesday 16 October.

A single point of access to smart urban infrastructure data

Thanks to this action, it will be possible to finance, develop and manage a multisectoral data space that will bring together all the information collected by the different Spanish municipalities in an aggregated and centralized manner. It should be recalled that data spaces enable the voluntary sharing of information in an environment of sovereignty, trust and security, established through integrated governance, organisational, regulatory and technical mechanisms.

EDINT will act as a single neutral point of access to smart city information, enabling companies, researchers and administrations to access information without the need to visit the data infrastructure of each municipality, increasing agility and reducing costs. In addition, it will allow connection with other sectoral data spaces.

The sharing of this data will help to accelerate technological innovation processes in smart city products and services. Businesses and organisations will also be able to use the data for the improvement of processes and efficiency of their activities.

The Spanish Federation of Municipalities and Provinces (FEMP) will implement the project.

The EDINT project will be articulated through the Spanish Federation of Municipalities and Provinces. The FEMP reaches more than 95% of the Spanish population, which gives it a deep and close knowledge of the needs and challenges of data management in Spanish municipalities and provinces.

Among the actions to be carried out are:

  • Development and implementation of the data infrastructure and platform, which will store data from existing Smart City systems.
  • Incorporation of local entities and companies interested in accessing the data space.
  • Development of three use cases on the data space, focusing on the following areas: "smart mobility", "managed cities and territories" and "mapping the economic and social activity of cities and territories".
  • Definition of the governance schemes that will regulate the operation of the project, guaranteeing the interoperability of the data, as well as the management of the complex network of stakeholders (companies, academic institutions and governmental organisations).
  • Setting up Centres of Excellence and Data Offices, with physical workspaces. These centres will be responsible for the collection of lessons learned and the development of new use cases.

It is an ongoing, sustainable, long-term project that will be open at any time to the participation of new actors, whether data providers or data consumers.

A project aligned with Europe

This assistance is part of the Recovery, Transformation and Resilience Plan, funded by the European Union-Next Generation EU. The creation of data spaces is envisaged in the European Data Strategy, as a mechanism to establish a common data market to ensure the European Union's leadership in the global data economy. In particular, it aims to achieve the free flow of information for the benefit of businesses, researchers and public administrations.

Moreover, data spaces are a key area of the Digital Spain 2026 Agenda, which is driving, among other issues, the acceleration of the digitalisation processes of the productive fabric. To this end, sectoral and data-intensive digitalisation projects are being developed, especially in strategic economic sectors for the country, such as agri-food, mobility, health, tourism, industry, commerce and energy.

The launch of the EDINT project joins other previously launched initiatives such as funding and development grants for use cases and data space demonstrators, which encourage the promotion of public-private sectoral innovation ecosystems.

Sharing data under conditions of sovereignty, control and security not only allows local governments to improve efficiency and decision-making, but also drives the creation of creative solutions to various urban challenges, such as optimising traffic or improving public services. In this sense, actions such as the Data Spaces for Smart Urban Infrastructures represent a step forward in achieving smarter, more sustainable and efficient cities for all citizens.

Blog

The strong commitment to common data spaces at European level is one of the main axes of the European Data Strategy adopted in 2020. This approach was already announced in that document as a basis, on the one hand, to support public policy momentum and, on the other hand, to facilitate the development of innovative products and services based on data intelligence and machine learning.

However, the availability of large sectoral datasets required, as an unavoidable prerequisite, an appropriate cross-cutting regulatory framework to establish the conditions for feasibility and security from a legal perspective. In this regard, once the reform of the regulation on the re-use of public sector information had been consolidated, with major innovations such as high-value data, the regulation on data governance was approved in 2022 and then, in 2023, the so-called Data Act. With these initiatives already approved and the recent official publication of the Artificial Intelligence Regulation, the promotion of data spaces is of particular importance, especially in the public sector, in order to ensure the availability of sufficient and quality data.

Data spaces: diversity in their configuration and regulation

The European Data Strategy already envisaged the creation of common European data spaces in a number of sectors and areas of public interest, but at the same time did not rule out the launching of new ones. In fact, in recent years, new spaces have been announced, so that the current number has increased significantly, as we shall see below.

The main reason for data spaces is to facilitate the sharing and exchange of reliable and secure data in strategic economic sectors and areas of public interest. Thus, it is not simply a matter of promoting large datasets but, above all, of supporting initiatives that offer data accessibility according to suitable governance models that, ultimately, allow the interoperability of data throughout the European Union on the basis of appropriate technological infrastructures.

Although general characterisations of data spaces can be offered on the basis of a number of common notes, there is a great diversity from a legal perspective in terms of the purposes they pursue, the conditions under which data are shared and, in particular, the subjects involved.

This heterogeneity is also present in spaces related to the public sector, i.e. those in which there is a prominent role for data generated by administrations and other public entities in the exercise of their functions, to which, therefore, the regulation on reuse and open data approved in 2019 is fully applicable.

Which are the European public sector data spaces?

In early 2024, the second version of a European Commission working document was published with the dual objective of providing an updated overview of the European policy framework for data spaces and also identifying European data space initiatives to assess their maturity and the main challenges ahead for each of them.

In particular, as far as public administrations are concerned, four data spaces are envisaged: the legal data space, the public procurement data space, the data space linked to the technical "once only" system in the context of eGovernment and, finally, the security data space for innovation. These are very diverse initiatives which, moreover, present an uneven degree of maturity, so that some have an advanced level of development and solid institutional support, while other cases are only initially sketched out and have considerable effort ahead for their design and implementation.

Let us take a closer look at each of these spaces referred to in the working paper.

1. Legal data space

It is a data space linked to legislation and case law generated by both the European Union and the Member States. The aim of this initiative is to support the legal professions and public administrations and, in general, to facilitate access for society as a whole in order to strengthen the mechanisms of the rule of law. This space has so far been based on two specific initiatives:

  • One concerns information on officially published legislation, which has been articulated through the European Legislation Identifier (ELI). This European standard facilitates the identification of rules in a stable and easily reusable way, describing legislation with a set of automatically processable metadata according to a recommended ontology.
  • The second concerns decisions taken by judicial bodies, which are made accessible through a European system of unique identifiers, the ECLI (European Case Law Identifier), assigned to the decisions of both European and national judicial bodies (a parsing sketch follows this list).
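To illustrate how such identifiers support automated processing, the sketch below splits an ECLI into its five colon-separated components: the literal "ECLI", a country code, a court code, the year of the decision and an ordinal number. The identifier in the example is made up for illustration and does not refer to a real decision.

```python
# Minimal sketch of parsing an ECLI (European Case Law Identifier).
# The example identifier is fictitious.
from typing import NamedTuple


class Ecli(NamedTuple):
    country: str
    court: str
    year: int
    ordinal: str


def parse_ecli(identifier: str) -> Ecli:
    """Split an ECLI into its components and validate the fixed prefix."""
    prefix, country, court, year, ordinal = identifier.split(":")
    if prefix != "ECLI":
        raise ValueError(f"Not an ECLI identifier: {identifier}")
    return Ecli(country, court, int(year), ordinal)


print(parse_ecli("ECLI:ES:TS:2020:1234"))  # fictitious example
```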

These two important initiatives, which facilitate access to and automated processing of legal information, have required a shift from a document-based management model (official gazette, court decisions) to a data-based model. And it is precisely this paradigm shift that has made it possible to offer advanced information services that go beyond the legal and linguistic limits posed by regulatory and linguistic diversity across the European Union.

In any case, while recognising the important progress these initiatives represent, significant challenges remain: enabling access by specific provisions rather than by whole normative documents, retrieving judicial decisions on the basis of the rules they apply and, also, linking rules with their judicial interpretation by the various courts in all Member States. In the latter two scenarios the challenge is even greater, as they would require the automated linking of both identifiers.

2. Public procurement data space

This is undoubtedly one of the areas with the greatest potential impact, given that in the European Union as a whole, it is estimated that public entities spend around two trillion euros (almost 14% of GDP) on the purchase of services, works and supplies. This space is therefore intended not only to facilitate access to the public procurement market across the European Union but also to strengthen transparency and accountability in public procurement spending, which is essential in the fight against corruption and in improving efficiency.

The practical relevance of this space is reinforced by the fact that it has a specific official document that strongly supports the project and sets out a precise roadmap with the objective of ensuring its deployment within a reasonable timeframe. Moreover, despite limitations in its scope of application (there is no provision for extending the publication obligation to contracts below the thresholds set at European level, nor for contract completion notices), it is at a very advanced stage, in particular as regards the availability of a specific ontology which facilitates the accessibility of information and its re-use by reinforcing the conditions for interoperability.

In short, this space is facilitating the automated processing of public procurement data by interconnecting existing datasets, thus providing a more complete picture of public procurement in the European Union as a whole, even though it has been estimated that there are more than 250,000 contracting authorities awarding public contracts.

3. Single Technical System (e-Government)

This new space is intended to support the need that exists in administrative procedures to collect information issued by the administrations of other States, without the interested parties being required to do so directly. It is therefore a matter of automatically and securely gathering the required evidence in a formalised environment based on the direct interconnection between the various public bodies, which will thus act as authentic sources of the required information.

This initiative is linked to the objective of administrative simplification and, in particular, to the practical implementation of the "once only" principle in eGovernment procedures.

4. Security data space for innovation

The objective here is to improve law enforcement authorities' access to the data needed to train and validate algorithms with the aim of enhancing the use of artificial intelligence and thus strengthening law enforcement in full respect of ethical and legal standards.

While there is a clear need to facilitate the exchange of data between Member States' law enforcement authorities, the working paper emphasises that this is not a priority for AI strategies in this area, and that the advanced use of data in this area from an innovation perspective is currently relatively low.

In this respect, it is appropriate to highlight the initiative for the development of the Europol sandbox, a project that was sponsored by the decision of the Standing Committee on Operational Cooperation on Internal Security (COSI) to create an isolated space that allows States to develop, train and validate artificial intelligence and machine learning models.

Now that the process of digitisation of public entities is largely consolidated, the main challenge for data spaces in this area is to provide adequate technical, legal and organisational conditions to facilitate data availability and interoperability. In this sense, these data spaces should be taken into account when expanding the list of high-value data, along the lines already advanced by the study published by the European Commission in 2023, which emphasises that the datasets with the greatest potential are those related to government and public administration, justice and legal matters, as well as financial data.


Content prepared by Julián Valero, Professor at the University of Murcia and Coordinator of the "Innovation, Law and Technology" Research Group (iDerTec). The contents and points of view reflected in this publication are the sole responsibility of the author.

Blog

Data sandboxes are tools that provide us with environments to test new data-related practices and technologies, making them powerful instruments for managing and using data securely and effectively. These spaces are very useful in determining whether and under what conditions it is feasible to open the data. Some of the benefits they offer are:

  • Controlled and secure environments: provide a workspace where information can be explored and its usefulness and quality assessed before committing to wider sharing. This is particularly important in sensitive sectors, where privacy and data security are paramount.
  • Innovation: they provide a safe space for experimentation and rapid prototyping, allowing new ideas and data-driven solutions to be iterated, tested and refined on a test bench before launching them to the public.
  • Multi-sectoral collaboration: facilitate collaboration between diverse actors, including government entities, private companies, academia and civil society. This multi-sectoral approach helps to break down data silos and promotes the sharing of knowledge and good practices across sectors.
  • Adaptive and scalable use: they can be adjusted to suit different data types, use cases and sectors, making them a versatile tool for a variety of data-driven initiatives.
  • Cross-border data exchange: they provide a viable solution to manage the challenges of data exchange between different jurisdictions, especially with regard to international privacy regulations.

The report "Data Sandboxes: Managing the Open Data Spectrum" explores the concept of data sandboxes as a tool to strike the right balance between the benefits of open data and the need to protect sensitive information.

Value proposition for innovation

In addition to all the benefits outlined above, data sandboxes also offer a strong value proposition for organisations looking to innovate responsibly. These environments help to improve data quality by making it easier for users to identify inconsistencies so that improvements can be made. They also contribute to reducing risks by providing secure environments in which to work with sensitive data. By fostering cross-disciplinary experimentation, collaboration and innovation, they increase the usability of data and help develop a data-driven culture within organisations. In addition, data sandboxes help reduce barriers to data access, improving transparency and accountability, which strengthens citizens' trust and leads to an expansion of data exchanges.

Types of data sandboxes and characteristics

Depending on the main objective when implementing a sandbox, there are three different types of sandboxes:

  1. Regulatory sandboxes, which allow companies and organisations to test innovative services under the close supervision of regulators in a specific sector or area.
  2. Innovation sandboxes, which are frequently used by developers to test new features and get quick feedback on their work.
  3. Research sandboxes, which make it easier for academia and industry to safely test new algorithms or models by focusing on the objective of their tests, without having to worry about breaching established regulations.

In any case, regardless of the type of sandbox we are working with, they are all characterised by the following common key aspects:

Figure 1. Characteristics of a data sandbox: adaptable and scalable, controlled, secure, multi-sectoral and collaborative, high computational capacity, temporal in nature. Adapted from a visual by The GovLab.

Each of these is described below:

  1. Controlled: these are restricted environments where sensitive data can be accessed and analysed securely, ensuring compliance with relevant regulations.
  2. Secure: they protect the privacy and security of data, often using anonymised or synthetic data.
  3. Collaborative: facilitating collaboration between different regions, sectors and roles, strengthening data ecosystems.
  4. High computational capacity: provide advanced computational resources capable of performing complex tasks on the data when needed.
  5. Temporal in nature: They are designed for temporary use and with a short life cycle, allowing for rapid and focused experimentation that either concludes once its objective is achieved or becomes a new long-term project.
  6. Adaptable: They are flexible enough to customise and scale according to needs and different data types, use cases and contexts.

Examples of data sandboxes

Data sandboxes have long been successfully implemented in multiple sectors across Europe and around the world, so we can easily find several examples of their implementation on our continent:

  • Data science lab in Denmark: it provides access to sensitive administrative data useful for research, fostering innovation under strict data governance policies.
  • TravelTech in Lithuania: an open access sandbox that provides tourism data to improve business and workforce development in the sector.
  • INDIGO Open Data Sandbox: it promotes data sharing across sectors to improve social policies, with a focus on creating a secure environment for bilateral data sharing initiatives.
  • Health data science sandbox in Denmark: a training platform for researchers to practice data analysis using synthetic biomedical data without having to worry about strict regulation.

Future direction and challenges

As we have seen, data sandboxes can be a powerful tool for fostering open data, innovation and collaboration, while ensuring data privacy and security. By providing a controlled environment for experimentation with data, they enable all interested parties to explore new applications and knowledge in a reliable and safe way. Sandboxes can therefore help overcome initial barriers to data access and contribute to fostering a more informed and purposeful use of data, thus promoting the use of data-driven solutions to public policy problems.

However, despite their many benefits, data sandboxes also present a number of implementation challenges. The main problems we might encounter in implementing them include:

  • Relevance: ensure that the sandbox contains high quality and relevant data, and that it is kept up to date.
  • Governance: establish clear rules and protocols for data access, use and sharing, as well as monitoring and compliance mechanisms.
  • Scalability: successfully export the solutions developed within the sandbox and be able to translate them into practical applications in the real world.
  • Risk management: address comprehensively all risks associated with the re-use of data throughout its lifecycle and without compromising its integrity.

However, as technologies and policies continue to evolve, it is clear that data sandboxes are set to be a useful tool and to play an important role in managing the spectrum of data openness, thereby driving the use of data to solve increasingly complex problems. Furthermore, the future of data sandboxes will be influenced by new regulatory frameworks (such as the Data Regulation and the Data Governance Regulation) that reinforce data security and promote data reuse, and by integration with privacy-preserving and privacy-enhancing technologies that allow us to use data without exposing sensitive information. Together, these trends will drive more secure data innovation within the environments provided by data sandboxes.


Content prepared by Carlos Iglesias, Open data Researcher and consultant, World Wide Web Foundation. The contents and views expressed in this publication are the sole responsibility of the author.

Blog

IMPaCT, the Infrastructure for Precision Medicine associated with Science and Technology, is an innovative programme that aims to revolutionise medical care. Coordinated and funded by the Carlos III Health Institute, it aims to boost the effective deployment of personalised precision medicine.

Personalised medicine is a medical approach that recognises that each patient is unique. By analysing each person's genetic, physiological and lifestyle characteristics, safer and more efficient tailor-made treatments with fewer side effects can be developed. Access to this information is also key to making progress in prevention and early detection, as well as in research and medical advances.

IMPaCT consists of 3 strategic axes:

  • Axis 1. Predictive medicine: COHORTE Programme. An epidemiological research project consisting of the development and implementation of a structure for the recruitment of 200,000 people to participate in a prospective study.
  • Axis 2. Data science: DATA Programme. A programme focused on the development of a common, interoperable and integrated system for the collection and analysis of clinical and molecular data. It develops criteria, techniques and best practices for the collection of information from electronic medical records, medical images and genomic data.
  • Axis 3. Genomic medicine: GENOMICS Programme. A cooperative infrastructure for the diagnosis of rare and genetic diseases. Among other things, it develops standardised procedures for the correct performance of genomic analyses and the management of the data obtained, as well as for the standardisation and homogenisation of the information and criteria used.

In addition to these axes, there are two transversal strategic lines: one focused on ethics and scientific integrity and the other on internationalisation, as summarised in the following visual.

Figure: Pillars of the IMPaCT project, the Infrastructure for Precision Medicine associated with Science and Technology: three strategic axes (predictive medicine, data science and genomic medicine) plus two cross-cutting strategic lines (ethics and scientific integrity, and internationalisation). Source: IMPaCT-Data.

In the following, we will focus on the functioning and results of IMPaCT-Data, the project linked to axis 2.

IMPaCT-Data, an integrated environment for interoperable data analysis

IMPaCT-Data is oriented towards the development and validation of an environment for the integration and joint analysis of clinical, molecular and genetic data, for secondary use, with the ultimate goal of facilitating the effective and coordinated implementation of personalised precision medicine in the National Health System. It is currently made up of a consortium of 45 entities associated by an agreement that runs until 31 December 2025.

Through this programme, the aim is to create a cloud infrastructure for medical data for research, as well as the necessary protocols to coordinate, integrate, manage and analyse such data. To this end, a roadmap with the following technical objectives is followed:

  • Development of the first iteration of a federated biomedical data platform.
  • Development of the first version of a cloud computing infrastructure that can support IMPaCT.
  • Development of integrated data analysis protocols, methods and systems, including FAIRification.
  • Initial development for the monitoring of data quality treatment and evaluation processes.
  • Initial development for the (semi-)automatic and secure extraction of information from health information systems, including the Electronic Health Record (EHR).
  • Incorporation of genetic and genomic information.
  • Leading the portfolio of bioinformatics resources offered by Spain to ELIXIR.
  • Extraction of quantitative information from medical images.
  • Development of prototypes for the integration of genomic analysis, imaging and EHR.
  • Implementation of demonstrators of advanced translational information interoperability functions.
  • Evaluation and concerted implementation of management demonstrators, in collaboration with the TransBioNet network and other health stakeholders.

Source: IMPaCT-Data.

Results of IMPaCT-Data

As we can see, this infrastructure, still under development, will provide a virtual research environment for data analysis through a variety of services and products.

In addition to these, there are a number of deliverables related to technical aspects of the project, such as comparisons of techniques or proofs of concept, as well as scientific publications.

Driving use cases through demonstrators

One of the objectives of IMPaCT-Data is to contribute to the evaluation of technologies associated with the project's developments, through an ecosystem of demonstrators. The aim is to encourage contributions from companies, organisations and academic groups to drive improvements and achieve large-scale implementation of the project.

To meet this objective, different activities are organised where specific components are evaluated in collaboration with members of IMPaCT-Data. One example is the oRBITS terminology server for the encoding of clinical phenotypes into HPO (Human Phenotype Ontology) aimed at automatically extracting and encoding information contained in unstructured clinical reports using natural language processing. It uses the HPO terminology, which aims to standardise the collection of phenotypic data, making it accessible for further analysis.
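As a purely didactic illustration of the input and output of such a component, the toy sketch below maps free-text mentions to HPO identifiers with a simple dictionary lookup. The real oRBITS server relies on natural language processing rather than string matching, and any term identifiers should be verified against the current HPO release.

```python
# Toy illustration of encoding phenotype mentions from clinical free text into
# HPO terms. This is a simplified stand-in for an NLP-based pipeline.
PHENOTYPE_LEXICON = {
    "seizure": "HP:0001250",     # Seizure
    "convulsión": "HP:0001250",  # Spanish synonym mapped to the same HPO term
}


def encode_report(text: str) -> set[str]:
    """Return the HPO identifiers whose lexicon entries appear in the text."""
    lowered = text.lower()
    return {code for mention, code in PHENOTYPE_LEXICON.items() if mention in lowered}


print(encode_report("Paciente con convulsión febril a los 2 años"))  # {'HP:0001250'}
```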

Another example of a demonstrator concerns the sharing of virtualised medical data between different centres for research projects, within a governed, efficient and secure environment where all the data quality standards defined by each entity are met.

A strategic project aligned with Europe

IMPaCT-Data fits directly into the National Strategy for the Secondary Use of National Health System Data, as described in the PERTE on health (Strategic Projects for Economic Recovery and Transformation), with its knowledge, experience and input being of great value for the development of the National Health Data Space.

Furthermore, IMPaCT-Data's developments are directly aligned with the guidelines proposed by GAIA-X both at a general level and in the specific health environment.

The impact of the project in Europe is also evidenced by its participation in the European GDI (Genomic Data Infrastructure) project, which aims to facilitate access to genomic, phenotypic and clinical data across Europe and in which IMPaCT-Data is being used as a tool at national level.

This shows that thanks to IMPaCT-Data it will be possible to promote biomedical research projects not only in Spain, but also in Europe, thus contributing to the improvement of public health and individualised treatment of patients.

Blog

One of the main objectives of Regulation (EU) 2023/2854 of the European Parliament and of the Council of 13 December 2023 on harmonised rules for fair access to and use of data (Data Regulation) is to promote the development of interoperability criteria for data spaces, data processing services and smart contracts. In this respect, the Regulation understands interoperability as:

The ability of two or more data spaces or communication networks, systems, connected products, applications, data processing services or components to exchange and use data to perform their functions.

It explicitly states that "interoperable and high quality data from different domains increase competitiveness and innovation and ensure sustainable economic growth", which requires that "the same data can be used and reused for different purposes and in an unlimited way, without loss of quality or quantity". It therefore considers that "a regulatory approach to interoperability that is ambitious and inspires innovation is essential to overcome the dependence on a single provider, which hinders competition and the development of new services".

Interoperability and data spaces

This concern already existed in the European Data Strategy where interoperability was seen as a key element for the valorisation of data and, in particular, for the deployment of Artificial Intelligence. In fact, interoperability is an unavoidable premise for data spaces, so that the establishment of appropriate protocols becomes essential to ensure their potential, both for each of the data spaces internally and also in order to facilitate a cross-cutting integration of several of them.

In this sense, there are frequent standardisation initiatives and meetings to try to establish specific interoperability conditions in this type of scenario, characterised by the diversity of data sources. Although this is an added difficulty, a cross-cutting approach, integrating several data spaces, provides a greater impact on the generation of value-added services and creates the right legal conditions for innovation.

According to the Data Regulation, those who participate in data spaces and offer data or data services to other actors involved in data spaces have to comply with a number of requirements aimed precisely at ensuring appropriate conditions for interoperability and thus that data can be processed jointly. To this end, a description of the content, structure, format and other conditions of use of the data shall be provided in such a way as to facilitate access to and sharing of the data in an automated manner, including in real time or allowing bulk downloading where appropriate.
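By way of illustration, such a description could be published in a machine-readable form along the following lines. The field names are a loose, DCAT-flavoured sketch chosen for this example; the Regulation does not prescribe a specific schema.

```python
# Illustrative machine-readable description of a dataset offered in a data
# space, covering the elements mentioned above: content, structure, format,
# conditions of use and access modality. Field names are not prescribed by the
# Data Regulation.
import json

dataset_description = {
    "title": "Hourly charging-point availability",
    "content": "Availability status of public EV charging points",
    "structure": {"station_id": "string", "timestamp": "ISO 8601", "free_slots": "integer"},
    "format": ["CSV", "JSON"],
    "conditions_of_use": "Licence and permitted purposes agreed within the data space",
    "access": {"bulk_download": True, "real_time_api": True},
}

print(json.dumps(dataset_description, indent=2))
```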

It should be noted that compliance with technical and semantic standards for interoperability is essential for data spaces, since a minimum standardisation of legal conditions greatly facilitates their operation. In particular, it is of great importance to ensure that the data provider holds the necessary rights to share the data in such an environment and is able to prove this in an automated way.

Interoperability in data processing services

The Data Regulation pays particular attention to the need to improve interoperability between different data processing service providers, so that customers can benefit from the interaction between each of them, thereby reducing dependency on individual providers.

To this end, it firstly reinforces the reporting obligations of providers of this type of service, to which must be added those derived from the general rules on the provision of digital content and services. In particular, the following must be set out in writing:

  • Contractual conditions relating to customer rights, especially in situations related to a possible switch to another provider or infrastructure.
  • A full indication of the data that may be exported during the switching process, so that the scope of the interoperability obligation has to be fixed in advance. In addition, such information has to be made available through an up-to-date online registry offered by the service provider (a minimal illustration follows this list).
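
As a purely illustrative sketch, and not the format of any real provider, the following Python snippet shows how such an online registry of exportable data and digital assets might be modelled. The asset names, categories and export formats are hypothetical.

```python
# Illustrative sketch only: one way a data processing service could keep an
# up-to-date registry of the data and digital assets exportable on switching.
from dataclasses import dataclass, asdict
import json

@dataclass
class ExportableAsset:
    name: str           # e.g. a customer database or a configuration set
    category: str       # "customer data" or "digital asset"
    export_format: str  # format offered during the switching process

# Hypothetical registry contents; a real provider would generate this from its
# service catalogue and keep it permanently up to date online.
REGISTRY = [
    ExportableAsset("customer_records", "customer data", "CSV"),
    ExportableAsset("object_storage_buckets", "customer data", "S3-compatible export"),
    ExportableAsset("service_configuration", "digital asset", "JSON"),
]

def registry_as_json() -> str:
    """Serialise the registry so it can be published as an online endpoint."""
    return json.dumps([asdict(a) for a in REGISTRY], indent=2)

if __name__ == "__main__":
    print(registry_as_json())
```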

The Regulation aims to ensure that customers' right to freely choose their data service provider is not undermined by barriers and difficulties arising from a lack of interoperability. It even contemplates an obligation of proactivity so that the switch of provider takes place without incidents in the provision of the service to the customer, obliging providers to adopt reasonable measures to ensure "functional equivalence" and even to offer open interfaces free of charge to facilitate the process. However, in some cases, in particular where two services are intended to be used in parallel, the former provider is allowed to pass on certain costs that may have been incurred.

Ultimately, the interoperability of data processing services goes beyond simple technical or semantic aspects, so that it becomes an unavoidable premise for ensuring the portability of digital assets, guaranteeing the security and integrity of services and, among other objectives, not interfering with the incorporation of technological innovations, all with a marked prominence of cloud services.

Smart contracts and interoperability

The Data Regulation also pays particular attention to the interoperability conditions allowing the automated execution of data exchanges, for which it is essential to set them in a predetermined way. Otherwise, the optimal operating conditions required by the digital environment, especially from the point of view of efficiency, would be affected.

The new regulation includes specific obligations for smart contract providers and also for those who deploy smart contract tools in the course of their commercial, business or professional activity. For this purpose, a smart contract is defined as:

a computer programme used for the automated execution of an agreement or part thereof, which uses a sequence of electronic data records and ensures their completeness and the accuracy of their chronological order

They have to ensure that smart contracts comply with the Regulation's obligations on the provision of data and, among other aspects, it will be essential to ensure "consistency with the terms of the data sharing agreement that the smart contract executes". They shall therefore be responsible for the effective fulfilment of these requirements, carrying out a conformity assessment and issuing a declaration of compliance.
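
To make the definition more tangible, the minimal Python sketch below shows a hash-chained sequence of data-exchange records whose completeness and chronological order can be verified automatically. It is a toy illustration of the idea behind the definition, not a real smart contract platform and not a mechanism prescribed by the Regulation.

```python
# Minimal sketch, not a production smart-contract engine: a hash-chained log of
# data-exchange records, illustrating how completeness and the accuracy of the
# chronological order of records can be made verifiable.
import hashlib
import json
import time

class RecordChain:
    def __init__(self):
        self.records = []

    def append(self, payload: dict) -> dict:
        prev_hash = self.records[-1]["hash"] if self.records else "0" * 64
        body = {
            "index": len(self.records),
            "timestamp": time.time(),   # chronological order
            "payload": payload,
            "prev_hash": prev_hash,     # links each record to the previous one
        }
        body["hash"] = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        self.records.append(body)
        return body

    def verify(self) -> bool:
        """Check that no record has been removed, reordered or altered."""
        for i, rec in enumerate(self.records):
            expected_prev = self.records[i - 1]["hash"] if i else "0" * 64
            if rec["prev_hash"] != expected_prev:
                return False
            recomputed = {k: v for k, v in rec.items() if k != "hash"}
            if hashlib.sha256(
                json.dumps(recomputed, sort_keys=True).encode()
            ).hexdigest() != rec["hash"]:
                return False
        return True

if __name__ == "__main__":
    chain = RecordChain()
    chain.append({"event": "data_shared", "dataset": "road-traffic-2024"})
    chain.append({"event": "payment_settled", "amount_eur": 120})
    print("chain valid:", chain.verify())
```

In practice these guarantees are typically provided by distributed ledger or equivalent infrastructure; the sketch only shows why tamper-evident, ordered records matter for automated execution.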

To facilitate the enforcement of these safeguards, the Regulation provides for a presumption of compliance where harmonised standards published in the Official Journal of the European Union are respected; in addition, the Commission is empowered to request European standardisation organisations to draw up specific provisions.

In the last five years, and in particular since the 2020 Strategy, there has been significant progress in European regulation, which makes it possible to state that the right legal conditions are in place to ensure the availability of quality data to drive technological innovation. As far as interoperability is concerned, very important steps have already been taken, especially in the public sector, where disruptive and extremely useful technologies can already be found. However, the challenge of precisely specifying the scope of the legally established obligations remains.

For this reason, the Data Regulation itself empowers the Commission to adopt common specifications to ensure effective compliance with the measures it envisages, if necessary. However, this is a subsidiary measure, as other avenues to achieve interoperability, such as the development of harmonised standards through standardisation organisations, must be pursued first.

In short, regulating interoperability requires an ambitious approach, as the Data Regulation itself recognises. It is, however, a complex process that calls for measures at different levels, going beyond the simple adoption of legal rules and beyond purely technological premises, even if such legislation represents an important step forward in boosting innovation under the right conditions.


Content prepared by Julián Valero, Professor at the University of Murcia and Coordinator of the Research Group "Innovation, Law and Technology" (iDerTec). The contents and points of view reflected in this publication are the sole responsibility of its author.

Blog

The publication on Friday 12 July 2024 of the Artificial Intelligence Regulation (AIA) opens a new stage in the European and global regulatory framework. The standard is characterised by an attempt to combine two souls. On the one hand, it is about ensuring that technology does not create systemic risks for democracy, the guarantee of our rights and the socio-economic ecosystem as a whole. On the other hand, a targeted approach to product development is sought in order to meet the high standards of reliability, safety and regulatory compliance defined by the European Union.

Scope of application of the standard

The standard differentiates between low- and medium-risk systems, high-risk systems and general-purpose AI models. In order to classify systems, the AIA defines criteria related to the sectors regulated by the European Union (Annex I) and defines the content and scope of those systems which, by their nature and purpose, could generate risks (Annex III). The qualification of the models depends largely on the volume of data, their capabilities and their operational load.

The AIA only affects the latter two cases: high-risk systems and general-purpose AI models. High-risk systems require conformity assessment through notified bodies, entities to which evidence is submitted that the development complies with the AIA. The models, for their part, are subject to oversight mechanisms by the Commission aimed at preventing systemic risks. However, this is a flexible regulatory framework that favours research by relaxing its application in experimental environments and through the deployment of sandboxes for development.

The standard sets out a series of "requirements for high-risk AI systems" (section two of chapter three) which should constitute a reference framework for the development of any system and inspire codes of good practice, technical standards and certification schemes. In this respect, Article 10 on "data and data governance" plays a central role. It provides very precise indications on the design conditions for AI systems, particularly when they involve the processing of personal data or when they are projected on natural persons.

This governance should be considered by those providing the basic infrastructure and/or datasets, managing data spaces or so-called Digital Innovation Hubs, offering support services. In our ecosystem, characterised by a high prevalence of SMEs and/or research teams, data governance is projected on the quality, security and reliability of their actions and results. It is therefore necessary to ensure the values that AIA imposes on training, validation and test datasets in high-risk systems, and, where appropriate, when techniques involving the training of AI models are employed.

These values can be aligned with the principles of Article 5 of the General Data Protection Regulation (GDPR) and enrich and complement them. To these are added the risk-based approach and data protection by design and by default. Relating one to the other is certainly an interesting exercise.

Ensure the legitimate origin of the data. Fairness and lawfulness

Alongside the common reference to the value chain associated with data, reference should be made to a 'chain of custody' to ensure the legality of data collection processes. The origin of the data, particularly in the case of personal data, must be lawful, legitimate and its use consistent with the original purpose of its collection. A proper cataloguing of the datasets at source is therefore indispensable to ensure a correct description of their legitimacy and conditions of use.

This is an issue that concerns open data environments, data access bodies and services detailed in the Data Governance Regulation (DGA) or the European Health Data Space (EHDS) and is sure to inspire future regulations. It is usual to combine external data sources with the information managed by the SME.

Data minimisation, accuracy and purpose limitation

AIA mandates, on the one hand, an assessment of the availability, quantity and adequacy of the required datasets. On the other hand, it requires that the training, validation and test datasets are relevant, sufficiently representative and possess adequate statistical properties. This task is highly relevant to the rights of individuals or groups affected by the system. In addition, they shall, to the greatest extent possible, be error-free and complete in view of their intended purpose. AIA predicates these properties for each dataset individually or for a combination of datasets.

In order to achieve these objectives, it is necessary to ensure that appropriate techniques are deployed (a brief code sketch follows the list below):

  • Perform appropriate processing operations for data preparation, such as annotation, tagging, cleansing, updating, enrichment and aggregation.
  • Formulate assumptions, in particular with regard to the information that the data are supposed to measure and represent or, to put it more colloquially, define the use cases.
  • Take into account, to the extent necessary for the intended purpose, the particular characteristics or elements of the specific geographical, contextual, behavioural or functional environment in which the high-risk AI system is intended to be used.
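
The following Python sketch, using pandas, illustrates some of these operations on an invented dataset: cleansing, a simple enrichment and an aggregation used as a rough representativeness check. The column names and the 10% threshold are assumptions chosen for the example, not requirements of the AIA.

```python
# Illustrative sketch only (assumed column names and thresholds): a few of the
# preparation operations listed above (cleansing, enrichment, aggregation)
# plus a simple representativeness check on a training dataset.
import pandas as pd

def prepare(df: pd.DataFrame) -> pd.DataFrame:
    df = df.drop_duplicates()                      # cleansing: remove duplicates
    df = df.dropna(subset=["age", "region"])       # cleansing: drop incomplete rows
    df["age_band"] = pd.cut(                       # enrichment: derived feature
        df["age"], bins=[0, 30, 50, 120], labels=["<30", "30-50", ">50"]
    )
    return df

def representativeness(df: pd.DataFrame, column: str) -> pd.Series:
    """Aggregation: share of each group, to compare against the target population."""
    return df[column].value_counts(normalize=True)

if __name__ == "__main__":
    sample = pd.DataFrame({
        "age": [23, 41, 35, 62, None, 29],
        "region": ["North", "South", "North", "East", "South", None],
    })
    prepared = prepare(sample)
    shares = representativeness(prepared, "region")
    print(shares)
    # Hypothetical check: flag groups that fall below an assumed 10% floor.
    print("under-represented:", list(shares[shares < 0.10].index))
```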

Managing risk: avoiding bias 

In the area of data governance, a key role is attributed to the avoidance of bias where it may lead to risks to the health and safety of individuals, adversely affect fundamental rights or give rise to discrimination prohibited by Union law, in particular where data outputs influence the inputs of future operations (feedback loops). To this end, appropriate measures should be taken to detect, prevent and mitigate any biases identified.
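
By way of example, one simple measure sometimes used to detect this kind of bias is the difference in positive-outcome rates between groups. The sketch below computes it for an invented dataset; the group labels and the 0.1 tolerance are chosen purely for illustration and do not reflect any metric mandated by the AIA.

```python
# Minimal sketch of one possible bias-detection measure (not prescribed by the
# AIA): the difference in positive-outcome rates between two groups, sometimes
# called the demographic parity difference.
from typing import Sequence

def positive_rate(outcomes: Sequence[int]) -> float:
    return sum(outcomes) / len(outcomes) if outcomes else 0.0

def demographic_parity_difference(
    outcomes: Sequence[int], groups: Sequence[str], group_a: str, group_b: str
) -> float:
    rate_a = positive_rate([o for o, g in zip(outcomes, groups) if g == group_a])
    rate_b = positive_rate([o for o, g in zip(outcomes, groups) if g == group_b])
    return abs(rate_a - rate_b)

if __name__ == "__main__":
    outcomes = [1, 0, 1, 1, 0, 0, 1, 0]           # e.g. loan granted / not granted
    groups   = ["A", "A", "A", "A", "B", "B", "B", "B"]
    gap = demographic_parity_difference(outcomes, groups, "A", "B")
    print(f"parity gap: {gap:.2f}")
    if gap > 0.1:   # assumed tolerance; a real assessment needs domain criteria
        print("possible bias detected: review the training data and the model")
```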

The AIA exceptionally enables the processing of special categories of personal data provided that they offer adequate safeguards in relation to the fundamental rights and freedoms of natural persons. But it imposes additional conditions:

  • the processing of other data, such as synthetic or anonymised data, does not allow effective detection and correction of biases;
  • that special categories of personal data are subject to technical limitations concerning the re-use of personal data and to state-of-the-art security and privacy protection measures, including pseudonymisation;
  • that special categories of personal data are subject to measures to ensure that the personal data processed are secured, protected and subject to appropriate safeguards, including strict controls and documentation of access, to prevent misuse and to ensure that only authorised persons have access to such personal data with appropriate confidentiality obligations;
  • that special categories of personal data are not transmitted or transferred to third parties and are not otherwise accessible to them;
  • that special categories of personal data are deleted once the bias has been corrected or the personal data have reached the end of their retention period, whichever is the earlier;
  • that the records of processing activities under Regulations (EU) 2016/679 and (EU) 2018/1725 and Directive (EU) 2016/680 include the reasons why the processing of special categories of personal data was strictly necessary for detecting and correcting bias, and why that purpose could not be achieved by processing other data.

The regulatory provisions are extremely interesting. The GDPR, the DGA and the EHDS favour the processing of anonymised data. The AIA makes an exception for cases in which synthetic or anonymised data would yield inadequate or low-quality datasets from the point of view of bias detection and correction.

Individual developers, data spaces and intermediary services providing datasets and/or platforms for development must be particularly diligent in defining their security. This provision is consistent with the requirement to have secure processing spaces in EHDS, implies a commitment to certifiable security standards, whether public or private, and advises a re-reading of the seventeenth additional provision on data processing in our Organic Law on Data Protection in the area of pseudonymisation, insofar as it adds ethical and legal guarantees to the strictly technical ones.  Furthermore, the need to ensure adequate traceability of uses is underlined. In addition, it will be necessary to include in the register of processing activities a specific mention of this type of use and its justification.
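
As a minimal illustration of two of these ideas, pseudonymisation and traceability of uses, the following Python sketch applies a keyed hash to an identifier and keeps a simple access log. Key management, authorisation and retention rules are deliberately left out, so it should be read as a sketch rather than a complete security design; the identifier and the log fields are invented.

```python
# Illustrative sketch, not a complete security design: pseudonymisation of an
# identifier via a keyed hash (HMAC), plus a minimal access log to illustrate
# traceability of uses. Key handling and authorisation are assumed elsewhere.
import hashlib
import hmac
import datetime

SECRET_KEY = b"replace-with-a-key-held-separately"   # placeholder; keep outside the dataset

ACCESS_LOG: list[dict] = []

def pseudonymise(identifier: str) -> str:
    """Deterministic pseudonym: the same input maps to the same token,
    but the original value cannot be recovered without the key."""
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()

def log_access(user: str, purpose: str) -> None:
    """Record who accessed the pseudonymised data and why (traceability)."""
    ACCESS_LOG.append({
        "user": user,
        "purpose": purpose,
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    })

if __name__ == "__main__":
    token = pseudonymise("12345678A")          # hypothetical national ID
    log_access(user="bias-audit-team", purpose="bias detection and correction")
    print(token[:16], "...", ACCESS_LOG)
```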

Apply lessons learned from data protection, by design and by default

Article 10 of the AIA requires the documentation of relevant design decisions and the identification of relevant data gaps or deficiencies that prevent compliance with the AIA, together with how they are to be addressed. In short, it is not enough to ensure data governance; it is also necessary to provide documentary evidence and to maintain a proactive and vigilant attitude throughout the lifecycle of information systems.

These two obligations form the keystone of the system, and they should be read even more broadly in their legal dimension. The lessons learned from the GDPR teach that proactive accountability and the guarantee of fundamental rights have a dual dimension. The first is intrinsic and material: the deployment of privacy engineering in the service of data protection by design and by default ensures compliance with the GDPR. The second is contextual: the processing of personal data does not take place in a vacuum, but in a broad and complex context regulated by other branches of the law.

Data governance operates structurally from the foundation to the vault of AI-based information systems. Ensuring that it exists and that it is adequate and functional is essential. This is the understanding of the Spanish Government's Artificial Intelligence Strategy 2024, which seeks to provide the country with the levers to boost its development.

The AIA makes a qualitative leap and underlines the functional approach from which data protection principles should be read, stressing their population-level dimension. This makes it necessary to rethink the conditions under which the GDPR has been complied with in the European Union. There is an urgent need to move away from template-based models that consultancy firms copy and paste. Checklists and standardisation are, of course, indispensable, but their effectiveness depends heavily on fine tuning. This calls in particular on the professionals who support this objective to dedicate their best efforts to giving deep meaning to compliance with the Artificial Intelligence Regulation.

You can see a summary of the regulations in the following infographic:

Screenshot of the infographic

You can access the accessible and interactive version here

Content prepared by Ricard Martínez, Director of the Chair of Privacy and Digital Transformation. Professor, Department of Constitutional Law, Universitat de València. The contents and points of view reflected in this publication are the sole responsibility of its author.

Blog

The European Union has devised a fundamental strategy to ensure accessible and reusable data for research, innovation and entrepreneurship. Strategic decisions have been made both in a regulatory and in a material sense to build spaces for data sharing and to foster the emergence of intermediaries with the capacity to process information.

European policies give rise to a very diverse ecosystem in which several strands should be distinguished. On the one hand, there is a deepening of open data re-use policies. On the other hand, the aim is to cover a space that has until now been inaccessible: data that, owing to the guarantee of the fundamental right to data protection, intellectual property or business secrecy, could not be shared. Today, anonymisation and data intermediation technologies make it possible to process such data with due guarantees. Finally, the aim is to provide resources through the promotion of data spaces, initiatives that propose federated models, such as Gaia-X, the European Digital Infrastructure Consortia (EDIC) promoted by the European Commission, and the Digital Innovation Hubs aimed at supporting business and government in this field. This scenario will boost different types of use in research, innovation and entrepreneurship.

This article focuses on the agreement signed by the National Statistics Institute (INE), the State Tax Administration Agency (AEAT), different Social Security bodies, the State Public Employment Service (SEPE) and the Bank of Spain to boost access to data, which is part of this EU strategy whose principles, rules and conditions must be explained in order to place it in context, underline its importance and understand the implications of the agreement.    

Competing by guaranteeing our rights

The EU competes at a structural disadvantage vis-à-vis the US and the People's Republic of China. On the American side, the development of disruptive technologies in the context of the Internet, and particularly the deployment of search engines, social networks and mobile applications, has favoured the birth of a data brokerage market in which a few companies hold an almost monopolistic power over data. The great champions of the digital world manage information on practically every sector of activity, thanks to a business model based on the capitalisation or commoditisation of our privacy and to their entry into sectors such as health or wearable activity trackers. Every time a user ran a search, sent an email, commented on a social network or dictated a message to a mobile phone, it fuelled that position of dominance and underpinned the development of large language models in artificial intelligence and the deployment of algorithmic tools linked to neuro-emotional marketing.

On the Chinese side, there is a closed internet model under state control, with the state participating in and monitoring the large local multinationals in the sector and holding a global dominance over 5G network traffic. It is a surveillance state that has become the leading power in the deployment of artificial intelligence through video surveillance and facial recognition, and it has a very clear state policy on the deployment of artificial intelligence (AI), creating advantages to compete in this race.

The EU starts from an apparently disadvantageous position. This is not at all due to a lack of talent or capability: much of the Internet and IT ecosystem has been developed in Europe or by European talent. However, our market has not been able to generate the conditions for the emergence of major technological champions capable of supporting the entire value chain, from cloud infrastructures to the availability of the large volumes of data that feed this ecosystem. Moreover, the EU has adopted an ethical, political and legal commitment to freedoms, equity and democracy. This position, which has operated as a kind of barrier in terms of costs and processes, embodies the essential requirements for a democratic, inclusive and liberty-guaranteeing digital transformation.

The Data Governance Act

The legal substratum of data sharing is a complex modular structure comprising the General Data Protection Regulation (GDPR), the Open Data and Re-use of Public Sector Information Directive, the Data Governance Act (DGA), the Data Act (DA) and, in the immediate future, the Artificial Intelligence Act and the European Health Data Space Regulation (EHDS). These rules should facilitate the re-use of data, including data covered by data protection, intellectual property and business secrecy. For this to be possible, several factors must operate together, as set out below:

  1. Data sharing from government should grow exponentially and generate a data market that is currently monopolised by foreign companies.
  2. Digital sovereignty in legal terms will also be a growth driver insofar as it defines market rules based on the philosophy of the European Union centred on the guarantee of fundamental rights. This should have an immediate consequence when defining processes aimed at producing safe and reliable products.
  3. Digital sovereignty will in turn have important technological consequences. Public data spaces, whether promoted from digital hubs or federations of nodes, such as Gaia X, should make data available to the individual researcher or start-up, including application dashboards and technical support.
  4. The result of the regulation is to accelerate and increase the possibilities for freeing and sharing data. The EU and the agreement under discussion seek to release data subject to trade secrecy, intellectual property or, in particular, the protection of personal data, in a secure manner through intermediation processes in secure data environments. This matter has occupied, among others, the Spanish Data Protection Agency and the European Union Agency for Cybersecurity (ENISA). It implies a commitment to anonymisation and/or quasi-anonymisation environments through technologies such as differential privacy, homomorphic encryption or multi-party computation (a minimal sketch of one such technique follows below).
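
For illustration, the sketch below applies the Laplace mechanism of differential privacy, one of the techniques mentioned in point 4, to a simple count query. The data and the epsilon value are invented, and nothing here describes the tools the signatory institutions actually use.

```python
# Minimal sketch of the Laplace mechanism of differential privacy applied to a
# count query. The epsilon value is an arbitrary assumption; real deployments
# need a careful privacy-budget analysis.
import random

def dp_count(records: list, predicate, epsilon: float = 1.0) -> float:
    """Return a noisy count of records matching the predicate.
    The sensitivity of a count query is 1, so the noise scale is 1/epsilon."""
    true_count = sum(1 for r in records if predicate(r))
    # Difference of two exponentials with rate epsilon follows a Laplace(0, 1/epsilon).
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise

if __name__ == "__main__":
    incomes = [18_000, 22_500, 31_000, 47_000, 52_000, 75_000]
    noisy = dp_count(incomes, lambda x: x > 30_000, epsilon=0.5)
    print(f"noisy count of incomes above 30,000: {noisy:.1f}")
```

The smaller the epsilon, the more noise is added and the stronger the privacy guarantee, at the cost of less precise answers.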

All of this rests on the guarantee of fundamental rights and the empowerment of people. The GDPR, DGA, DA and EHDS should make it possible to achieve the dual objective of creating a European market for the free movement and re-use of protected data, while ensuring that individuals and organisations can exercise their rights of control, share data under those rights and engage in data altruism. Moreover, the GDPR, DGA, EHDS and the Artificial Intelligence Act define precise limits through prohibitions on use, regulated access conditions and ethically and legally sound design procedures. One idea should be considered central: there is a dimension of public or common interest that, beyond the epic battles of COVID, reaches the small but essential aspirations of the individual researcher, the disruptive entrepreneur, the SME trying to improve its value chain or the Administration innovating processes at the service of people.

Spain commits to the digital transformation of data spaces

The 2025 Plan, the Artificial Intelligence Strategy, the efforts of the Next Generation funds through the Strategic Projects for Economic Recovery and Transformation (PERTE, in its Spanish acronym), the AI Missions and the Digital Bill of Rights exemplify Spain's alignment and leadership in this field. To make these strategies viable, secure data and processing environments are essential. The National Health Data Space has now been joined by the agreement between the INE, the AEAT, different Social Security bodies, the SEPE and the Bank of Spain. As its explanatory memorandum states, it constitutes a first and encouraging step towards the deployment of the DGA in our country.

The signatory institutions understand not only the scientific and business value of the statistical information they handle, but also the significant growth in the demand and need for it. They also take on a qualitatively relevant issue: the value derived from interconnecting datasets. They therefore declare their willingness to maximise the added value of their data by allowing cross-referencing or integration when research is carried out for scientific purposes in the public interest.

The keys to the agreement to provide statistical data to researchers for scientific purposes in the public interest

Some of the questions that may arise with regard to this agreement are answered below.

  • How can the data be accessed?

Access to the data requires a request for cross-referenced information that must be individually approved by each institution. The assessment takes into account criteria regarding the nature of the data and the interest of the proposal.

Facilitating this access requires the signatory institutions to undertake de-identification and cross-referencing work, carried out by each of them directly or through trusted third parties. The result, "depending on the security level of the resulting file", will entail either:

  • direct and autonomous access.
  • processing of the data in one of the secure rooms or centres made available by the signatory entities.

Several secure rooms and centres are already available for this purpose.

Also noteworthy is the creation of ES_DataLab, which facilitates access to microdata in an environment that guarantees the confidentiality of the information. It allows cross-referencing data from different participating institutions, such as the INE, the AEAT, the Secretary of State for Social Security and Pensions, the Social Security General Treasury (TGSS), the National Social Security Institute (INSS), the Social Marine Institute (ISM), the Social Security IT Management (GISS), the State Public Employment Service and the Bank of Spain.

In implementation of the DGA's provisions, the Single National Information Point (NSIP), managed by the General Directorate of Data, has been set up; there, citizens, businesses and researchers can locate information on protected public sector data. It is available through datos.gob.es.

  • What data is shared?

The volume and typologies of data they handle are truly significant. The press release presenting the agreement stated that it would be possible to access "the microdata bases owned by the INE, the AEAT, the SS and the BE, with the necessary guarantees of security, statistical secrecy, personal data protection and compliance with current legislation. In addition to statistical databases from its surveys, INE may also provide access to administrative registers, both those compiled or coordinated by INE and those under other ownership but which INE uses to compile its statistics (in the latter case consulting all requests for access to the holders of the corresponding registers)".

  • Who can access the data?

In order to grant access to the data, the confidentiality regime applicable to the data requested and its legal framework, the social interest of the results to be obtained in the research, the profile, trajectory and scientific publications of the principal investigator and associated researchers or the history of research projects of the entity backing the project, among other aspects, shall be taken into account.

One of the issues envisaged by the DGA in this area consists of establishing economic considerations that ensure the sustainability of the system. In any case, the third clause of the agreement provides for the possibility of receiving financial consideration from applicants for the services of preparing and making available the data contained in the databases owned by them, in accordance with the provisions of statistical legislation (Article 21.3 of the Law 12/1989 of 9 May 1989 on the Public Statistical Function - LFEP) and in the regulations governing each institution.

  • What challenges do data access requesters and signatories face?

Regardless of the scientific merits of the research proposal, it is essential to call on the institutions deploying these projects to significantly raise the quality of their data protection and information security compliance processes. But this will not be enough: the deployment of artificial intelligence requires additional processes, such as those set out in the document of the Conference of Rectors of Spanish Universities (CRUE) ICT 360º, addressed to universities in 2023 for their adoption. While it is true that the Artificial Intelligence Act proposes a scenario of lighter regulation for basic research, it also requires a high level of ethical deployment. To this end, it will be essential to apply principles of artificial intelligence ethics, using the ALTAI model (Assessment List for Trustworthy Artificial Intelligence) or an alternative, and to carry out a Fundamental Rights Impact Assessment (FRAIA), without neglecting the demanding legal requirements for the development of market-oriented systems. Beyond the formal declarations of the agreement, the lessons learned from European projects point to the need for a procedural framework of evidence-based legal and ethical verification of research projects and of the capacities of the institutions requesting access to data.

From the point of view of the signatory institutions, in addition to the challenge of the economic sustainability of the model, foreseen and regulated in the agreement, the need for a regulatory investment strategy seems evident. We have no doubt that each data repository and the processes underpinning it have been subject to a data protection impact assessment and to security methodologies linked to the National Security Scheme. Data protection by design and by default, and compliance with the recommendations on anonymisation and data space management mentioned above, will be further elements to consider. This translates into processes, but also into people - chief data officers, data analysts, and other mediators such as data protection officers - together with a high level of security requirements. In addition, the duty of transparency vis-à-vis citizens will require efficient channels and a very precise risk management model in the event of a possible mass exercise of the right to object to processing, without prejudice to its feasibility.

Finally, the Spanish Data Protection Agency should approach this process in a proactive and promotional way without renouncing its role as guarantor of fundamental rights, but contributing to the development of functional solutions. This is not just any agreement but an essential test bed for the future of data research in Spain.

In our opinion, the most exciting statement of these institutions consists of understanding the agreement "as the embryo of the future System of access to data for research for scientific purposes of public interest, which must be in accordance with the Spanish and European strategy on data and the legislation on its governance, within a framework of development of public sector data spaces, and respecting in any case the autonomy and the legal regime applicable to the Banco de España".


Content prepared by Ricard Martínez, Director of the Chair of Privacy and Digital Transformation. Professor, Department of Constitutional Law, Universitat de València. The contents and points of view reflected in this publication are the sole responsibility of its author.
