Interview

In this podcast we talk about transport and mobility data, a topic that is very present in our day-to-day lives. Every time we check an app to find out how long a bus will take to arrive, we are taking advantage of open data linked to transport. In the same way, when an administration carries out urban planning or optimises traffic flows, it makes use of mobility data.

To delve into the challenges and opportunities behind the opening of this type of data by Spanish public administrations, we have two exceptional guests:

  • Tania Gullón Muñoz-Repiso, director of the Division of Transport Studies and Technology of the Ministry of Transport and Sustainable Mobility. Welcome, Tania!
  • Alicia González Jiménez, deputy director in the General Subdirectorate of Cartography and Observation of the Territory of the National Geographic Institute.

Listen to the full episode here (in Spanish)

Summary of the interview

  1. Both the IGN and the Ministry generate a large amount of data related to transport. Of all of them, can you tell us which data and services are made available to the public as open data?

Alicia González: On the part of the National Geographic Institute, I would say everything: everything we produce is available to users. Since the end of 2015, the dissemination policy adopted by the General Directorate of the National Geographic Institute, through its autonomous body the National Center for Geographic Information (CNIG), which is where all products and services are distributed, has been an open data policy, so everything is distributed under the CC BY 4.0 license, which protects free and open use. You simply have to make an attribution, a mention of the origin of the data. So we are talking, in general, not only about transport but about all kinds of data: more than 100 products representing more than two and a half million files that users are increasingly demanding. In fact, in 2024 we had up to 20 million files downloaded, so it is in high demand. Specifically in terms of transport networks, the fundamental dataset is the Geographic Reference Information of Transport Networks (IGR-RT). It is a multimodal geospatial dataset composed of five transport networks that are continuous throughout the national territory and also interconnected. Specifically, it comprises:

1. The road network, which includes every road regardless of its owner and runs throughout the territory. There are more than 300 thousand kilometers of road, which are also connected to all the street maps, to the urban road network of all population centers. That is, we have a road graph that forms the backbone of the entire territory, in addition to having connected the roads that are later distributed and disseminated in the National Topographic Map.

2. The second most important network is the rail transport network. It includes all the data on rail transport, as well as metro, tram and other rail-based modes.

3 and 4. In the maritime and air domains, the networks are limited to infrastructures: they contain all the ports on the Spanish coast and, on the air side, all the aerodrome, airport and heliport infrastructures.

5. And finally, the last network, which is much more modest and holds residual data: cable transport.

Everything is interconnected through intermodal relationships. It is a set of data that is generated from official sources. We cannot incorporate just any data, it must always be official data and it is generated within the framework of cooperation of the National Cartographic System.

As a dataset that complies with the INSPIRE Directive, both in its definition and in the way it is disseminated through standard web services, it has also been classified as a high-value dataset in the mobility category, in accordance with the Implementing Regulation on high-value datasets. It is a fairly important and standardized dataset.

How can it be located and accessed? Precisely because it is standard, it is catalogued in the SDI (Spatial Data Infrastructure) catalogue, thanks to the standard description of its metadata. It can also be located through the official INSPIRE (Infrastructure for Spatial Information in Europe) data and services catalogue, and it is accessible through portals as relevant as the open data portal.

Once we have located it, how can the user access it? How can they see the data? There are several ways. The easiest is to consult the viewer: all the data is displayed there and there are certain query tools to facilitate its use. And then, of course, through the CNIG download centre, where we publish all the data from all the networks and which is in great demand. And the last way is to consult the standard web services that we generate: visualization and download services based on different technologies. In other words, it is a dataset that is available to users for reuse.

Tania Gullón: In the Ministry we also share a lot of open data. I would like, in order not to take too long, to comment in particular on four large sets of data:

1. The first would be the OTLE, the Observatory of Transport and Logistics in Spain, which is an initiative of the Ministry of Transport whose main objective is to provide a global and comprehensive vision of the situation of transport and logistics in Spain. It is organized into seven blocks: mobility, socio-economy, infrastructure, security, sustainability, metropolitan transport and logistics. These are not georeferenced data, but statistical data. The Observatory makes data, graphs, maps and indicators available to the public and, beyond that, also offers annual reports, monographs, conferences, etc. And the same applies to the cross-border observatories, which are run collaboratively with Portugal and France.

2. The second set of data I want to mention is the NAP, the National Multimodal Transport Access Point, which is an official digital platform managed by the Ministry of Transport but developed collaboratively between the different administrations. Its objective is to centralise and publish all the digitised information on the passenger transport offer in the national territory, for all modes of transport. What do we have here? All the schedules, services, routes and stops of all transport services: road transport, urban, intercity, rural, on-demand and discretionary buses. There are 116 datasets. Also rail transport: the schedules of all those trains, their stops, etc. And maritime and air transport as well. This data is constantly updated. To date we only have static data in GTFS (General Transit Feed Specification) format, which can be reused and is a standard format useful for the further development of mobility applications by reusers. And while the NAP initially focused on static data, such as those routes, schedules and stops, progress is being made toward incorporating dynamic data as well. In fact, as of December we have an obligation under European regulations to provide this data in real time in order, ultimately, to improve transport planning and the user experience.

3. The third dataset is Hermes. It is the geographic information system of the transport network of general interest. What is its objective? To offer a comprehensive vision, in this case georeferenced. Here I want to refer to what my colleague Alicia has commented, so that you can see how we all collaborate with each other. We are not inventing anything: everything is projected onto those road axes of the IGR-RT, the Geographic Reference Information of Transport Networks, and what is done is to add all the technical parameters, as added value, to have a complete, comprehensive, multimodal information system for roads, railways, ports, airports, railway terminals and also waterways. It is a GIS (Geographic Information System) that allows all this analysis, not only downloading and consulting through the open web services that we put at the service of citizens, but also through an open data catalog built with CKAN, which I will comment on later. In the end there are more than 300 parameters that can be consulted. What are we talking about? For each section of road, the average traffic intensity, the average speed, the capacity of the infrastructures; planned actions are also known (not only the network in service, but also the planned network, the actions that the Ministry plans to carry out), the ownership of the road, lengths, speeds, accidents... well, many parameters: modes of access, co-financed projects, alternative fuels, the trans-European transport network, etc. That is the third of the datasets.

4. The fourth set is perhaps the largest, because it amounts to 16 GB per day. This is the project we call Big Data Mobility. It is a pioneering initiative that uses Big Data and artificial intelligence technologies to analyze in depth the mobility patterns in the country. It is mainly based on the analysis of the anonymized mobile phone records of the population to obtain detailed information on all the movements of people, not individualized but aggregated at the census district level. Since 2020, a daily mobility study has been carried out and all this data is released openly: mobility by hour and by origin/destination, which allows us to monitor and evaluate the demand for transport in order to plan improvements in infrastructures and services. In addition, as the data is provided openly, it can be used for any purpose: for tourism, for research...

  2. How is this data generated and collected? What challenges do you have to face in this process and how do you solve them?

Alicia González: Specifically, in the field of products that are generated in geographic information system environments and geospatial databases, these are, in the end, projects whose fundamental basis is the capture of data and the integration of existing reference sources. When we see that the owner of an infrastructure has a piece of information, that is the one that must be integrated. In summary, the main technical works could be identified as follows:

  • On the one hand, capture, that is, when we want to store a geographical object we have to digitize it, draw it. Where? On an appropriate metric basis such as the aerial orthophotographs of the National Plan of Aerial Orthophotography (PNOA), which is also another dataset that is available and open. Well, when we have, for example, to draw or digitize a road, we trace it on that aerial image that PNOA provides us.
  • Once we have captured that geometric component, we have to provide it with attribution, and not just any data will do: they have to be official sources. So we have to locate who is the owner of that infrastructure, or who is the provider of the official data, to determine the attributes, the characterization that we want to give to that information, which in principle was only geometric. To do this, we carry out a series of source validation processes, to check that the source has no issues, followed by processes that we call integration, which are quite complex, to guarantee that the result meets what we want.
  • And finally, a fundamental phase in all these projects is the assurance of geometric and semantic quality. In other words, a series of quality controls must be developed and executed to validate the product, the final result of that integration and confirm that it meets the requirements indicated in the product specification.

In terms of challenges, a fundamental one is data governance: the result that is generated is fed from certain sources, but in the end a new result is created, so you have to define the role of each provider, which may later also be a user. Another challenge in this whole process is locating data providers. Sometimes the person responsible for the infrastructure or the object that we want to store in the database does not publish the information in a standardized way, or it is difficult to locate because it is not in a catalogue. Sometimes it is difficult to locate the official source you need to complete the geographical information. And looking a little at the user, I would highlight that another challenge is having the agility to identify, in a flexible and fast way, the use cases that are changing along with users, who keep making demands of us, because in the end it is about continuing to be relevant to society. Finally, and because the Geographic Institute is a scientific and technical environment and this part affects us a lot, another challenge is digital transformation: we are working on technological projects, so we also have to have a lot of capacity to manage change and adapt to new technologies.

Tania Gullón: Regarding how data is generated and collected and the challenges we face: the NAP, the National Access Point for Multimodal Transport, for example, is generated collaboratively, that is, the data comes from the autonomous communities themselves, from the consortia and from the transport companies. The challenge is that there are many autonomous communities that are not yet digitized, there are many companies... The digitalisation of the sector is going slowly; it is moving, but slowly. In the end there is incomplete data and duplicate data. Governance is not yet well defined. It happens to us that, imagine, the company ALSA uploads all its buses, but it has buses in all the autonomous communities. And if at the same time an autonomous community uploads its data, that data is duplicated. It's as simple as that. It is true that we are just starting and that governance is not yet well defined so that there is no excess data. Before data was missing and now there is almost too much.

In Hermes, the geographic information system, what is done, as I said, is to project everything onto the transport network information, which is the official one that Alicia mentioned, and to integrate data from the different infrastructure managers and administrators, such as Adif, Puertos del Estado, AENA, the General Directorate of Roads, ENAIRE, etc. What is the main challenge, if I had to highlight one, because we could talk about this for an hour? It has cost us a lot; we have been working on this project for seven years and it has cost a lot because, at first, people did not believe in it. They didn't think it was going to work and they didn't collaborate. In the end, all this means knocking on the door of Adif, of AENA, and changing that mindset: data cannot sit in a drawer, it must be put at the service of the common good. And I think that's what has cost us a little more. In addition, there is the issue of governance, which Alicia has already commented on. You go to ask for a piece of data and, within the organization itself, they do not know who the owner of that data is, because perhaps the traffic data is handled by different departments. And who owns it? All this is very important.

We have to say that Hermes has been the great promoter of data offices, such as the Adif data office. In the end they have realized that what they needed was to put their house in order, as in everyone's house, including the Ministry: data offices are needed.

In the Big Data project, how is the data generated? In this case it is completely different. It is a pioneering project, based on new technologies, in which data is generated from anonymized mobile phone records. By reconstructing all that large amount of Big Data, the records from every antenna in Spain, with artificial intelligence and a series of algorithms, these matrices are built. Then the data from that sample (in the end we have a sample of 30% of the population, more than 13 million mobile lines) is extrapolated with open data from the INE. And what else do we do? It is calibrated with external sources, that is, with reliable reference sources such as AENA ticketing, flights, Renfe data, etc. We calibrate this model to be able to generate these matrices with quality. The challenges: it is very experimental. To give you an idea, we are the only country that has all this data, so we have been breaking new ground and learning along the way. The difficulty is, again, the data: the data needed for calibration is hard for us to find and to obtain with a certain periodicity, because this runs in real time and we permanently need that flow of data. There is also the adaptation to the user, as Alicia has said: we must adapt to what society and the reusers of this Big Data are demanding. And we must also keep pace, as Alicia said, with technology, because the telephony data that exists now is not the same as two years ago. And the great challenge of quality control. But here I think I'll let Alicia, who is the real expert, explain to us what mechanisms exist to ensure that the data are reliable, up to date and comparable. And then I will give you my vision, if you like.

Alicia González: How can reliability, updating and comparability be guaranteed? I don't know if reliability can be guaranteed, but I think there are a couple of indicators that are especially relevant. One is the degree to which a dataset conforms to the regulations that concern it. In the field of geographic information, the way of working is always standardized: there is the ISO 19100 family of standards on Geographic Information/Geomatics, and the INSPIRE Directive itself, which greatly conditions the way of working and publishing data. And also, looking at the public administration, I think that the official seal should be a guarantee of reliability. In other words, when we process data we must do so in a homogeneous and unbiased way, while a private company may perhaps be conditioned by them. I believe these two parameters are important and can indicate reliability.

In terms of the degree of updating and the comparability of the data, I believe that the user deduces this information from the metadata. Metadata is, in the end, the cover letter for a dataset. So, if a dataset is correctly and truthfully described in its metadata, and if this is also done according to standard profiles (in the geospatial field, we are talking about the INSPIRE or GeoDCAT-AP profiles), then it is much easier to see whether different datasets are comparable, and the user can determine and decide whether a dataset ultimately satisfies their needs in terms of updating and comparability with another dataset.

Tania Gullón: Totally agree, Alicia. And if you allow me to add: in Big Data, for example, we have always been very committed to measuring quality, all the more so with new technologies whose results, at first, people did not trust. Always trying to measure this quality, which in this case is very difficult because they are large datasets, from the beginning we started designing processes that take time. The daily quality control process for the data takes seven hours, but it is true that at the beginning we had to detect whether an antenna had gone down, whether something had happened... Then we run a control with statistical parameters and other internal consistency checks, and what we detect here are the anomalies. What we are seeing is that 90% of the anomalies that come out are real mobility anomalies. In other words, there are no errors in the data; they are anomalies: there has been a demonstration or there has been a football match. These are issues that distort mobility. Or there has been a storm, or rain, or anything like that. And it is important not only to control that quality and see if there are anomalies; we also believe that it is very important to publish those quality criteria: how we are measuring quality and, above all, the results. Not only do we release the data on a daily basis, we also release this quality metadata that Alicia mentions: what the sample was like that day, the anomaly values that have been obtained. This is also published openly: not only the data, but the metadata. And then we also publish the anomalies and the reason for them. When errors are found we say "okay, there has been an anomaly because in the town of Casar (imagine, this covers all of Spain) it was the festival of the Casar cake". And that's it, the anomaly has been found and it is published.

And how do we measure another quality parameter, thematic accuracy? In this case, by comparing with trusted reference sources. We know that the evolution with respect to itself is already very controlled with that internal logical consistency, but we also have to compare it with what happens in the real world. I talked about it before with Alicia; we said, "the data is reliable, but what is the reality of mobility? Who knows it?" In the end we have some clues, such as ticketing data on how many people have boarded the buses. If we have that data, we have a clue, but for the people who walk and the people who take their cars and so on, what is the reality? It is very difficult to have a point of comparison, but we do compare it with all the data from AENA, Renfe and bus concessions, and all these controls are run to determine how far we deviate from the reality that we can know.

  3. All this data serves as a basis for developing applications and solutions, but it is also essential when it comes to making decisions and accelerating the implementation of the central axes, for example, the Safe, Sustainable and Connected Mobility Strategy or the Sustainable Mobility Bill. How is this data used to make these real decisions?

Tania Gullón: If you will allow me, I would first like to introduce this Strategy and the Bill for those who do not know them. Axis 5 of the Ministry's Safe, Sustainable and Connected Mobility Strategy 2030 is "Smart Mobility", and it is precisely focused on this: its main objective is to promote digitalisation, innovation and the use of advanced technologies to improve efficiency, sustainability and user experience in Spain's transport system. And precisely one of the measures of this axis is the "facilitation of Mobility as a Service, Open Data and New Technologies". In other words, this is where all the projects we are commenting on are framed. In fact, one sub-measure is to promote the publication of open mobility data, another is to carry out analyses of mobility flows, and the last is the creation of an integrated mobility data space. I would like to emphasize, and here I am already moving on to the Bill that we hope will soon be approved, that the Law, in Article 89, regulates the National Access Point, so we can see how it is included in this legislative instrument. And then the Law establishes a key digital instrument for the National Sustainable Mobility System: look at the importance given to data, that a mobility law states that this integrated mobility data space is a key digital instrument. This data space is a reliable data-sharing ecosystem, materialized as a digital infrastructure managed by the Ministry of Transport in coordination with SEDIA (the Secretary of State for Digitalization and Artificial Intelligence), whose objective is to centralize and structure the information on mobility generated by public administrations, transport operators, infrastructure managers, etc., and to guarantee open access to all this data for all administrations under the conditions set out in the regulations.

Alicia González: In this case, I want to say that any objective decision-making, of course, has to be based on data that, as we said before, has to be reliable, up to date and comparable. In this sense, it should be noted that the fundamental support the IGN offers to the Ministry for the deployment of the Safe, Sustainable and Connected Mobility Strategy is the provision of data services and complex analyses of geospatial information, many of them, of course, based on the transport networks dataset we have been talking about.

In this sense, we would like to mention as an example the accessibility maps with which we contribute to axis 1 of the Strategy, "Mobility for all". Through the Rural Mobility Table, the IGN was asked whether we could generate maps representing the cost, in time and distance, for any citizen living in any population centre to access the nearest transport infrastructure, starting with the road network. In other words, how much it costs a user, in terms of effort, time and distance, to access the nearest motorway or dual carriageway from their home and then, by extension, any road in the basic network. We did that analysis (that is why I said this network is continuous and forms the backbone of the entire territory) and we finally published those results via the web. They are also open data: any user can consult them and, in addition, we offer them not only numerically, but also represented in different types of maps. In the end, this geolocated visibility of the result provides fundamental value and, of course, facilitates strategic decision-making in terms of infrastructure planning.

Another example to highlight, which is possible thanks to the availability of open data, is the calculation of monitoring indicators for the Sustainable Development Goals of the 2030 Agenda. Currently, in collaboration with the National Institute of Statistics, we are working on the calculation of several of them, including one directly associated with transport, which seeks to monitor Goal 11: making cities more inclusive, safe, resilient and sustainable.

  4. Speaking of this data-based decision-making, there is also cooperation at the level of data generation and reuse between different public administrations. Can you tell us about any examples of a project?

Tania Gullón: Let me also come back to data-based decision-making, which I touched on before when talking about the Law. It can also be said that all this Big Data, Hermes and everything we have discussed is favouring this shift of the Ministry and other organisations towards data-driven organisations, which means that decisions are based on the analysis of objective data. When you ask for an example like that, I have so many that I wouldn't know what to tell you. In the case of the Big Data data, it has been used for infrastructure planning for a few years now. Before, it was done with surveys, and sizing was based on questions like: how many lanes do I put on a road? Or something very basic: what frequency do we need on a train? Well, if you don't have data on what the demand is going to be, you can't plan it. This is now done with Big Data data, not only by the Ministry but, as it is open, by all administrations, all city councils and all infrastructure managers. Knowing the mobility needs of the population allows us to adapt our infrastructures and our services to these real needs. For example, commuter services in Galicia are now being studied. Or imagine the undergrounding of the A-5. The data is also used for emergencies, which we have not commented on, but it is also key. We always realize that when there is an emergency, suddenly everyone thinks "data, where is the data, where is the open data?", because it has been fundamental. I can tell you, in the case of the DANA, which is perhaps the most recent: several commuter train lines were seriously affected, the tracks were destroyed, and 99% of the vehicles of the people who lived in Paiporta, in Torrent, in the entire affected area, were disabled. And the remaining 1% only because they were not in the DANA area at the time. So mobility had to be restored as soon as possible, and thanks to this open data, within a week there were buses providing alternative transport services that had been planned with Big Data data. In other words, look at the impact on the population.

Speaking of emergencies, this project was born precisely because of an emergency: COVID. In other words, this Big Data study was born in 2020 because the Presidency of the Government commissioned the daily monitoring of this mobility and its open publication. And here I link with that collaboration between administrations, organizations, companies and universities. Because, look, these mobility data fed the epidemiological models. Here we worked with the Carlos III Institute, with the Barcelona Supercomputing Center, with the institutes and research centers that were beginning to size hospital beds for the second wave. When we were still in the first wave, we didn't even know what a wave was, and they were already telling us "be careful, because there is going to be a second wave, and with this mobility data we will be able to measure how many beds are going to be needed, according to the epidemiological model". Look how important that reuse is. We know that this Big Data, for example, is being used by thousands of companies, administrations, research centers and researchers around the world. In addition, we receive inquiries from Germany, from all countries, because in Spain we are somewhat of a pioneer in releasing all this data openly. We are setting a precedent there, and not only for transport, but for tourism as well, for example.

Alicia González: In the field of geographic information, at the level of cooperation, we have a specific instrument, the National Cartographic System, which directly promotes coordination among the different administrations in terms of geographic information. We do not know how to work in any other way than by cooperating. And a clear example is the same dataset we have been talking about: the geographic reference information on transport networks is the result of this cooperation. That is to say, at the national level it is promoted and driven by the Geographic Institute, but the regional cartographic agencies, with different degrees of collaboration, also participate in its updating and production. It even reaches the point of co-production of data for certain subsets in certain areas. In addition, one of the characteristics of this product is that it is generated from official data from other sources. In other words, there is already collaboration there no matter what: there is cooperation because there is an integration of data, because in the end it has to be filled in with official data. To begin with, perhaps it is data provided by the INE, the Cadastre, the cartographic agencies themselves, the local street maps... But once the result has been formed, as I mentioned before, it has an added value that is of interest to the original supplier itself. For example, this dataset is reused internally, at home, in the IGN: any product or service that requires transport information is fed from this dataset. There is internal reuse there, but also reuse across public administrations, at all levels. In the state sector, for example, in the Cadastre, once the result has been generated, it is of interest to them for studies analysing the delimitation of the public domain associated with infrastructures. Or the Ministry itself, as Tania commented before: Hermes was generated from processing the RT data, the transport network data. The Directorate-General for Roads uses the transport networks in its internal management to produce its traffic map, manage its catalogue, etc. And in the autonomous communities themselves, the generated result is also useful to the cartographic agencies, and even at the local level. So there is a continuous, cyclical reuse, as it should be: in the end it is all public money and it has to be reused as much as possible. And in the private sphere it is also reused, and value-added services are generated from this data in multiple use cases. Not to go on too long, simply that: we participate by providing data on which value-added services are generated.

  5. And finally, could you briefly recap some ideas that highlight the impact on daily life and the commercial potential of this data for reusers?

Alicia González: Very briefly, I think that the fundamental impact on everyday life is that the distribution of open data has made it possible to democratize access to data for everyone: for companies, but also for citizens; and, above all, I think it has been fundamental in the academic field, where it is surely easier now to carry out certain research that in other times was more complex. And another impact on daily life is the institutional transparency that this implies. As for the commercial potential for reusers, I reiterate the previous idea: the availability of data drives innovation and the growth of value-added solutions. In this sense, looking at the report carried out in 2024 by ASEDIE, the Association of Infomedia Companies, on the impact that the geospatial data published by the CNIG had on the private sector, there were a couple of quite important conclusions. One of them said that every time a new dataset is released, reusers are incentivized to generate value-added solutions and, in addition, it allows them to focus their efforts on developing innovation and not so much on data capture. The report also made clear that since the adoption of the open data policy that I mentioned at the beginning, adopted in 2015 by the IGN, 75% of the companies surveyed responded that they had been able to significantly expand their catalogue of products and services based on this open data. So I believe that the impact is ultimately enriching for society as a whole.

Tania Gullón: I subscribe to all of Alicia's words, I totally agree. And I would add that small transport operators and municipalities with fewer resources have at their disposal all this open, free, quality data and access to digital tools that allow them to compete on equal terms. In the case of companies or municipalities, imagine being able to plan their transport and be more efficient: not only does it save them money, but in the end they win in the service they give to citizens. And of course, the fact that in the public sector decisions are made based on data, and that this data-sharing ecosystem is encouraged, favouring the development of mobility applications, for example, has a direct impact on people's daily lives. Or take the issue of transport aid: studying the impact of transport subsidies with accessibility data and so on. You study who the most vulnerable are and, in the end, what do you achieve? Policies that are increasingly fairer, and this obviously has an impact on the citizen. Decisions about how to invest everyone's money, our taxes, in infrastructure, aid or services should be based on objective data, not on intuitions but on real data. This is the most important thing.

Blog

Cities, infrastructures and the environment today generate a constant flow of data from sensors, transport networks, weather stations and Internet of Things (IoT) platforms, understood as networks of physical devices (digital traffic lights, air quality sensors, etc.) capable of measuring and transmitting information through digital systems. This growing volume of information makes it possible to improve the provision of public services, anticipate emergencies, plan the territory and respond to challenges associated with climate, mobility or resource management.

The increase in connected sources has transformed the nature of geospatial data. In contrast to traditional sets – updated periodically and oriented towards reference cartography or administrative inventories – dynamic data incorporate the temporal dimension as a structural component. An observation of air quality, a level of traffic occupancy or a hydrological measurement not only describes a phenomenon, but also places it at a specific time. The combination of space and time makes these observations fundamental elements for operating systems, predictive models and analyses based on time series.

In the field of open data, this type of information poses both opportunities and specific requirements. Opportunities include the possibility of building reusable digital services, facilitating near-real-time monitoring of urban and environmental phenomena, and fostering a reuse ecosystem based on continuous flows of interoperable data. The availability of up-to-date data also increases the capacity for evaluation and auditing of public policies, by allowing decisions to be contrasted with recent observations.

However, the opening of geospatial data in real time requires solving problems derived from technological heterogeneity. Sensor networks use different protocols, data models, and formats; the sources generate high volumes of observations with high frequency; and the absence of common semantic structures makes it difficult to cross-reference data between domains such as mobility, environment, energy or hydrology. In order for this data to be published and reused consistently, an interoperability framework is needed that standardizes the description of observed phenomena, the structure of time series, and access interfaces.

The open standards of the Open Geospatial Consortium (OGC) provide that framework. They define how to represent observations, dynamic entities, multitemporal coverages or sensor systems; they establish APIs based on web principles that facilitate the consultation of open data; and they allow different platforms to exchange information without the need for specific integrations. Their adoption reduces technological fragmentation, improves coherence between sources and favours the creation of public services based on up-to-date data.

Interoperability: The basic requirement for opening dynamic data

Public administrations today manage data generated by sensors of different types, heterogeneous platforms, different suppliers and systems that evolve independently. The publication of geospatial data in real time requires interoperability that allows information from multiple sources to be integrated, processed and reused. This diversity causes inconsistencies in formats, structures, vocabularies and protocols, which makes it difficult to open the data and for third parties to reuse it. Let's see which aspects of interoperability are affected:

  • Technical interoperability: refers to the ability of systems to exchange data using compatible interfaces, formats and models. In real-time data, this exchange requires mechanisms that allow for fast queries, frequent updates, and stable data structures. Without these elements, each flow would rely on ad hoc integrations, increasing complexity and reducing reusability.
  • Semantic interoperability: dynamic data describe phenomena that change over short periods – traffic levels, weather parameters, flows, atmospheric emissions – and must be interpreted consistently. This implies having observation models, vocabularies and common definitions that allow different applications to understand the meaning of each measurement and its units, capture conditions or constraints. Without this semantic layer, the opening of data in real time generates ambiguity and limits its integration with data from other domains.
  • Structural interoperability: Real-time data streams tend to be continuous and voluminous, making it necessary to represent them as time series or sets of observations with consistent attributes. The absence of standardized structures complicates the publication of complete data, fragments information and prevents efficient queries. To provide open access to these data, it is necessary to adopt models that adequately represent the relationship between observed phenomenon, time of observation, associated geometry and measurement conditions.
  • Interoperability in access via API: it is an essential condition for open data. APIs must be stable, documented, and based on public specifications that allow for reproducible queries. In the case of dynamic data, this layer guarantees that the flows can be consumed by external applications, analysis platforms, mapping tools or monitoring systems that operate in contexts other than the one that generates the data. Without interoperable APIs, real-time data is limited to internal uses.

Together, these levels of interoperability determine whether dynamic geospatial data can be published as open data without creating technical barriers.

OGC Standards for Publishing Real-Time Geospatial Data

The publication of georeferenced data in real time requires mechanisms that allow any user – administration, company, citizens or research community – to access them easily, with open formats and through stable interfaces. The Open Geospatial Consortium (OGC) develops a set of standards that enable exactly this: to describe, organize and expose spatial data in an interoperable and accessible way, which contributes to the openness of dynamic data.

What is OGC and why are its standards relevant?

The OGC is an international organization that defines common rules so that different systems can understand, exchange and use geospatial data without depending on specific technologies. These rules are published as open standards, which means that any person or institution can use them. In the realm of real-time data, these standards make it possible to:

  • Represent what a sensor measures (e.g., temperature or traffic).
  • Indicate where and when the observation was made.
  • Structure time series.
  • Expose data through open APIs.
  • Connect IoT devices and networks with public platforms.

Together, this ecosystem of standards allows geospatial data – including data generated in real time – to be published and reused following a consistent framework. Each standard covers a specific part of the data cycle: from the definition of observations and sensors, to the way data is exposed using open APIs or web services. This modular organization makes it easier for administrations and organizations to select the components they need, avoiding technological dependencies and ensuring that data can be integrated between different platforms.

The OGC API family: Modern APIs for accessing open data

Within OGC, the newest line is the OGC API family, a set of modern web interfaces designed to facilitate access to geospatial data using URLs and formats such as JSON or GeoJSON, common in the open data ecosystem.

These APIs make it possible to do the following (a minimal query sketch is included after this list):

  • Get only the part of the data that matters.
  • Perform spatial searches ("give me only what's in this area").
  • Access up-to-date data without the need for specialized software.
  • Easily integrate them into web or mobile applications.
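
As an illustration, the following Python sketch shows what such a request could look like against a hypothetical OGC API – Features endpoint; the base URL and collection name are placeholders, not a real service, while the /collections/{id}/items path and the bbox, datetime and limit parameters are defined by the standard.

```python
import requests

# Hypothetical OGC API - Features endpoint (placeholder, not a real service)
BASE_URL = "https://example.org/ogcapi"
COLLECTION = "traffic-sensors"

# Request only the features inside a bounding box and a time window,
# returned as GeoJSON (the usual encoding in OGC API - Features).
params = {
    "bbox": "-3.80,40.30,-3.60,40.50",  # lon/lat box around an urban area
    "datetime": "2024-05-01T00:00:00Z/2024-05-01T23:59:59Z",
    "limit": 100,
}

response = requests.get(f"{BASE_URL}/collections/{COLLECTION}/items", params=params)
response.raise_for_status()

for feature in response.json()["features"]:
    # Each feature carries a geometry plus thematic properties
    print(feature.get("id"), feature["geometry"]["type"], feature["properties"])
```

The same request works, unchanged, against any server that implements the standard, which is precisely what makes the data reusable without ad hoc integrations.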

In the report "How to use OGC APIs to boost geospatial data interoperability" we already told you about some of the most popular OGC APIs. While that report focuses on how to use OGC APIs for practical interoperability, this post expands the focus by explaining the underlying OGC data models—such as O&M, SensorML, or Moving Features—that underpin that interoperability.

On this basis, this post focuses on the standards that make this fluid exchange of information possible, especially in open data and real-time contexts. The most important standards in the context of real-time open data are:

OGC API – Features
Description: an open web interface that allows access to datasets made up of "entities" with geometry, such as sensors, vehicles, stations or incidents. It uses simple formats such as JSON and GeoJSON and allows spatial and temporal queries. It is useful for publishing data that is frequently updated, such as urban mobility or dynamic inventories.
What it allows you to do: query features with geometry; filter by time or space; get data in JSON/GeoJSON.
Primary use in open data: open publication of dynamic mobility data, urban inventories, static sensors.

OGC API – Environmental Data Retrieval (EDR)
Description: it provides a simple method for retrieving environmental and meteorological observations. It allows data to be requested at a point, an area or a time range, and is particularly suitable for weather stations, air quality or climate models. It facilitates open access to time series and predictions.
What it allows you to do: request environmental observations at a point, zone or time interval.
Primary use in open data: open data on meteorology, climate, air quality or hydrology.

OGC SensorThings API
Description: the most widely used standard for open IoT data. It defines a uniform model for sensors, what they measure and the observations they produce. It is designed to handle large volumes of data in real time and offers a clear way to publish time series of pollution, noise, hydrology, energy or lighting data.
What it allows you to do: manage sensors and their time series; transmit large volumes of IoT data.
Primary use in open data: publication of urban sensors (air, noise, water, energy) in real time.

OGC API – Connected Systems
Description: it allows sensor systems to be described in an open and structured way: what devices exist, how they are connected to each other, in what infrastructure they are integrated and what kind of measurements they generate. It complements the SensorThings API in that it does not focus on observations, but on the physical and logical network of sensors.
What it allows you to do: describe networks of sensors, devices and associated infrastructures.
Primary use in open data: document the structure of municipal IoT systems as open data.

OGC Moving Features
Description: a model to represent objects that move, such as vehicles, boats or people, through space-time trajectories. It allows mobility, navigation or logistics data to be published in formats consistent with open data principles.
What it allows you to do: represent moving objects using space-time trajectories.
Primary use in open data: open mobility data (vehicles, transport, boats).

WMS-T
Description: an extension of the classic WMS standard that adds the time dimension. It allows you to view maps that change over time, for example, hourly weather, flood levels or regularly updated images.
What it allows you to do: view maps that change over time.
Primary use in open data: publication of multi-temporal weather or environmental maps.

Table 1. OGC Standards Relevant to Real-Time Geospatial Data

Models that structure observations and dynamic data

In addition to APIs, OGC defines several conceptual data models that allow you to consistently describe observations, sensors, and phenomena that change over time:

  • O&M (Observations & Measurements): a model that defines the essential elements of an observation—measured phenomenon, instant, unit, and result—and serves as the semantic basis for sensor and time series data.
  • SensorML: Language that describes the technical and operational characteristics of a sensor, including its location, calibration, and observation process.
  • Moving Features: a model that allows mobile objects, such as vehicles, boats or fauna, to be represented by means of space-time trajectories.

These models make it easy for different data sources to be interpreted uniformly and combined in analytics and applications.
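
To make this more tangible, the sketch below represents, as a plain Python dictionary, the core elements that an O&M-style observation conveys; the field names are simplified for readability and are not the normative O&M encoding.

```python
# Illustrative, simplified sketch of an O&M-style observation.
# Field names are informal; the normative encodings (XML/JSON) differ.
observation = {
    "observed_property": "air_temperature",     # the measured phenomenon
    "phenomenon_time": "2024-05-01T10:00:00Z",  # when the phenomenon occurred
    "result": 21.4,                             # the measured value
    "unit_of_measure": "degC",                  # unit of the result
    "feature_of_interest": {                    # where it was observed
        "type": "Point",
        "coordinates": [-3.70, 40.42],          # longitude, latitude
    },
    "procedure": "station-042/thermometer-1",   # sensor or procedure that produced it
}

# A time series is simply an ordered collection of such observations that share
# the same observed property, procedure and feature of interest.
```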

The value of these standards for open data

Using OGC standards makes it easier to open dynamic data because:

  • They provide common models that reduce heterogeneity between sources.
  • They facilitate integration between domains (mobility, climate, hydrology).
  • They avoid dependencies on proprietary technology.
  • They allow the data to be reused in analytics, applications, or public services.
  • They improve transparency by documenting sensors, methods, and frequencies.
  • They ensure that data can be consumed directly by common tools.

Together, they form a conceptual and technical infrastructure that allows real-time geospatial data to be published as open data, without the need to develop system-specific solutions.

Real-time open geospatial data use cases

Real-time georeferenced data is already published as open data in different sectoral areas. These examples show how different administrations and bodies apply open standards and APIs to make dynamic data related to mobility, environment, hydrology and meteorology available to the public.

Below are several domains where Public Administrations already publish dynamic geospatial data using OGC standards.

Mobility and transport

Mobility systems generate data continuously: availability of shared vehicles, near-real-time positions, counting sensors in cycle lanes, traffic counts or the status of traffic-light intersections. These observations rely on distributed sensors and require data models capable of representing rapid variations in space and time.

OGC standards play a central role in this area. In particular, the OGC SensorThings API allows you to structure and publish observations from urban sensors using a uniform model – including devices, measurements, time series and relationships between them – accessible through an open API. This makes it easier for different operators and municipalities to publish mobility data in an interoperable way, reducing fragmentation between platforms.

The use of OGC standards in mobility not only guarantees technical compatibility, but also makes it possible for this data to be reused together with environmental, cartographic or climate information, generating multi-thematic analyses for urban planning, sustainability or operational transport management.

Example:

The Toronto Bike Share open service, which publishes the status of its bike stations and vehicle availability in SensorThings API format.

Here each station is a sensor and each observation indicates the number of bicycles available at a specific time. This approach allows analysts, developers or researchers to integrate this data directly into urban mobility models, demand prediction systems or citizen dashboards without the need for specific adaptations.
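
To illustrate how such a service is consumed, the Python sketch below queries a hypothetical SensorThings API endpoint (the base URL is a placeholder); the entities (Things, Datastreams, Observations) and the OData-style query parameters are part of the standard.

```python
import requests

# Hypothetical SensorThings API endpoint (placeholder, not a real service)
BASE_URL = "https://example.org/sta/v1.1"

# 1. List a few "Things" (e.g. bike stations) together with their datastreams.
things = requests.get(
    f"{BASE_URL}/Things",
    params={"$top": 5, "$expand": "Datastreams"},
).json()["value"]

for thing in things:
    print(thing["name"])
    for datastream in thing.get("Datastreams", []):
        # 2. For each datastream, fetch the most recent observation.
        observations = requests.get(
            f"{BASE_URL}/Datastreams({datastream['@iot.id']})/Observations",
            params={"$orderby": "phenomenonTime desc", "$top": 1},
        ).json()["value"]
        if observations:
            latest = observations[0]
            print("  ", datastream["name"], "->", latest["result"],
                  "at", latest["phenomenonTime"])
```

Because the entity model is the same everywhere, the same client code can read bike availability in one city and air quality in another simply by changing the base URL.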

Air quality, noise and urban sensors

Networks for monitoring air quality, noise or urban environmental conditions depend on automatic sensors that record measurements every few minutes. In order for this data to be integrated into analytics systems and published as open data, consistent models and APIs need to be available.

In this context, services based on OGC standards make it possible to publish data from fixed stations or distributed sensors in an interoperable way. Although many administrations use traditional interfaces such as OGC WMS to serve this data, the underlying structure is usually supported by observation models derived from the Observations & Measurements (O&M) family, which defines how to represent a measured phenomenon, its unit and the moment of observation.

Example:

The Defra UK-AIR Sensor Observation Service, which provides access to near-real-time air quality measurement data from in-situ stations in the UK.

The combination of O&M for data structure and open APIs for publication makes it easier for these urban sensors to be part of broader ecosystems that integrate mobility, meteorology or energy, enabling advanced urban analyses or environmental dashboards in near real-time.

Water cycle, hydrology and risk management

Hydrological systems generate crucial data for risk management: river levels and flows, rainfall, soil moisture or information from hydrometeorological stations. Interoperability is especially important in this domain, as this data is combined with hydraulic models, weather forecasting, and flood zone mapping.

To facilitate open access to time series and hydrological observations, several agencies use OGC API – Environmental Data Retrieval (EDR), an API designed to retrieve environmental data using simple queries at points, areas, or time intervals.

Example:

The USGS (United States Geological Survey), which documents the use of OGC API – EDR to access precipitation, temperature, or hydrological variable series.

This case shows how EDR allows you to request specific observations by location or date, returning only the values needed for analysis. While the USGS's specific hydrology data is served through its own API, this case demonstrates how EDR fits into the hydrometeorological data structure and how it is applied in real operational flows.
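
The sketch below illustrates that query pattern against a hypothetical OGC API – EDR service; the base URL, collection and parameter names are placeholders, while the position query and the coords, datetime and parameter-name parameters are defined by the EDR specification.

```python
import requests

# Hypothetical OGC API - EDR endpoint (placeholder, not a real service)
BASE_URL = "https://example.org/edr"
COLLECTION = "river-observations"

params = {
    "coords": "POINT(-0.37 39.47)",  # well-known-text point: longitude latitude
    "datetime": "2024-10-28T00:00:00Z/2024-10-30T00:00:00Z",  # requested interval
    "parameter-name": "waterLevel,discharge",  # parameters exposed by the collection
    "f": "CoverageJSON",  # a common response encoding for EDR services
}

response = requests.get(f"{BASE_URL}/collections/{COLLECTION}/position", params=params)
response.raise_for_status()

coverage = response.json()
# CoverageJSON groups the returned values per parameter under "ranges".
for name, parameter_range in coverage.get("ranges", {}).items():
    print(name, "->", len(parameter_range.get("values", [])), "time steps")
```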

The use of OGC standards in this area allows dynamic hydrological data to be integrated with flood zones, orthoimages or climate models, creating a solid basis for early warning systems, hydraulic planning and risk assessment.

Weather observation and forecasting

Meteorology is one of the domains with the highest production of dynamic data: automatic stations, radars, numerical prediction models, satellite observations and high-frequency atmospheric products. To publish this information as open data, the OGC API family  is becoming a key element, especially through OGC API – EDR, which allows observations or predictions to be retrieved in specific locations and at different time levels.

Example: 

The NOAA OGC API – EDR service, which provides access to weather data and atmospheric variables from the National Weather Service (United States).

This API allows data to be consulted at points, areas or trajectories, facilitating the integration of meteorological observations into external applications, models or services based on open data.

The use of OGC API in meteorology allows data from sensors, models, and satellites to be consumed through a unified interface, making it easy to reuse for forecasting, atmospheric analysis, decision support systems, and climate applications.

Best Practices for Publishing Open Geospatial Data in Real-Time 

The publication of dynamic geospatial data requires adopting practices that ensure its accessibility, interoperability, and sustainability. Unlike static data, real-time streams have additional requirements related to the quality of observations, API stability, and documentation of the update process. Here are some best practices for governments and organizations that manage this type of data.

  • Stable open formats and APIs: The use of OGC standards – such as OGC API, SensorThings API or EDR – makes it easy for data to be consumed from multiple tools without the need for specific adaptations. APIs must be stable over time, offer well-defined versions, and avoid dependencies on proprietary technologies. For raster data or dynamic models, OGC services such as WMS, WMTS, or WCS are still suitable for visualization and programmatic access.
  • Metadata compliant with DCAT-AP and OGC models: catalogue interoperability requires describing datasets using profiles such as DCAT-AP, supplemented by geospatial and observational metadata based on O&M or SensorML. This metadata should document the nature of the sensor, the unit of measurement, the sampling rate, and possible limitations of the data (a simplified example follows this list).
  • Quality, update frequency and traceability policies: dynamic datasets must explicitly indicate their update frequency, the origin of the observations, the validation mechanisms applied and the conditions under which they were generated. Traceability is essential for third parties to correctly interpret data, reproduce analyses and integrate observations from different sources.
  • Documentation, usage limits, and service sustainability: Documentation should include usage examples, query parameters, response structure, and recommendations for managing data volume. It is important to set reasonable query limits to ensure the stability of the service and ensure that management can maintain the API over the long term.
  • Licensing aspects for dynamic data: The license must be explicit and compatible with reuse, such as CC BY 4.0 or CC0. This allows dynamic data to be integrated into third-party services, mobile applications, predictive models or services of public interest without unnecessary restrictions. Consistency in the license also facilitates the cross-referencing of data from different sources.
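
By way of illustration, and referring back to the metadata practice above, the sketch below shows, as a simplified JSON-LD-style Python dictionary, the kind of fields a DCAT-AP record for a dynamic dataset could carry; property choices and values are illustrative, not a normative profile.

```python
# Simplified, illustrative DCAT-AP-style record for a dynamic dataset.
# Property names follow DCAT / Dublin Core terms; values are placeholders.
dataset_record = {
    "@type": "dcat:Dataset",
    "dct:title": "Urban air quality observations (near real time)",
    "dct:description": "Hourly observations from fixed air quality stations.",
    "dct:license": "http://creativecommons.org/licenses/by/4.0/",
    "dct:accrualPeriodicity": "hourly",  # in practice, a URI from the EU frequency vocabulary
    "dct:temporal": {"dcat:startDate": "2024-01-01"},  # ongoing series: no end date
    "dcat:distribution": [
        {
            "@type": "dcat:Distribution",
            "dct:format": "GeoJSON",
            "dcat:accessURL": "https://example.org/ogcapi/collections/air-quality/items",
        },
        {
            "@type": "dcat:Distribution",
            "dct:description": "OGC SensorThings API access point",
            "dcat:accessURL": "https://example.org/sta/v1.1",
        },
    ],
}
```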

These practices allow dynamic data to be published in a way that is reliable, accessible, and useful to the entire reuse community.

Dynamic geospatial data has become a structural piece for understanding urban, environmental and climatic phenomena. Its publication through open standards allows this information to be integrated into public services, technical analyses and reusable applications without the need for additional development. The convergence of observation models, OGC APIs, and best practices in metadata and licensing provides a stable framework for administrations and reusers to work with sensor data reliably. Consolidating this approach will allow progress towards a more coherent, connected public data ecosystem that is prepared for increasingly demanding uses in mobility, energy, risk management and territorial planning.

Content created by Mayte Toscano, Senior Consultant in Technologies related to the data economy. The content and views expressed in this publication are the sole responsibility of the author.

Blog

We live in an age where more and more phenomena in the physical world can be observed, measured, and analyzed in real time. The temperature of a crop, the air quality of a city, the state of a dam, the flow of traffic or the energy consumption of a building are no longer data that are occasionally reviewed: they are continuous flows of information that are generated second by second.

This revolution would not be possible without cyber-physical systems (CPS), a technology that integrates sensors, algorithms and actuators to connect the physical world with the digital world. But CPS does not only generate data: it can also be fed by open data, multiplying its usefulness and enabling evidence-based decisions.

In this article, we will explore what CPS is, how it generates massive data in real time, what challenges it poses to turn that data into useful public information, what principles are essential to ensure its quality and traceability, and what real-world examples demonstrate the potential for its reuse. We will close with a reflection on the impact of this combination on innovation, citizen science and the design of smarter public policies.

What are cyber-physical systems?

A cyber-physical system is a tight integration between digital components – such as software, algorithms, communication and storage – and physical components – sensors, actuators, IoT devices or industrial machines. Its main function is to observe the environment, process information and act on it.

Unlike traditional monitoring systems, a CPS is not limited to measuring: it closes a complete loop between perception, decision, and action. This cycle can be understood through three main elements:


Figure 1. Cyber-physical systems cycle. Source: own elaboration

An everyday example that illustrates this complete cycle of perception, decision and action very well is smart irrigation, which is increasingly present in precision agriculture and home gardening systems. In this case, sensors distributed throughout the terrain continuously measure soil moisture, ambient temperature, and even solar radiation. All this information flows to the computing unit, which analyzes the data, compares it with previously defined thresholds or with more complex models – for example, those that estimate the evaporation of water or the water needs of each type of plant – and determines whether irrigation is really necessary.

When the system concludes that the soil has reached a critical level of dryness, the third element of the CPS comes into play: the actuators. These are the components that open the valves, activate the water pump or regulate the flow rate, and they do so for exactly the time needed to return moisture to optimal levels. If conditions change—if it starts raining, if the temperature drops, or if the soil recovers moisture faster than expected—the system itself adjusts its behavior accordingly.

This whole process happens without human intervention, autonomously. The result is a more sustainable use of water, better cared for plants and a real-time adaptability that is only possible thanks to the integration of sensors, algorithms and actuators characteristic of cyber-physical systems.
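The perception–decision–action cycle just described can be summarised in a short control loop. The sketch below is a simulation only: the sensor reading and the valve call stand in for the hardware drivers a real CPS would use, and the moisture thresholds are illustrative values.

```python
import random
import time

DRY_THRESHOLD = 0.25      # assumed soil moisture below which irrigation starts
TARGET_MOISTURE = 0.35    # assumed level at which irrigation stops

def read_soil_moisture() -> float:
    """Perception: a real CPS would query a field sensor here."""
    return random.uniform(0.10, 0.50)   # simulated reading

def set_valve(open_valve: bool) -> None:
    """Action: a real CPS would drive an actuator here."""
    print("valve", "OPEN" if open_valve else "CLOSED")

def control_loop(iterations: int = 5) -> None:
    """Decision: compare observations with thresholds and act accordingly."""
    for _ in range(iterations):
        moisture = read_soil_moisture()
        if moisture < DRY_THRESHOLD:
            set_valve(True)       # soil too dry: start irrigation
        elif moisture >= TARGET_MOISTURE:
            set_valve(False)      # target reached: stop irrigation
        print(f"soil moisture: {moisture:.2f}")
        time.sleep(1)             # sampling period for the demo

if __name__ == "__main__":
    control_loop()
```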

CPS as real-time data factories

One of the most relevant characteristics of cyber-physical systems is their ability to generate data continuously, massively and with a very high temporal resolution. This constant production can be seen in many day-to-day situations:

  • A hydrological station can record level and flow every minute.

  • An urban mobility sensor can generate hundreds of readings per second.

  • A smart meter records electricity consumption every few minutes.

  • An agricultural sensor measures humidity, salinity, and solar radiation several times a day.

  • A mapping drone captures decimetric GPS positions in real time.

Beyond these specific examples, the important thing is to understand what this capability means for the system as a whole: CPS become true data factories, and in many cases come to function as digital twins of the physical environment they monitor. This almost instantaneous equivalence between the real state of a river, a crop, a road or an industrial machine and its digital representation allows us to have an extremely accurate and up-to-date portrait of the physical world, practically at the same time as the phenomena occur.

This wealth of data opens up a huge field of opportunity when published as open information. Data from CPS can drive innovative services developed by companies, fuel high-impact scientific research, empower citizen science initiatives that complement institutional data, and strengthen transparency and accountability in the management of public resources.

However, for all this value to really reach citizens and the reuse community, it is necessary to overcome a series of technical, organisational and quality challenges that determine the final usefulness of open data. Below, we look at what those challenges are and why they are so important in an ecosystem that is increasingly reliant on real-time generated information.

The challenge: from raw data to useful public information

Just because a CPS generates data does not mean that it can be published directly as open data. Before reaching the public and reuse companies, the information needs prior preparation, validation, filtering and documentation. Administrations must ensure that such data is understandable, interoperable and reliable. And along the way, several challenges appear.

One of the first is standardization. Each manufacturer, sensor and system can use different formats, different sample rates or its own structures. If these differences are not harmonized, what we obtain is a mosaic that is difficult to integrate. For data to be interoperable, common models, homogeneous units, coherent structures, and shared standards are needed. Regulations such as INSPIRE or the OGC (Open Geospatial Consortium) and IoT-TS standards are key so that data generated in one city can be understood, without additional transformation, in another administration or by any reuser.

The next big challenge is quality. Sensors can fail, get stuck always reporting the same value, generate physically impossible readings, suffer electromagnetic interference or remain poorly calibrated for weeks without anyone noticing. If this information is published as is, without a prior review and cleaning process, the open data loses value and can even lead to errors. Validation – with automatic checks and periodic review – is therefore indispensable.
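As an illustration of what such automatic checks can look like, the snippet below applies two very common validations to a series of readings: a physically plausible range and the detection of a "frozen" sensor. The temperature range and window length are illustrative assumptions, not official thresholds.

```python
from typing import List

MIN_VALID, MAX_VALID = -40.0, 60.0   # assumed plausible range for outdoor temperature (°C)
FROZEN_WINDOW = 6                    # identical consecutive readings that raise a flag

def out_of_range(readings: List[float]) -> List[int]:
    """Return indices of readings outside the plausible physical range."""
    return [i for i, v in enumerate(readings) if not MIN_VALID <= v <= MAX_VALID]

def frozen_sensor(readings: List[float]) -> bool:
    """Flag a sensor that keeps reporting exactly the same value."""
    if len(readings) < FROZEN_WINDOW:
        return False
    return len(set(readings[-FROZEN_WINDOW:])) == 1

readings = [18.2, 18.4, 200.0, 18.4, 18.4, 18.4, 18.4, 18.4, 18.4]
print("out-of-range indices:", out_of_range(readings))   # -> [2]
print("possibly frozen:", frozen_sensor(readings))        # -> True
```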

Another critical point is contextualization. An isolated piece of information is meaningless. A "12.5" says nothing if we don't know if it's degrees, liters or decibels. A measurement of "125 ppm" is useless if we do not know what substance is being measured. Even something as seemingly objective as coordinates needs a specific frame of reference. And any environmental or physical data can only be properly interpreted if it is accompanied by the date, time, exact location and conditions of capture. This is all part of metadata, which is essential for third parties to be able to reuse information unambiguously.

It's also critical to address privacy and security. Some CPS can capture information that, directly or indirectly, could be linked to sensitive people, property, or infrastructure. Before publishing the data, it is necessary to apply anonymization processes, aggregation techniques, security controls and impact assessments that guarantee that the open data does not compromise rights or expose critical information.

Finally, there are operational challenges such as refresh rate and robustness of data flow. Although CPS generates information in real time, it is not always appropriate to publish it with the same granularity: sometimes it is necessary to aggregate it, validate temporal consistency or correct values before sharing it. Similarly, for data to be useful in technical analysis or in public services, it must arrive without prolonged interruptions or duplication, which requires a stable infrastructure and monitoring mechanisms.
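A common way to handle granularity before publication is temporal aggregation. The sketch below, built on pandas, resamples a simulated one-minute series into ten-minute averages and keeps the gap left by a sensor outage visible as missing values instead of silently dropping it; the interval and data are purely illustrative.

```python
import numpy as np
import pandas as pd

# Simulated one-minute readings with a deliberate ten-minute outage
index = pd.date_range("2025-01-01 00:00", periods=60, freq="min")
series = pd.Series(np.random.normal(loc=20.0, scale=0.5, size=60), index=index)
series.loc["2025-01-01 00:20":"2025-01-01 00:29"] = np.nan   # sensor outage

# Aggregate to 10-minute means; the outage interval stays as NaN
aggregated = series.resample("10min").mean()
print(aggregated)
```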

Quality and traceability principles needed for reliable open data

Once these challenges have been overcome, the publication of data from cyber-physical systems must be based on a series of principles of quality and traceability. Without them, information loses value and, above all, loses trust.

The first is accuracy. The data must faithfully represent the phenomenon it measures. This requires properly calibrated sensors, regular checks, removal of clearly erroneous values, and checking that readings are within physically possible ranges. A sensor that reads 200°C at a weather station or a meter that records the same consumption for 48 hours are signs of a problem that needs to be detected before publication.

The second principle is completeness. A dataset should indicate when there are missing values, time gaps, or periods when a sensor has been disconnected. Hiding these gaps can lead to wrong conclusions, especially in scientific analyses or in predictive models that depend on the continuity of the time series.

The third key element is traceability, i.e. the ability to reconstruct the history of the data. Knowing which sensor generated it, where it is installed, what transformations it has undergone, when it was captured or if it went through a cleaning process allows us to evaluate its quality and reliability. Without traceability, trust erodes and data loses value as evidence.

Proper updating is another fundamental principle. The frequency with which information is published must be adapted to the phenomenon measured. Air pollution levels may need updates every few minutes; urban traffic, every second; hydrology, every minute or every hour depending on the type of station; and meteorological data, at variable frequencies. Publishing too frequently can generate noise; publishing too slowly can render the data useless for certain purposes.

The last principle is that of rich metadata. Metadata explains the data: what it measures, how it is measured, with what unit, how accurate the sensor is, what its operating range is, where it is located, what limitations the measurement has and what this information is generated for. They are not a footnote, but the piece that allows any reuser to understand the context and reliability of the dataset. With good documentation, reuse isn't just possible: it skyrockets.

Examples: CPS that reuse public data to become smarter

In addition to generating data, many cyber-physical systems also consume public data to improve their performance. This feedback makes open data a central resource for the functioning of smart territories. When a CPS integrates information from its own sensors with external open sources, its anticipation, efficiency, and accuracy capabilities are dramatically increased.

Precision agriculture: In agriculture, sensors installed in the field allow variables such as soil moisture, temperature or solar radiation to be measured. However, smart irrigation systems do not rely solely on this local information: they also incorporate weather forecasts from AEMET, open IGN maps  on slope or soil types, and climate models published as public data. By combining their own measurements with these external sources, agricultural CPS can determine much more accurately which areas of the land need water, when to plant, and how much moisture should be maintained in each crop. This fine management allows water and fertilizer savings that, in some cases, exceed 30%.

Water management: Something similar happens in water management. A cyber-physical system that controls a dam or irrigation canal needs to know not only what is happening at that moment, but also what may happen in the coming hours or days. For this reason, it integrates its own level sensors with open data on river gauging, rain and snow predictions, and even public information on ecological flows. With this expanded vision, the CPS can anticipate floods, optimize the release of the reservoir, respond better to extreme events or plan irrigation sustainably. In practice, the combination of proprietary and open data translates into safer and more efficient water management.

Impact: innovation, citizen science, and data-driven decisions

The union between cyber-physical systems and open data generates a multiplier effect that is manifested in different areas.

  • Business innovation: Companies have fertile ground to develop solutions based on reliable and real-time information. From open data and CPS measurements, smarter mobility applications, water management platforms, energy analysis tools, or predictive systems for agriculture can emerge. Access to public data lowers barriers to entry and allows services to be created without the need for expensive  private datasets, accelerating innovation and the emergence of new business models.

  • Citizen science: the combination of CPS and open data also strengthens social participation. Neighbourhood communities, associations or environmental groups can deploy low-cost sensors to complement public data and better understand what is happening in their environment. This gives rise to initiatives that measure noise in school zones, monitor pollution levels in specific neighbourhoods, follow the evolution of biodiversity or build collaborative maps that enrich official information.

  • Better public decision-making: finally, public managers benefit from this strengthened data ecosystem. The availability of reliable and up-to-date measurements makes it possible to design low-emission zones, plan urban transport more effectively, optimise irrigation networks, manage drought or flood situations or regulate energy policies based on real indicators. Without open data that complements and contextualizes the information generated by the CPS, these decisions would be less transparent and, above all, less defensible to the public.

In short, cyber-physical systems have become an essential piece for understanding and managing the world around us. Thanks to them, we can measure phenomena in real time, anticipate changes and act in a precise and automated way. But their true potential unfolds when their data is integrated into a quality open data ecosystem, capable of providing context, enriching decisions and multiplying uses.

The combination of CPS and open data allows us to move towards smarter territories, more efficient public services and more informed citizen participation. It provides economic value, drives innovation, facilitates research and improves decision-making in areas as diverse as mobility, water, energy or agriculture.

For all this to be possible, it is essential to guarantee the quality, traceability and standardization of the published data, as well as to protect privacy and ensure the robustness of information flows. When these foundations are well established, CPS not only measure the world: they help it improve, becoming a solid bridge between physical reality and shared knowledge.

Content created by Dr. Fernando Gualo, Professor at UCLM and Government and Data Quality Consultant. The content and views expressed in this publication are the sole responsibility of the author.

Blog

Quantum computing promises to solve problems in hours that would take millennia for the world's most powerful supercomputers. From designing new drugs to optimizing more sustainable energy grids, this technology will radically transform our ability to address humanity's most complex challenges. However, its true democratizing potential will only be realized through convergence with open data, allowing researchers, companies, and governments around the world to access both quantum computing power in the cloud and the  public datasets needed to train and validate quantum algorithms.

Trying to explain quantum theory has always been a challenge, even for the most brilliant minds humanity has produced in the last two centuries. The celebrated physicist Richard Feynman (1918-1988) put it with his trademark humor:

"There was a time when newspapers said that only twelve men understood the theory of relativity. I don't think it was ever like that [...] On the other hand, I think I can safely say that no one understands quantum mechanics." 

And that was said by one of the most brilliant physicists of the twentieth century, a Nobel Prize winner and one of the fathers of quantum electrodynamics. Quantum behavior is so strange to human eyes that even Albert Einstein, in a letter written to Max Born in 1926, coined his now mythical phrase "God does not play dice with the universe", in reference to his disbelief in the probabilistic, non-deterministic properties attributed to quantum behavior. To which Niels Bohr – another titan of twentieth-century physics – replied: "Einstein, stop telling God what to do."

Classical computing

If we want to understand why quantum mechanics proposes a revolution in computer science, we have to understand its fundamental differences from classical mechanics and, therefore, from classical computing. Almost all of us have heard of bits of information at some point in our lives. Humans have developed a way of performing complex mathematical calculations by reducing all information to bits – the fundamental units of information a machine knows how to work with – the famous zeros and ones (0 and 1). With two simple values, we have been able to model our entire mathematical world. And why, some will ask, base 2 and not 5 or 7? Because in our classical physical world (the one we live in day to day) differentiating between 0 and 1 is relatively simple: on and off, as in an electrical switch, or north and south magnetization, as in a magnetic hard drive. For a binary world, we have developed an entire coding language based on two states: 0 and 1.

Quantum computing

In quantum computing, instead of bits, we use qubits. Qubits exploit several "strange" properties of quantum mechanics that allow them to represent infinitely many states between the zero and one of classical bits. To understand it, it is as if a bit could only represent a light bulb that is on or off, while a qubit can represent every intensity of the bulb's light. This property is known as "quantum superposition" and allows a quantum computer to explore millions of possible solutions at the same time. And that is not all: if quantum superposition seems strange, wait until you see quantum entanglement. Thanks to this property, two "entangled" particles (or two qubits) are connected "at a distance" so that the state of one determines the state of the other. With these two properties we have qubits of information that can represent infinitely many states and are connected to each other, a system with a potentially exponentially greater computing capacity than our computers based on classical computing.
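For readers curious about what superposition and entanglement look like in code, the sketch below builds a two-qubit Bell state with Qiskit, a widely used open-source quantum SDK. It only constructs and prints the circuit; executing it would require a simulator or access to real hardware, for example through the cloud platforms mentioned later in this article.

```python
# Minimal sketch of superposition and entanglement: a two-qubit Bell state.
from qiskit import QuantumCircuit

qc = QuantumCircuit(2, 2)
qc.h(0)                      # Hadamard gate: puts qubit 0 in a superposition of 0 and 1
qc.cx(0, 1)                  # CNOT gate: entangles qubit 1 with qubit 0
qc.measure([0, 1], [0, 1])   # measurement collapses both qubits to correlated values

print(qc.draw())             # text diagram of the circuit
```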

Two application cases of quantum computing

1. Drug discovery and personalized medicine. Quantum computers can simulate complex molecular interactions that are impossible to compute with classical computing. For example, protein folding – fundamental to understanding diseases such as Alzheimer's – requires analyzing trillions of possible configurations. A quantum computer could cut years of research down to weeks, speeding up the development of new drugs and personalized treatments based on each patient's genetic profile.

2. Logistics optimization and climate change. Companies like Volkswagen already use quantum computing to optimize traffic routes in real time. On a larger scale, these systems could revolutionize the energy management of entire cities, optimizing smart grids that integrate renewables efficiently or designing new materials for CO2 capture that help combat climate change.

A recommended read for a more complete review of quantum computing is available here.

The role of open data (and computing resources)

The democratization of access to quantum computing will depend crucially on two pillars: open computing resources and  quality public datasets. This combination is creating an ecosystem where quantum innovation no longer requires millions of dollars in infrastructure. Here are some options available for each of these pillars.

  1. Free access to real quantum hardware:
  • IBM Quantum Platform: Provides free monthly access to quantum systems of more than 100 qubits for anyone in the world. With more than 400,000 registered users who have generated more than 2,800 scientific publications, it demonstrates how open access accelerates research. Any researcher can sign up for the platform and start experimenting in minutes.
  • Open Quantum Institute (OQI): launched at CERN (the European Organization for Nuclear Research) in 2024, it goes further, providing not only access to quantum computing but also mentoring and educational resources for underserved regions. Its hackathon program in 2025 includes events in Lebanon, the United Arab Emirates, and other countries, specifically designed to mitigate the quantum digital divide.
  2. Public datasets for the development of quantum algorithms:
  • QDataSet: Offers 52 public datasets with simulations of one- and two-qubit quantum systems, freely available for training  quantum machine learning (ML) algorithms. Researchers without resources to generate their own simulation data can access its repository on GitHub and start developing algorithms immediately.
  • ClimSim: This is a public climate-related modeling dataset that is already being used to demonstrate the first quantum ML algorithms applied to climate change. It allows any team, regardless of their budget, to work on real climate problems using quantum computing.
  • PennyLane Datasets: an open collection of molecules, quantum circuits, and physical systems that allows pharmaceutical startups without the resources for expensive simulations to experiment with quantum-assisted drug discovery.

Real cases of inclusive innovation

The possibilities offered by the use of open data to quantum computing have been evident in various use cases, the result of specific research and calls for grants, such as:

Current State of Quantum Computing

We are in the NISQ (Noisy Intermediate-Scale Quantum) era, a term coined by physicist John Preskill in 2018, which describes quantum computers with 50-100 physical qubits. These systems are powerful enough to perform certain calculations beyond classical capabilities, but they suffer from decoherence and frequent errors that make them unviable for market applications.

IBM, Google, and startups like IonQ offer cloud access to their quantum systems, with IBM providing public access through the IBM Quantum Platform since 2016, being one of the first publicly accessible quantum processors connected to the cloud.

In 2019, Google achieved "quantum supremacy" with its 53-qubit Sycamore processor, which performed in about 200 seconds a calculation that would have taken a state-of-the-art classical supercomputer around 10,000 years.

The latest independent analyses suggest that practical quantum applications may emerge around 2035-2040, assuming continued exponential growth in quantum hardware capabilities. IBM has committed to delivering a large-scale fault-tolerant quantum computer, IBM Quantum Starling, by 2029, with the goal of running quantum circuits comprising 100 million quantum gates on 200  logical qubits.


The Global Race for Quantum Leadership

International competition for dominance in quantum technologies has triggered an unprecedented wave of investment. According to McKinsey, up to 2022 the officially recognized level of public investment in China (15,300 million dollars) exceeded that of the European Union (7,200 million dollars), the United States (1,900 million dollars) and Japan (1,800 million dollars) combined.

At the national level, the UK has committed £2.5 billion over ten years to its National Quantum Strategy to make the country a global hub for quantum computing, and Germany has made one of the largest strategic investments in quantum computing, allocating €3 billion under its economic stimulus plan.

Investment in the first quarter of 2025 shows explosive growth: quantum computing companies raised more than $1.25 billion, more than double the previous year, an increase of 128%, reflecting a growing confidence that this technology is approaching commercial relevance.

To close this section, we recommend a fantastic short interview with Ignacio Cirac, one of the "Spanish fathers" of quantum computing.

Quantum Spain Initiative

In the case of Spain, 60 million euros have been invested in Quantum Spain, coordinated by the Barcelona Supercomputing Center. The project includes:

  • Installation of the first quantum computer in southern Europe.
  • Network of 25 research nodes distributed throughout the country.
  • Training of quantum talent in Spanish universities.
  • Collaboration with the business sector for real-world use cases.

This initiative positions Spain as a quantum hub in southern Europe, crucial for not being technologically dependent on other powers.

In addition, Spain's Quantum Technologies Strategy has recently been presented with an investment of 800 million euros. This strategy is structured into 4 strategic objectives and 7 priority actions.

Strategic objectives:

  • Strengthen R+D+I to promote the transfer of knowledge and facilitate research reaching the market.
  • Create a Spanish quantum market, promoting the growth and emergence of quantum companies and their ability to access capital and meet demand.
  • Prepare society for disruptive change, promoting security and reflection on a new digital right, post-quantum privacy.
  • Consolidate the quantum ecosystem in a way that drives a vision of the country.

Priority actions:

  • Priority 1: To promote Spanish companies in quantum technologies.
  • Priority 2: Develop algorithms and technological convergence between AI and Quantum.
  • Priority 3: Position Spain as a benchmark in quantum communications.
  • Priority 4: Demonstrate the impact of quantum sensing and metrology.
  • Priority 5: Ensure the privacy and confidentiality of information in the post-quantum world.
  • Priority 6: Strengthening capacities: infrastructure, research and talent.
  • Priority 7: Develop a solid, coordinated and leading Spanish quantum ecosystem in the EU.

In short, quantum computing and open data represent a major technological evolution that affects the way we generate and apply knowledge. If we can build a truly inclusive ecosystem—where access to quantum hardware, public datasets, and specialized training is within anyone's reach—we will open the door to a new era of collaborative innovation with a major global impact.

Content created by Alejandro Alija, expert in Digital Transformation and Innovation. The content and views expressed in this publication are the sole responsibility of the author.

Noticia

The European open data portal has published the third volume of its Use Case Observatory, a report that compiles the evolution of data reuse projects across Europe. This initiative highlights the progress made in four areas: economic, governmental, social and environmental impact.

The closure of a three-year investigation

Between 2022 and 2025, the European Open Data Portal has systematically monitored the evolution of various European projects. The research began with an initial selection of 30 representative initiatives, which were analyzed in depth to identify their potential for impact.

After two years, 13 projects continued in the study, including three Spanish ones: Planttes, Tangible Data and UniversiDATA-Lab. Their development over time was studied to understand how the reuse of open data can generate real and sustainable benefits.

The publication of volume III in October 2025 marks the closure of this series of reports, following volume I (2022) and volume II (2024). This last document offers a longitudinal view, showing how the projects have matured in three years of observation and what concrete impacts they have generated in their respective contexts.

Common conclusions

This third and final report compiles a number of key findings:

Economic impact

Open data drives growth and efficiency across industries. They contribute to job creation, both directly and indirectly, facilitate smarter recruitment processes and stimulate innovation in areas such as urban planning and digital services.

The report shows the example of:

  •  Naar Jobs (Belgium): an application for job search close to users' homes and focused on the available transport options.

This application demonstrates how open data can become a driver for regional employment and business development.

Government impact

The opening of data strengthens transparency, accountability and citizen participation.

Two use cases analysed belong to this field:

Both examples show how access to public information empowers citizens, enriches the work of the media, and supports evidence-based policymaking. All of this helps to strengthen democratic processes and trust in institutions.

Social impact

Open data promotes inclusion, collaboration, and well-being.

The following initiatives analysed belong to this field:

  • UniversiDATA-Lab (Spain): university data repository that facilitates analytical applications.
  • VisImE-360 (Italy): a tool to map visual impairment and guide health resources.
  • Tangible Data (Spain): a company focused on making physical sculptures that turn data into accessible experiences.
  • EU Twinnings (Netherlands): platform that compares European regions to find "twin cities".
  • Open Food Facts (France): collaborative database on food products.
  • Integreat (Germany): application that centralizes public information to support the integration of migrants.

All of them show how data-driven solutions can amplify the voice of vulnerable groups, improve health outcomes and open up new educational opportunities. Even the smallest effects, such as improvement in a single person's life, can prove significant and long-lasting.

Environmental impact

Open data acts as a powerful enabler of sustainability.

As with social impact, in this area we find a large number of use cases:

  • Digital Forest Dryads (Estonia): a project that uses data to monitor forests and promote their conservation.
  • Air Quality in Cyprus (Cyprus): platform that reports on air quality and supports environmental policies.
  • Planttes (Spain): citizen science app that helps people with pollen allergies by tracking plant phenology.
  • Environ-Mate (Ireland): a tool that promotes sustainable habits and ecological awareness.

These initiatives highlight how data reuse contributes to raising awareness, driving behavioural change and enabling targeted interventions to protect ecosystems and strengthen climate resilience.

Volume III also points to common challenges: the need for sustainable financing, the importance of combining institutional data with citizen-generated data, and the desirability of involving end-users throughout the project lifecycle. In addition, it underlines the importance of European collaboration and transnational interoperability to scale impact.

Overall, the report reinforces the relevance of continuing to invest in open data ecosystems as a key tool to address societal challenges and promote inclusive transformation.

The impact of Spanish projects on the reuse of open data

As we have mentioned, three of the use cases analysed in the Use Case Observatory have a Spanish stamp. These initiatives stand out for their ability to combine technological innovation with social and environmental impact, and highlight Spain's relevance within the European open data ecosystem. Their trajectory demonstrates how our country actively contributes to transforming data into solutions that improve people's lives and reinforce sustainability and inclusion. Below, we zoom in on what the report says about them.

Planttes

This citizen science initiative helps people with pollen allergies through real-time information about allergenic plants in bloom. Since its appearance in Volume I of the Use Case Observatory it has evolved as a participatory platform in which users contribute photos and phenological data to create a personalized risk map. This participatory model has made it possible to maintain a constant flow of information validated by researchers and to offer increasingly complete maps. With more than 1,000 initial downloads and about 65,000 annual visitors to its website, it is a useful tool for people with allergies, educators and researchers.

The project has strengthened its digital presence, with increasing visibility thanks to the support of institutions such as the Autonomous University of Barcelona and the University of Granada, in addition to the promotion carried out by the company Thigis.

Its challenges include expanding geographical coverage beyond Catalonia and Granada and sustaining data participation and validation. Therefore, looking to the future, it seeks to extend its territorial reach, strengthen collaboration with schools and communities, integrate more data in real time and improve its predictive capabilities.

Throughout this time, Planttes has established itself as an example of how citizen-driven science can improve public health and environmental awareness, demonstrating the value of citizen science in environmental education, allergy management, and climate change monitoring.

Tangible data

The project transforms datasets into physical sculptures that represent global challenges such as climate change or poverty, integrating QR codes and NFC to contextualize the information. Recognized at the EU Open Data Days 2025, Tangible Data has inaugurated its installation Tangible climate at the National Museum of Natural Sciences in Madrid.

Tangible Data has evolved in three years from a prototype project based on 3D sculptures to visualize sustainability data to become an educational and cultural platform that connects open data with society. Volume III of the Use Case Observatory reflects its expansion into schools and museums, the creation of an educational program for 15-year-old students, and the development of interactive experiences with artificial intelligence, consolidating its commitment to accessibility and social impact.

Its challenges include funding and scaling up the education programme, while its future goals include scaling up school activities, displaying large-format sculptures in public spaces,  and strengthening collaboration with artists and museums. Overall, it remains true to its mission of making data tangible, inclusive, and actionable.

UniversiDATA-Lab

UniversiDATA-Lab is a dynamic repository of analytical applications based on open data from Spanish universities, created in 2020 as a public-private collaboration and currently made up of six institutions. Its unified infrastructure facilitates the publication and reuse of data in standardized formats, reducing barriers and allowing students, researchers, companies and citizens to access useful information for education, research and decision-making.

Over the past three years, the project has grown from a prototype to a consolidated platform, with active applications such as the budget and retirement viewer, and a hiring viewer in beta. In addition, it organizes a periodic datathon that promotes innovation and projects with social impact.

Its challenges include internal resistance at some universities and the complex anonymization of sensitive data, although it has responded with robust protocols and a focus on transparency. Looking to the future, it seeks to expand its catalogue, add new universities and launch applications on emerging issues such as school dropouts, teacher diversity or sustainability, aspiring to become a European benchmark in the reuse of open data in higher education.

Conclusion

In conclusion, the third volume of the Use Case Observatory confirms that open data has established itself as a key tool to boost innovation, transparency and sustainability in Europe. The projects analysed – and in particular the Spanish initiatives Planttes, Tangible Data and UniversiDATA-Lab – demonstrate that the reuse of public information can translate into concrete benefits for citizens, education, research and the environment.

Blog

In any data management environment (companies, public administration, consortia, research projects), having data is not enough: if you don't know what data you have, where it is, what it means, who maintains it, with what quality, when it changed or how it relates to other data, then the value is very limited. Metadata —data about data—is essential for:

  • Visibility and access: allowing users to find out what data exists and how it can be accessed.

  • Contextualization: knowing what the data means (definitions, units, semantics).

  • Traceability/lineage: Understanding where data comes from and how it has been transformed.

  • Governance and control: knowing who is responsible, what policies apply, permissions, versions, obsolescence.

  • Quality, integrity, and consistency: Ensuring data reliability through rules, metrics, and monitoring.

  • Interoperability:  ensuring that different systems or domains can share data, using a common vocabulary, shared definitions, and explicit relationships.

In short, metadata is the lever that turns "siloed" data into a governed information ecosystem. As data grows in volume, diversity, and velocity, its function goes beyond simple description: metadata adds context, allows data to be interpreted, and makes  it findable, accessible, interoperable, and reusable (FAIR).

In the new context driven by artificial intelligence, this metadata layer becomes even more relevant, as it provides the provenance information needed to ensure traceability, reliability, and reproducibility of results. For this reason, some recent frameworks extend these principles to FAIR-R, where the additional "R" highlights the importance of data being AI-ready, i.e. that it meets a series of technical, structural and quality requirements that optimize its use by artificial intelligence algorithms.

Thus, we are talking about enriched metadata, capable of connecting technical, semantic and contextual information to enhance machine learning, interoperability between domains and the generation of verifiable knowledge.

From traditional metadata to "rich metadata"

Traditional metadata

In the context of this article, when we talk about metadata with a traditional use, we think of catalogs, dictionaries, glossaries, database data models, and rigid structures (tables and columns). The most common types of metadata are:

  • Technical metadata: column type, length, format, foreign keys, indexes, physical locations.

  • Business/Semantic Metadata: Field Name, Description, Value Domain, Business Rules, Business Glossary Terms.

  • Operational/execution metadata: refresh rate, last load, processing times, usage statistics.

  • Quality metadata: percentage of null values, duplicates, validations.

  • Security/access metadata: access policies, permissions, sensitivity rating.

  • Lineage metadata: Transformation tracing in data pipelines .

This metadata is usually stored in repositories or cataloguing tools, often with tabular structures or relational bases, with predefined links.

Why "rich metadata"?

Rich metadata is a layer that not only describes attributes, but also:

  • They discover and infer implicit relationships, identifying links that are not expressly defined in data schemas. This allows, for example, to recognize that two variables with different names in different systems actually represent the same concept ("altitude" and "elevation"), or that certain attributes maintain a hierarchical relationship ("municipality" belongs to "province").
  • They facilitate semantic queries and automated reasoning, allowing users and machines to explore relationships and patterns that are not explicitly defined in databases. Rather than simply looking for exact matches of names or structures, rich metadata allows you to ask questions based on meaning and context. For example, automatically identifying all datasets related to "coastal cities" even if the term does not appear verbatim in the metadata.
  • They adapt and evolve flexibly, as they can be extended with new entity types, relationships, or domains without the need to redesign the entire catalog structure. This allows new data sources, models or standards to be easily incorporated, ensuring the long-term sustainability of the system.
  • They incorporate automation into tasks that were previously manual or repetitive, such as duplication detection, automatic matching of equivalent concepts, or semantic enrichment using machine learning. They can also identify inconsistencies or anomalies, improving the quality and consistency of metadata.
  • They explicitly integrate the business context, linking each data asset to its operational meaning and its role within organizational processes. To do this, they use controlled vocabularies, ontologies or taxonomies that facilitate a common understanding between technical teams, analysts and business managers.
  • They promote deeper interoperability between heterogeneous domains, which goes beyond the syntactic exchange facilitated by traditional metadata. Rich metadata adds a semantic layer that allows you to understand and relate data based on its meaning, not just its format. Thus, data from different sources or sectors – for example, Geographic Information Systems (GIS), Building Information Modeling (BIM) or the Internet of Things (IoT) – can be linked in a coherent way within a shared conceptual framework. This semantic interoperability is what makes it possible to integrate knowledge and reuse information between different technical and organizational contexts.

This turns metadata into a living asset, enriched and connected to domain knowledge, not just a passive "record".

The Evolution of Metadata: Ontologies and Knowledge Graphs

The incorporation of ontologies and knowledge graphs represents a conceptual evolution in the way metadata is described, related and used, hence we speak of enriched metadata. These tools not only document the data, but connect them within a network of meaning, allowing the relationships between entities, concepts, and contexts to be explicit and computable.

In the current context, marked by the rise of artificial intelligence, this semantic structure takes on a fundamental role:  it provides algorithms with the contextual knowledge necessary to interpret, learn and reason about data in a more accurate and transparent way. Ontologies and graphs allow AI systems not only to process information, but also  to understand the relationships between elements and to generate grounded inferences, opening the way to more explanatory and reliable models.

This paradigm shift transforms metadata into a dynamic structure, capable of reflecting the complexity of knowledge and facilitating semantic interoperability between different domains and sources of information. To understand this evolution, it is necessary to define and relate some concepts:


Ontologies

In the world of data, an ontology is a highly organized conceptual map that clearly defines:

  • What entities exist (e.g., city, river, road).
  • What properties they have (e.g. a city has a name, town, zip code).
  • How they relate to each other (e.g. a river runs through a city, a road connects two municipalities).

The goal is for people and machines to share the same vocabulary and understand data in the same way. Ontologies allow:

  • Define concepts and relationships: for example, "a plot belongs to a municipality", "a building has geographical coordinates".
  • Set rules and restrictions: such as "each building must be exactly on a cadastral plot".
  • Unify vocabularies: if in one system you say "plot" and in another "cadastral unit", ontology helps to recognize that they are analogous.
  • Make inferences: from simple data, discover new knowledge (if a building is on a plot and the plot is in Seville, it can be inferred that the building is in Seville).
  • Establish a common language: they work as a dictionary shared between different systems or domains (GIS, BIM, IoT, cadastre, urban planning).

In short: an ontology is the dictionary and the rules of the game that allow different geospatial systems (maps, cadastre, sensors, BIM, etc.) to understand each other and work in an integrated way.

Knowledge Graphs

A knowledge graph is a way of organizing information as if it were a network of concepts connected to each other.

  • Nodes represent things or entities, such as a city, a river, or a building.

  • The edges (lines) show the relationships between them, for example: "is in", "crosses" or "belongs to".

  • Unlike a simple drawing of connections, a knowledge graph also explains the meaning of those relationships: it adds semantics.

A knowledge graph combines three main elements:

  1. Data: specific cases or instances, such as "Seville", "Guadalquivir River" or "Seville City Hall Building".

  2. Semantics (or ontology): the rules and vocabularies that define what kinds of things exist (cities, rivers, buildings) and how they can relate to each other.

  3. Reasoning: the ability to discover new connections from existing ones (for example, if a river crosses a city and that city is in Spain, the system can deduce that the river is in Spain).

In addition, knowledge graphs make it possible to connect information from different fields (e.g. data on people, places and companies) under the same common language, facilitating analysis and interoperability between disciplines.

In other words, a knowledge graph is the result of applying an ontology (the data model) to several individual datasets (spatial elements, other territory data, patient records or catalog products, etc.). Knowledge graphs are ideal for integrating heterogeneous data, because they do not require a previously complete rigid schema: they can be grown flexibly. In addition, they allow semantic queries and navigation with complex relationships. Here's an example for spatial data to understand the differences:

Spatial data ontology (conceptual model):

  • Classes: River, Ocean, Building, Road, City.
  • Relationships:
    • River → flows into → Ocean
    • City → contains → Building
    • Road → connects → City

Knowledge graph (specific examples with instances):

  • Specific nodes: ‘Guadalquivir River’, ‘Atlantic Ocean’, ‘Seville City Hall Building’, ‘A-4 Motorway’, ‘City of Seville’, ‘City of Cadiz’.
  • Relationships:
    • Guadalquivir River → flows into → Atlantic Ocean
    • City of Seville → contains → Seville City Hall building
    • A-4 motorway → connects → City of Seville and City of Cadiz
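To make the difference tangible in code, the sketch below encodes both layers with the rdflib library: a tiny ontology (classes and properties) and the instance-level knowledge graph from the example above, followed by a SPARQL query that asks which cities the A-4 motorway connects. All URIs live in a made-up example namespace and are purely illustrative.

```python
from rdflib import Graph, Namespace, RDF, RDFS

EX = Namespace("https://example.org/geo#")   # hypothetical namespace
g = Graph()
g.bind("ex", EX)

# Ontology layer: the conceptual model (classes and properties)
for cls in (EX.River, EX.Ocean, EX.Building, EX.Road, EX.City):
    g.add((cls, RDF.type, RDFS.Class))
for prop in (EX.flowsInto, EX.contains, EX.connects):
    g.add((prop, RDF.type, RDF.Property))

# Instance layer: the knowledge graph (specific nodes and relationships)
g.add((EX.GuadalquivirRiver, RDF.type, EX.River))
g.add((EX.AtlanticOcean, RDF.type, EX.Ocean))
g.add((EX.SevilleCityHall, RDF.type, EX.Building))
g.add((EX.A4Motorway, RDF.type, EX.Road))
g.add((EX.Seville, RDF.type, EX.City))
g.add((EX.Cadiz, RDF.type, EX.City))

g.add((EX.GuadalquivirRiver, EX.flowsInto, EX.AtlanticOcean))
g.add((EX.Seville, EX.contains, EX.SevilleCityHall))
g.add((EX.A4Motorway, EX.connects, EX.Seville))
g.add((EX.A4Motorway, EX.connects, EX.Cadiz))

# Semantic query: which cities does the A-4 motorway connect?
query = """
SELECT ?city WHERE {
    ex:A4Motorway ex:connects ?city .
    ?city a ex:City .
}
"""
for row in g.query(query, initNs={"ex": EX}):
    print(row.city)
```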

Use Cases

To better understand the value of smart metadata and semantic catalogs, there is nothing better than looking at examples where they are already being applied. These cases show how the combination of ontologies and knowledge graphs makes it possible to connect dispersed information, improve interoperability and generate actionable knowledge in different contexts.

From emergency management to urban planning or environmental protection, different international projects have shown that semantics is not just theory, but a practical tool that transforms data into decisions.

Some relevant examples include:

  • LinkedGeoData, which converted OpenStreetMap data into Linked Data, linking it to other open sources.
  • Virtual Singapore is a 3D digital twin that integrates geospatial, urban and real-time data for simulation and planning.
  • JedAI-spatial is a tool for interconnecting 3D spatial data using semantic relationships.
  • SOSA Ontology, a standard widely used in sensor and IoT projects for environmental observations with a geospatial component.
  • European projects on digital building permits (e.g. ACCORD), which combine semantic catalogs, BIM models, and GIS data to automatically validate building regulations.

Conclusions

The evolution towards rich metadata, supported by ontologies, knowledge graphs and FAIR-R principles, represents a substantial change in the way data is managed, connected and understood. This new approach makes metadata an active component of the digital infrastructure, capable of providing context, traceability and meaning, and not just describing information.

Rich metadata allows you to learn from data, improve semantic interoperability between domains, and facilitate more expressive queries, where relationships and dependencies can be discovered in an automated way. In this way, they favor the integration of dispersed information and support both informed decision-making and the development of more explanatory and reliable artificial intelligence models.

In the field of open data, these advances drive the transition from descriptive repositories to ecosystems of interconnected knowledge, where data can be combined and reused in a flexible and verifiable way. The incorporation of semantic context and provenance reinforces transparency, quality and responsible reuse.

This transformation requires, however, a progressive and well-governed approach: it is essential to plan for systems migration, ensure semantic quality, and promote the participation of multidisciplinary communities.

In short, rich metadata is the basis for moving from isolated data to connected and traceable knowledge, a key element for interoperability, sustainability and trust in the data economy.

Content prepared by Mayte Toscano, Senior Consultant in Data Economy Technologies. The contents and points of view reflected in this publication are the sole responsibility of the author.

Evento

The Provincial Council of Bizkaia has launched the Data Journalism Challenge, a competition aimed at rewarding creativity, rigour and talent in the use of open data. This initiative seeks to promote journalistic projects that use the public data available on the Open Data Bizkaia platform  to create informative content with a strong visual component. Whether through interactive graphics, maps, animated videos or in-depth reports, the goal is  to transform data into narratives that connect with citizens.

Who can participate?

The call is open to individuals over 18 years of age, both individually and in teams of up to four members. Each participant may submit proposals in one or more of the available categories.

It is an opportunity of special relevance for students, entrepreneurs, developers, design professionals or journalists with an interest in open data.

Three categories to boost the use of open data

The competition is divided into three categories, each with its own approach and evaluation criteria:

  1. Dynamic data representation: Projects that present data in an interactive, clear, and visually appealing way.

  2. Data storytelling through animated video: audiovisual narratives that explain phenomena or trends using public data.

  3. Reporting + Data: journalistic articles that integrate data analysis with research and depth of information.

As we have previously mentioned, all projects must be based on the public data available on the Open Data Bizkaia platform, which offers information on multiple areas: economy, environment, mobility, health, culture, etc. It is a rich and accessible source for building relevant and well-grounded stories.

Up to 4,500 euros in prizes

For each category, the following prizes will be awarded:

  • First place: 1,500 euros

  • Second place: 750 euros

The prizes will be subject to the corresponding tax withholdings. Since the same person can submit proposals to several categories, and these will be evaluated independently, it is possible for a single participant to win more than one prize. Therefore, a single participant will be able to win up to 4,500 euros, if they win in all three categories.

What are the evaluation criteria?

The awards will be made through the competitive concurrence procedure. All the projects received in the period enabled for this will be evaluated by the jury, according to a series of specific criteria for each category:

  1. Dynamic data representation:

  • Communicative clarity (30%)

  • Interactivity (25%)

  • Design and usability (20%)

  • Originality in representation (15%)

  • Rigor and fidelity of data (10%)

  2. Data storytelling through animated video:

  • Narrative and script (30%)

  • Visual creativity and technical innovation (25%)

  • Informational clarity (20%)

  • Emotional and aesthetic impact (15%)

  • Rigorous and honest use of data (10%)

  3. Reporting + Data:

  • Journalistic quality and analytical depth (30%)

  • Narrative integration of data (25%)

  • Originality in approach and format (20%)

  • Design and user experience (15%)

  • Transparency and traceability of sources (10%)

How are applications submitted?

The deadline for submitting projects began on November 3 and will be open until December 3, 2025 at 11:59 p.m. Applications may be submitted in a variety of ways:

  • Electronically, through the electronic office of Bizkaia, using the procedure code 2899.

  • In person, at the General Registry of the Laguntza Office (c/ Diputación, 7, Bilbao), at any other public registry or at the Post Office.

In the case of group projects, a single application signed by a representative must be submitted. This person will assume the dialogue with the organizing General Directorate, taking care of the procedures and the fulfillment of the corresponding obligations.

The documentation that must be submitted is:

  • The project to be evaluated.

  • The certificate of being up to date with tax obligations.

  • The certificate of being up to date with Social Security obligations.

  • The direct debit form, only in the event that the applicant objects to this Administration checking the bank details by its own means.

Contact Information

For queries or additional information, please contact the Provincial Council of Bizkaia; specifically, the Technical Advisory Section of the Department of Public Administration and Institutional Relations, c/ Gran Vía, 2 (48009 Bilbao). Questions will also be answered by phone at 944 068 000 and by email at SAT@bizkaia.eus.

This competition represents an opportunity to explore the potential of data journalism and contribute to more transparent and accessible communication. The projects presented will be able to highlight the potential of open data to facilitate the understanding of issues of public interest, in a clear and simple way.

For more details, it is recommended to read the full information about the call.

Noticia

On October 6, the V Open Government Plan was approved, an initiative that gives continuity to the commitment of public administrations to transparency, citizen participation and accountability. This new plan, which will be in force until 2029, includes 218 measures grouped into 10 commitments that affect the various levels of the Administration.

In this article we are going to review the key points of the Plan, focusing on those commitments related to data and access to public information.

A document resulting from collaboration

The process of preparing the V Open Government Plan has been developed in  a participatory and collaborative way, with the aim of collecting proposals from different social actors. To this end, a public consultation was opened in which citizens, civil society organizations and institutional representatives were able to contribute ideas and suggestions. A series of deliberative workshops were also held. In total, 620 contributions were received from civil society and more than 300 proposals from ministries, autonomous communities and cities, and representatives of local entities.

These contributions were analysed and integrated into the plan's commitments, which were subsequently validated by the Open Government Forum. The result is a document that reflects a shared vision on how to advance transparency, participation and accountability in the public administrations as a whole.

10 main lines of action with a prominent role for open data

As a result of this collaborative work, 10 lines of action have been established. The first nine commitments include initiatives from the General State Administration (AGE), while the tenth groups together the contributions of autonomous communities and local entities:

  1. Participation and civic space.
  2. Transparency and access to information.
  3. Integrity and accountability.
  4. Open administration.
  5. Digital governance and artificial intelligence.
  6. Fiscal openness: clear and open accounts.
  7. Truthful information / information ecosystem.
  8. Dissemination, training and promotion of open government.
  9. Open Government Observatory.
  10. Open state.


Figure 1. 10 lines of action of the V Open Government Plan. Source: Ministry of Inclusion, Social Security and Migration.

Data and public information are a key element in all of them. However, most of the measures related to this field fall within line of action 2, which includes a specific section on the opening and reuse of public sector information. The measures envisaged include:

  • Data governance model: it is proposed to create a regulatory framework that facilitates the responsible and efficient use of public data in the AGE. It includes the regulation of collegiate bodies for the exchange of data, the application of European regulations and the creation of institutional spaces to design public policies based on data.
  • Data strategy for a citizen-centred administration: it seeks to establish a strategic framework for the ethical and transparent use of data in the Administration.
  • Publication of microdata from electoral surveys: the Electoral Law will be amended to include the obligation to publish anonymized microdata from electoral surveys. This improves the reliability of studies and facilitates open access to individual data for analysis.
  • Support for local entities in the opening of data: a grant program has been launched to promote the opening of homogeneous and quality data in local entities through calls and/or collaboration agreements. In addition, its reuse will be promoted through awareness-raising actions, development of demonstrator solutions and inter-administrative collaboration to promote public innovation.
  • Openness of data in the Administration of Justice: official data on justice will continue to be published on public portals, with the aim of making the Administration of Justice more transparent and accessible.
  • Access and integration of high-value geospatial information: the aim is to facilitate the reuse of high-value spatial data in categories such as geospatial, environment and mobility. The measure includes the development of digital maps, topographic bases and an API to improve access to this information by citizens, administrations and companies.
  • Open data of the BORME: work will be done to promote the publication of the content of the Official Gazette of the Mercantile Registry, especially the section on entrepreneurs, as open data in machine-readable formats and accessible through APIs.
  • Databases of the Central Archive of the Treasury: public availability will be promoted for records of the Central Archive of the Ministry of Finance that do not contain personal data and are not subject to legal restrictions.
  • Secure access to confidential public data for research and innovation: the aim is to establish a governance framework and controlled environments that allow researchers to securely and ethically access public data subject to confidentiality.
  • Promotion of the secondary use of health data: work will continue on the National Health Data Space (ENDS), aligned with European regulations, to facilitate the use of health data for research, innovation and public policy purposes. The measure includes the promotion of technical infrastructures, regulatory frameworks and ethical guarantees to protect the privacy of citizens.
  • Promotion of data ecosystems for social progress: it seeks to promote collaborative data spaces between public and private entities, under clear governance rules. These ecosystems will help develop innovative solutions that respond to social needs, fostering trust, transparency and the fair return of benefits to citizens.
  • Enhancement of quality public data for citizens and companies: the generation of quality data will continue to be promoted in the different ministries and agencies, so that they can be integrated into the AGE's centralised catalogue of reusable information.
  • Evolution of the datos.gob.es platform: work continues on the optimization of datos.gob.es. This measure is part of a continuous enrichment to address changing citizen needs and emerging trends.

In addition to this specific heading, measures related to open data are also included in other sections. For example, measure 3.5.5 proposes transforming the Public Sector Procurement Platform into an advanced tool that uses Big Data and Artificial Intelligence to strengthen transparency and prevent corruption. Open data plays a central role here, as it enables large-scale audits and statistical analyses to detect irregular patterns in procurement processes. In addition, by facilitating citizen access to this information, it promotes social oversight and democratic control over the use of public funds.

Another example can be found in measure 4.1.1, where it is proposed to develop a digital tool for the General State Administration that incorporates the principles of transparency and open data from its design. The system would allow the traceability, conservation, access and reuse of public documents, integrating archival criteria, clear language and document standardization. In addition, it would be linked to the National Open Data Catalog to ensure that information is available in open and reusable formats.

The document not only highlights the possibilities of open data: it also points to the opportunities offered by Artificial Intelligence, both in improving access to public information and in generating open data useful for collective decision-making.

Promotion of open data in the Autonomous Communities and Cities

As mentioned above, the V Open Government Plan also includes commitments made by regional bodies, which are detailed in line of action 10 on Open State, many of them focused on the availability of public data.

For example, the Government of Catalonia reports its interest in optimising the resources available for the management of requests for access to public information, as well as in publishing disaggregated data on public budgets in areas related to children or climate change. For its part, the Junta de Andalucía wants to promote access to information on scientific personnel and scientific production, and develop a Data Observatory of Andalusian public universities, among other measures. Another example can be found in the Autonomous City of Melilla, which is working on an Open Data Portal.

With regard to local administration, the commitments have been established through the Spanish Federation of Municipalities and Provinces (FEMP). The FEMP's Network of Local Entities for Transparency and Citizen Participation proposes that local public administrations publish, at a minimum, datasets chosen from the following fields: street map; budgets and budget execution; subsidies; public contracting and bidding; municipal register; vehicle census; waste and recycling containers; register of associations; cultural agenda; tourist accommodation; business and industrial areas; and census of companies or economic agents.

All these measures highlight the interest in open data in Spanish institutions as a key tool to promote open government, promote services and products aligned with citizen needs and optimize decision-making.

A tracking system

The follow-up of the V Open Government Plan is based on a strengthened system of accountability and the strategic use of the HazLab digital platform, where five working groups are hosted, one of them focused on transparency and access to information.

Each initiative of the Plan also has a monitoring file with information on its execution, schedule and results, periodically updated by the responsible units and published on the Transparency Portal.

Conclusions

Overall, the V Open Government Plan seeks a more transparent, participatory Administration oriented to the responsible use of public data. Many of the measures included aim to strengthen the openness of information, improve document management and promote the reuse of data in key sectors such as health, justice or public procurement. This approach not only facilitates citizen access to information, but also promotes innovation, accountability, and a more open and collaborative culture of governance.

Blog

Artificial Intelligence (AI) is becoming one of the main drivers of productivity gains and innovation in both the public and private sectors, and it is increasingly relevant in tasks ranging from the creation of content in any format (text, audio, video) to the optimization of complex processes through AI agents.

However, advanced AI models, and in particular large language models, require massive amounts of data for training, optimization and evaluation. This dependence creates a paradox: just as AI demands more and higher-quality data, growing concerns about privacy and confidentiality (General Data Protection Regulation, GDPR), new rules on data access and use (Data Act), quality and governance requirements for high-risk systems (AI Regulation), and the inherent scarcity of data in sensitive domains all limit access to real data.

In this context, synthetic data can be an enabling mechanism for new advances, reconciling innovation and privacy protection. On the one hand, it allows AI to be fed without exposing sensitive information; on the other, when combined with quality open data, it expands access to domains where real data is scarce or heavily regulated.

What is synthetic data and how is it generated?

Simply put, synthetic data can be defined as artificially fabricated information that mimics the characteristics and distributions of real data. The main function of this technology is to reproduce the statistical characteristics, structure and patterns of the underlying real data. In the domain of official statistics, there are cases such as the United States Census Bureau, which publishes partially or totally synthetic products such as OnTheMap (mobility of workers between place of residence and workplace) or SIPP Synthetic Beta (socioeconomic microdata linked to taxes and social security).

The generation of synthetic data is a field still under development, supported by various methodologies. Approaches range from rule-based methods and statistical modeling (simulations, Bayesian or causal networks), which mimic predefined distributions and relationships, to advanced deep learning techniques. Among the most notable architectures are:

  • Generative Adversarial Networks (GANs): a generator, trained on real data, learns to mimic its characteristics, while a discriminator tries to distinguish between real and synthetic data. Through this iterative process, the generator improves its ability to produce artificial data that is statistically indistinguishable from the original. Once trained, it can create new artificial records that are statistically similar to the original sample, but completely new and secure (a minimal training sketch follows this list).

  • Variational Autoencoders (VAE): these models are based on neural networks that learn a probabilistic distribution over a latent space of the input data. Once trained, the model samples and decodes latent vectors from this distribution to obtain new synthetic observations. VAEs are often considered more stable and easier to train than GANs for tabular data generation.

  • Autoregressive/hierarchical models and domain simulators: used, for example, with electronic medical record data, they capture temporal and hierarchical dependencies. Hierarchical models structure the problem by levels, first sampling higher-level variables and then lower-level variables conditioned on the previous ones. Domain simulators encode process rules and calibrate them with real data, providing control and interpretability and ensuring compliance with business rules.
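
To make the adversarial training idea more concrete, here is a minimal, hypothetical sketch of a GAN for a small tabular dataset. It assumes PyTorch is available and uses simulated "real" data in place of a sensitive table; it illustrates the technique described above, not a production synthetic-data pipeline.

```python
import torch
import torch.nn as nn

# Illustrative only: the "real" data here is simulated (two correlated numeric
# columns); in practice it would be the sensitive dataset to be imitated.
torch.manual_seed(0)
real = torch.randn(1000, 2) @ torch.tensor([[1.0, 0.6], [0.0, 0.8]])

latent_dim = 8
generator = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, 2))
discriminator = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

for step in range(2000):
    # Discriminator step: learn to tell real rows from generated rows.
    idx = torch.randint(0, len(real), (128,))
    fake = generator(torch.randn(128, latent_dim)).detach()
    d_loss = loss_fn(discriminator(real[idx]), torch.ones(128, 1)) + \
             loss_fn(discriminator(fake), torch.zeros(128, 1))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator step: produce rows the discriminator classifies as real.
    g_loss = loss_fn(discriminator(generator(torch.randn(128, latent_dim))),
                     torch.ones(128, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

# Sample a fully synthetic table: statistically similar rows, none copied from `real`.
synthetic = generator(torch.randn(500, latent_dim)).detach()
print(synthetic.mean(dim=0), synthetic.std(dim=0))
```

In a real deployment, the generated output would still need to be validated against a reference dataset and pass a privacy assessment, as discussed below, before being released.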

You can learn more about synthetic data and how it's created in this infographic:

 

Figure 1. Infographic on synthetic data. Source: Authors' elaboration - datos.gob.es.

While synthetic generation inherently reduces the risk of personal data disclosure, it does not eliminate it entirely. Synthetic does not automatically mean anonymous: if the generators are trained inappropriately, traces of the real dataset can leak, leaving the output vulnerable to membership inference attacks. Hence the need to use Privacy Enhancing Technologies (PET), such as differential privacy, and to carry out specific risk assessments. The European Data Protection Supervisor (EDPS) has also underlined the need to carry out a privacy assurance assessment before synthetic data can be shared, ensuring that the result does not allow re-identifiable personal data to be obtained.

Differential Privacy (DP) is one of the main technologies in this domain. Its mechanism adds controlled noise to the training process or to the data itself, mathematically ensuring that the presence or absence of any individual in the original dataset does not significantly alter the final result of the generation. Secure methods such as Stochastic Gradient Descent with Differential Privacy (DP-SGD) ensure that the samples generated do not compromise the privacy of the users who contributed their data to the sensitive set.
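
As a simplified illustration of the principle behind these techniques, the sketch below applies the classic Laplace mechanism to a single aggregate statistic (not the full DP-SGD training procedure mentioned above); the data and parameter values are hypothetical.

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Return a differentially private estimate by adding calibrated Laplace noise."""
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

rng = np.random.default_rng(42)
incomes = rng.lognormal(mean=10, sigma=0.5, size=10_000)  # stand-in for sensitive records
clipped = np.clip(incomes, 0, 100_000)                    # bound each record's influence

true_mean = clipped.mean()
sensitivity = 100_000 / len(clipped)  # max change in the mean if one record is replaced
for epsilon in (0.1, 1.0, 10.0):      # smaller epsilon = stronger privacy, more noise
    private_mean = laplace_mechanism(true_mean, sensitivity, epsilon, rng)
    print(f"epsilon={epsilon}: true mean={true_mean:,.0f}, private mean={private_mean:,.0f}")
```

Running the loop with different epsilon values makes the trade-off visible: stronger privacy guarantees (lower epsilon) produce noisier, less faithful estimates.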

What is the role of open data?

Obviously, synthetic data does not appear out of nowhere: it needs high-quality real data as a seed and also requires good validation practices. For this reason, open data (and, with appropriate safeguards, data that cannot be opened for privacy reasons) is, on the one hand, an excellent raw material for learning real-world patterns and, on the other, an independent reference for verifying that synthetic data resembles reality without exposing people or companies.

As a seed for learning, quality open data, such as high-value datasets with complete metadata, clear definitions and standardized schemas, provides coverage, granularity and timeliness. Where certain sets cannot be made public for privacy reasons, they can be used internally with appropriate safeguards to produce synthetic data that could be released. In health, for example, there are open generators such as Synthea, which produce fictitious medical records free of the restrictions that apply to real data.

On the other hand, open data can act as a verification benchmark against a synthetic set: it makes it possible to compare distributions, correlations and business rules, and to evaluate usefulness in real tasks (prediction, classification) without resorting to sensitive information. In this sense, there is already work, such as that of the Welsh Government with health data, experimenting with different indicators, including total variation distance (TVD), propensity scores and performance in machine learning tasks.
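
As an indication of how such a fidelity indicator can be computed, the following sketch estimates the total variation distance between one numeric column of a real dataset and its synthetic counterpart using a shared histogram; the data is simulated and the binning choice is an assumption.

```python
import numpy as np

def total_variation_distance(real_col, synth_col, bins=20):
    """Histogram-based TVD between two samples of the same variable (0 = identical)."""
    edges = np.histogram_bin_edges(np.concatenate([real_col, synth_col]), bins=bins)
    p, _ = np.histogram(real_col, bins=edges)
    q, _ = np.histogram(synth_col, bins=edges)
    p = p / p.sum()
    q = q / q.sum()
    return 0.5 * np.abs(p - q).sum()

rng = np.random.default_rng(0)
real_ages = rng.normal(40, 12, size=5_000)    # e.g. an age column in the real data
synth_ages = rng.normal(41, 13, size=5_000)   # a reasonably faithful synthetic version
print(f"TVD = {total_variation_distance(real_ages, synth_ages):.3f}")
```

The closer the result is to zero, the more faithfully the synthetic column reproduces the real distribution for that variable.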

How is synthetic data evaluated?

The evaluation of synthetic datasets is articulated around three dimensions that, by their nature, involve a trade-off:

  • Fidelity: Measures how close the synthetic data is to replicating the statistical properties, correlations, and structure of the original data.

  • Utility: Measures the performance of the synthetic dataset in subsequent machine learning tasks, such as prediction or classification.

  • Privacy: Measures how effectively synthetic data hides sensitive information and the risk that the subjects of the original data can be re-identified.


Figure 2. Three dimensions to evaluate synthetic data. Source: Authors' elaboration - datos.gob.es.

The governance challenge is that it is not possible to optimize all three dimensions simultaneously. For example, increasing the level of privacy (by injecting more noise through differential privacy) can inevitably reduce statistical fidelity and, consequently, usefulness for certain tasks. The choice of which dimension to prioritize (maximum utility for statistical research or maximum privacy) becomes a strategic decision that must be transparent and specific to each use case.
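
By way of illustration of the utility dimension, the sketch below applies the common "train on synthetic, test on real" check with scikit-learn; both the "real" and "synthetic" datasets are simulated here, so the numbers are purely indicative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Simulated stand-ins: a "real" labelled dataset and a noisier "synthetic" copy.
rng = np.random.default_rng(1)
X_real = rng.normal(size=(2_000, 3))
y_real = (X_real @ np.array([1.5, -2.0, 0.5])
          + rng.normal(scale=0.5, size=2_000) > 0).astype(int)
X_synth = X_real + rng.normal(scale=0.3, size=X_real.shape)
y_synth = y_real.copy()

# Baseline: train on real, test on held-out real (TRTR).
trtr = LogisticRegression().fit(X_real[:1_000], y_real[:1_000])
# Utility check: train on synthetic, test on the same held-out real data (TSTR).
tstr = LogisticRegression().fit(X_synth[:1_000], y_synth[:1_000])

print("TRTR accuracy:", accuracy_score(y_real[1_000:], trtr.predict(X_real[1_000:])))
print("TSTR accuracy:", accuracy_score(y_real[1_000:], tstr.predict(X_real[1_000:])))
```

The closer the TSTR score is to the TRTR baseline, the more useful the synthetic set is for that task; repeating the comparison after adding privacy noise makes the fidelity-privacy trade-off measurable.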

Synthetic open data?

The combination of open data and synthetic data can already be considered more than just an idea, as there are real cases that demonstrate its usefulness in accelerating innovation while protecting privacy. In addition to the aforementioned OnTheMap or SIPP Synthetic Beta in the United States, we also find examples in Europe and the rest of the world. For example, the European Commission's Joint Research Centre (JRC) has analysed the role of AI Generated Synthetic Data in Policy Applications, highlighting its ability to shorten the life cycle of public policies by reducing the burden of accessing sensitive data and enabling more agile exploration and testing phases. It has also documented applications of multipurpose synthetic populations for mobility, energy or health analysis, reinforcing the idea that synthetic data acts as a cross-cutting enabler.

In the UK, the Office for National Statistics (ONS) conducted a Synthetic Data Pilot to understand the demand for synthetic data. The pilot explored the production of tools for generating high-quality synthetic microdata tailored to specific user requirements.

Also in health, there are advances that illustrate the value of synthetic open data for responsible innovation. The Department of Health of Western Australia has promoted a Synthetic Data Innovation Project and sectoral hackathons in which realistic synthetic sets are released, allowing internal and external teams to test algorithms and services without access to identifiable clinical information, fostering collaboration and accelerating the transition from prototypes to real use cases.

In short, synthetic data offers a promising, although not sufficiently explored, avenue for the development of artificial intelligence applications, as it contributes to the balance between fostering innovation and protecting privacy.

Synthetic data is not a substitute for open data; rather, the two reinforce each other. In particular, synthetic data represents an opportunity for public administrations to expand their open data offering with synthetic versions of sensitive sets for education or research, and to make it easier for companies and independent developers to experiment in compliance with regulation and generate greater economic and social value.

Content created by Jose Luis Marín, Senior Consultant in Data, Strategy, Innovation & Digitalisation. The content and views expressed in this publication are the sole responsibility of the author.

Noticia

Spain has taken another step towards consolidating a public policy based on transparency and digital innovation. Through the General State Administration, the Government of Spain has signed its adhesion to the International Open Data Charter, within the framework of the IX Global Summit of the Open Government Partnership, currently being held in Vitoria-Gasteiz.

With this adhesion, data is recognized as a strategic asset for the design of public policies and the improvement of services. In addition, the importance of its openness and reuse, together with the ethical use of artificial intelligence, as key drivers for digital transformation and the generation of social and economic value is underlined.

What is the International Open Data Charter?

The International Open Data Charter (ODC) is a global initiative that promotes the openness and reuse of public data as tools to improve transparency, citizen participation, innovation, and accountability. This initiative was launched in 2015 and is backed by governments, organizations and experts. Its objective is to guide public entities in the adoption of responsible, sustainable open data policies focused on social impact, respecting the fundamental rights of people and communities. To this end, it promotes six principles:

  • Open data by default: data must be published proactively, unless there are legitimate reasons to restrict it (such as privacy or security).

  • Timely and comprehensive data: data should be published in a complete, understandable and agile manner, as often as necessary to be useful. Its original format should also be respected whenever possible.

  • Accessible and usable data: data should be available in open, machine-readable formats and without technical or legal barriers to reuse. They should also be easy to find.

  • Comparable and interoperable data: institutions should work to ensure that data are accurate, relevant, and reliable, promoting common standards that facilitate interoperability and the joint use of different sources.

  • Data for improved governance and citizen engagement: open data should strengthen transparency, accountability, and enable informed participation of civil society.

  • Data for inclusive development and innovation: open access to data can drive innovative solutions, improve public services, and foster inclusive economic development.

The Open Data Charter also offers resources, guides and practical reports to support governments and organizations in applying its principles, adapting them to each context. Open data will thus be able to drive concrete reforms with a real impact. 

Spain: a consolidated open data policy that positions the country as a reference model

Adherence to the International Open Data Charter is not a starting point, but a step forward in a consolidated strategy that places data as a fundamental asset for the country's progress. For years, Spain has already had a solid framework of policies and strategies that have promoted the opening of data as a fundamental part of digital transformation:

  • Regulatory framework: Spain has a legal basis that guarantees the openness of data as a general rule, including Law 37/2007 on the reuse of public sector information, Law 19/2013 on transparency, and the application of Regulation (EU) 2022/868 on European data governance. This framework establishes clear obligations to facilitate the access, sharing and reuse of public data throughout the state.
  • Institutional governance: the General Directorate of Data, under the Secretary of State for Digitalisation and Artificial Intelligence (SEDIA), has the mission of boosting the management, sharing and use of data across the productive sectors of the Spanish economy and society. Among other responsibilities, it leads the coordination of open data policy in the General State Administration.
  • Strategic initiatives and practical tools: the Aporta Initiative, promoted by the Ministry for Digital Transformation and Public Service through the Public Business Entity Red.es, has been promoting the culture of open data and its social and economic reuse since 2009. To this end, the datos.gob.es platform centralises access to nearly 100,000 datasets and services made available to citizens by public bodies at all levels of administration. This platform also offers multiple resources (news, analysis, infographics, guides and reports, training materials, etc.) that help to promote data culture. 

To continue moving forward, work is underway on the V Open Government Plan (2025–2029), which integrates specific commitments on transparency, participation, and open data within a broader open government agenda.

All this contributes to positioning Spain, year after year, as a European benchmark in open data.

Next steps: advancing an ethical data-driven digital transformation

Compliance with the principles of the International Open Data Charter will be a transparent and measurable process. SEDIA, through the General Directorate of Data, will coordinate internal monitoring of progress. The General Directorate of Data will act as a catalyst, promoting a culture of data sharing, monitoring compliance with the principles of the Charter and driving participatory processes to collect input from citizens and civil society.

In addition to the opening of public data, it should be noted that work will continue on the development of an ethical and people-centred digital transformation through actions such as:

  • Creation of sectoral data spaces: the aim is to promote the sharing of public and private data that can be combined in a secure and sovereign way to generate high-impact use cases in strategic sectors such as health, tourism, agribusiness or mobility, boosting the competitiveness of the Spanish economy.
  • Developing ethical and responsible AI: The national open data strategy is key to ensuring that algorithms are trained on high-quality, diverse and representative datasets, mitigating bias and ensuring transparency. This reinforces public trust and promotes a model of innovation that protects fundamental rights.

In short, Spain's adoption of the International Open Data Charter reinforces an already consolidated trajectory in open data, supported by a solid regulatory framework, strategic initiatives and practical tools that have placed the country as a benchmark in the field. In addition, this accession opens up new opportunities for international collaboration, access to expert knowledge and alignment with global standards. Spain is thus moving towards a more robust, inclusive data ecosystem that is geared towards social, economic and democratic impact.
