Just a few days ago, the Directorate General of Traffic published the new Framework Programme for the Testing of Automated Vehicles which, among other measures, contemplates "the mandatory delivery of reports, both periodic and final and in the event of incidents, which will allow the DGT to assess the safety of the tests and publish basic information [...] guaranteeing transparency and public trust."
The advancement of digital technology is making it easier for the transport sector to face an unprecedented revolution in autonomous vehicle driving, offering significant improvements in road safety, energy efficiency and mobility accessibility.
The final deployment of these vehicles depends to a large extent on the availability, quality and accessibility of large volumes of data, as well as on an appropriate legal framework that ensures the protection of the various legal assets involved (personal data, trade secrets, confidentiality, etc.), traffic security and transparency. In this context, open data and the reuse of public sector information are essential elements for the responsible development of autonomous mobility, in particular when it comes to ensuring adequate levels of traffic safety.
Data Dependency on Autonomous Vehicles
The technology that supports autonomous vehicles is based on the integration of a complex network of advanced sensors, artificial intelligence systems and real-time processing algorithms, which allows them to identify obstacles, interpret traffic signs, predict the behavior of other road users and, in a collaborative way, plan routes completely autonomously.
In the autonomous vehicle ecosystem, the availability of quality open data is strategic for:
- Improve road safety, so that real-time traffic data can be used to anticipate dangers, avoid accidents and optimise safe routes based on massive data analysis.
- Optimise operational efficiency, as access to up-to-date information on the state of roads, works, incidents and traffic conditions allows for more efficient planning of journeys.
- To promote sectoral innovation, facilitating the creation of new digital tools that facilitate mobility.
Specifically, ensuring the safe and efficient operation of this mobility model requires continuous access to two key categories of data:
- Variable or dynamic data, which offers constantly changing information such as the position, speed and behaviour of other vehicles, pedestrians, cyclists or weather conditions in real time.
- Static data, which includes relatively permanent information such as the exact location of traffic signs, traffic lights, lanes, speed limits or the main characteristics of the road infrastructure.
The prominence of the data provided by public entities
The sources from which such data come are certainly diverse. This is of great relevance as regards the conditions under which such data will be available. Specifically, some of the data are provided by public entities, while in other cases the origin comes from private companies (vehicle manufacturers, telecommunications service providers, developers of digital tools...) with their own interests or even from people who use public spaces, devices and digital applications.
This diversity requires a different approach to facilitating the availability of data under appropriate conditions, in particular because of the difficulties that may arise from a legal point of view. In relation to Public Administrations, Directive (EU) 2019/1024 on open data and the reuse of public sector information establishes clear obligations that would apply, for example, to the Directorate General of Traffic, the Administrations owning public roads or municipalities in the case of urban environments. Likewise, Regulation (EU) 2022/868 on European data governance reinforces this regulatory framework, in particular with regard to the guarantee of the rights of third parties and, in particular, the protection of personal data.
Moreover, some datasets should be provided under the conditions established for dynamic data, i.e. those "subject to frequent or real-time updates, due in particular to their volatility or rapid obsolescence", which should be available "for re-use immediately after collection, through appropriate APIs and, where appropriate, in the form of a mass discharge."
One might even think that the high-value data category is of particular interest in the context of autonomous vehicles given its potential to facilitate mobility, particularly considering its potential to:
- To promote technological innovation, as they would make it easier for manufacturers, developers and operators to access reliable and up-to-date information, essential for the development, validation and continuous improvement of autonomous driving systems.
- Facilitate monitoring and evaluation from a security perspective, as transparency and accessibility of such data are essential prerequisites from this perspective.
- To boost the development of advanced services, since data on road infrastructure, signage, traffic and even the results of tests carried out in the context of the aforementioned Framework Programme constitute the basis for new mobility applications and services that benefit society as a whole.
However, this condition is not expressly included for traffic-related data in the definition made at European level, so that, at least for the time being, public entities should not be required to disseminate the data that apply to autonomous vehicles under the unique conditions established for high-value data. However, at this time of transition for the deployment of autonomous vehicles, it is essential that public administrations publish and keep updated under appropriate conditions for their automated processing, some datasets, such as those relating to:
- Road signs and vertical signage elements.
- Traffic light states and traffic control systems.
- Lane configuration and characteristics.
- Information on works and temporary traffic alterations.
- Road infrastructure elements critical for autonomous navigation.
The recent update of the official catalogue of traffic signs, which comes into force on 1 July 2025, incorporates signs adapted to new realities, such as personal mobility. However, it requires greater specificity with regard to the availability of data relating to signals under these conditions. This will require the intervention of the authorities responsible for road signage.
The availability of data in the context of the European Mobility Area
Based on these conditions and the need to have mobility data generated by private companies and individuals, data spaces appear as the optimal legal and governance environment to facilitate their accessibility under appropriate conditions.
In this regard, the initiatives for the deployment of the European Mobility Data Space, created in 2023, constitute an opportunity to integrate into its design and configuration measures that support the need for access to data required by autonomous vehicles. Thus, within the framework of this initiative, it would be possible to unlock the potential of mobility data , and in particular:
- Facilitate the availability of data under conditions specific to the needs of autonomous vehicles.
- Promote the interconnection of various data sources linked to existing means of transport, but also emerging ones.
- Accelerate the digital transformation of autonomous vehicles.
- Strengthen the digital sovereignty of the European automotive industry, reducing dependence on large foreign technology corporations.
In short, autonomous vehicles can represent a fundamental transformation in mobility as it has been conceived until now, but their development depends, among other factors, on the availability, quality and accessibility of sufficient and adequate data. The Sustainable Mobility Bill currently being processed in Parliament is a great opportunity to strengthen the role of data in facilitating innovation in this area, which would undoubtedly favour the development of autonomous vehicles. To this end, it will be essential, on the one hand, to have a data sharing environment that makes access to data compatible with the appropriate guarantees for fundamental rights and information security; and, on the other hand, to design a governance model that, as emphasised in the Programme promoted by the Directorate-General for Traffic, facilitates the collaborative participation of "manufacturers, developers, importers and fleet operators established in Spain or the European Union", which poses significant challenges in the availability of data.
Content prepared by Julián Valero, Professor at the University of Murcia and Coordinator of the Research Group "Innovation, Law and Technology" (iDerTec). The contents and points of view reflected in this publication are the sole responsibility of its author.
The General direction of traffic (DGT in its Spanish acronym) is the body responsible for ensuring safety and fluidity on the roads in Spain. Among other activities, it is responsible for the issuing of permits, traffic control and the management of infringements.
As a result of its activity, a large amount of data is generated, much of which is made available to the public as open data. These datasets not only promote transparency but are also a tool to encourage innovation and improve road safety through their re-use by researchers, companies, public administrations and interested citizens.
In this article we will review some of these datasets, including application examples.
How to access the DGT's datasets
DGT datasets provide detailed and structured information on various aspects of road safety and mobility in Spain, ranging from accident statistics to vehicle and driver information. The temporal continuity of these datasets, available from the beginning of the century to the present day, enables longitudinal analyses that reflect the evolution of mobility and road safety patterns in Spain.
Users can access datasets in different spaces:
- DGT en cifras. It is a section of the General direction of traffic that offers a centralised access to statistics and key data related to road safety, vehicles and drivers in Spain. This portal includes detailed information on accidents, complaints, vehicle fleet and technical characteristics of vehicles, among other topics. It also provides historical and comparative analyses to assess trends and design strategies to improve road safety.
- National Access Point (NAP). Managed by the DGT, it is a platform designed to centralise and facilitate access to road and traffic data, including updates. This portal has been created under the framework of the European Directive 2010/40/EU and brings together information provided by various traffic management bodies, road authorities and infrastructure operators. Available data includes traffic incidents, electric vehicle charging points, low-emission zones and car park occupancy, among others. It aims to promote interoperability and the development of intelligent solutions that improve road safety and transport efficiency.
While the NAP is focused on real-time data and technological solutions, DGT en cifras focuses on providing statistics and historical information for analysis and decision making. In addition, the NAP collects data not only from the DGT, but also from other agencies and private companies.
Most of these data are available through datos.gob.es.
Examples of DGT datasets
Some examples of datasets that can be found in datos.gob.es are:
- Accidents with casualties: includes detailed information on fatalities, hospitalised and minor injuries, as well as circumstances by road type. This data helps to understand why accidents happen and who is involved, to identify risky situations and to detect dangerous behaviour on the road. It is useful for creating better prevention campaigns, detecting black spots on the roads and helping authorities to make more informed decisions. They are also of interest to public health professionals, urban planners and insurance companies working to reduce accidents and their consequences.
- Census of drivers: provides a complete x-ray of who has a driving licence in Spain. The information is particularly useful for understanding the evolution of the driver fleet, identifying demographic trends and analysing the penetration of different types of licences by territory and gender.
- Car registrations by make and engine capacity: shows which new cars Spaniards bought, organised by brand and engine power. The information allows consumer trends to be identified. This data is valuable for manufacturers, dealers and automotive analysts, who can study the market performance in a given year. They are also useful for researchers in mobility, environment and economics, allowing to analyse the evolution of the Spanish vehicle fleet and its impact in terms of emissions, energy consumption and market trends.
Use cases of DGT datasets
The publication of this data in an open format enables innovation in areas such as accident prevention, the development of safer road infrastructure, the development of evidence-based public policies, and the creation of mobility-related technology applications. Some examples are given below:
DGT's own use of data
The DGT itself reuses its data to create tools that facilitate the visualisation of the information and bring it closer to citizens in a simple way. This is the case of the Traffic status and incident map, which is constantly and automatically updated with the information entered 24 hours a day by the Civil Guard Traffic Group and the heads of the Traffic Management Centres of the DGT, the Generalitat de Catalunya and the Basque Government. Includes information on roads affected by weather phenomena (e.g. ice, floods, etc.) and forecast developments.
In addition, the DGT also uses its data to carry out studies that provide information on certain aspects related to mobility and road safety, which are very useful for decision-making and policy-making. One example is this study which analyses the behaviour and activity of certain groups in traffic accidents in order to propose proactive measures. Another example: this project to implement a computer system that identifies, through geolocation, the critical points with the highest accident rates on roads for their treatment and transfer of conclusions.
Use of data by third parties
The datasets provided by the DGT are also reused by other public administrations, researchers, entrepreneurs and private companies, fostering innovation in this field. Thanks to them, we find apps that allow users to access detailed information about vehicles in Spain (such as technical characteristics and inspection history and other data) or that provide information about the most dangerous places for cyclists.
In addition, its combination with advanced technologies such as artificial intelligence allows extracting even more value from the data, facilitating the identification of patterns and helping citizens and authorities to take preventive measures. One example is the Waze application, which has implemented an artificial intelligence-based functionality to identify and alert drivers to high-crashstretches of road, known as "black spots". This system combines historical accident data with analysis of road characteristics, such as gradients, traffic density and intersections, to provide accurate and highly useful alerts. The application notifies users in advance when they are approaching these dangerous areas, with the aim of reducing risks and improving road safety. This innovation complements the data provided by the DGT, helping to save lives by encouraging more cautious driving.
For those who would like to get a taste and practice with DGT data, from datos.gob.es we have a step-by-step data science exercise. Users will be able to analyse these datasets and use predictive models to estimate the evolution of the electric vehicle in Spain. Documented code development and free-to-use tools are used for this purpose. All the material generated is available for reuse in the GitHub repository of datos.gob.es.
In short, the DGT's data sets offer great opportunities for reuse, even more so when combined with disruptive technologies. This is driving innovation, sustainability and safer, more connected transport, which is helping to transform urban and rural mobility.
Today, transparency, innovation and economic development are key to the progress of public institutions. In this context, the Cabildo of Tenerife has undertaken an ambitious open data project that goes beyond the mere publication of information. The aim of this strategy is to ffacilitate access to information, encourage the reuse of data and generate social and economic value for the island.
Through its open data portal, the Cabildo not only promotes transparency and accountability, but also drives innovation in areas as diverse as tourism, transport and the rural environment.
Scope and objectives of the project
The open data portal of the Cabildo de Tenerife publishes datasets of the entire public sector of the island, including all the local councils of the island. In particular, this includes municipalities with less than 20,000 inhabitants, as established in Law 8/2015 on Island Councils. The main objectives of this project are:
- Strengthen the internal and external culture of data use.
- Increase transparency and accountability through data.
- Generate wealth in society through the reuse of information.
In this website you can find open datasets on tourism, transport, culture and leisure and rural environment, among others. In order to offer this information the Cabildo of Tenerife benefits from the collaboration of various bodies such as:
- Transportes Interurbanos de Tenerife (TITSA)
- Tenerife Island Water Board (CIATF)
- Metropolitan of Tenerife
- SINPROMI (Insular Society for the Promotion of People with Disabilities)
- ITER (Technological Institute of Renewable Energies)
- IASS (Insular Institute for Social and Socio-sanitary Care)
- Agrocabildo
Most downloaded and significant datasets
Some of the portal's most downloaded and significant datasets include:
- Network of weather stations, with updates every 10 minutes.
- Influx of recreational areas, such as Punta Teno and Barranco de Masca, with information on the number of vehicles and people, and the itineraries of trails or routes on the island.
- Indicators of economic development and tourist occupancy, including the number of tourists accommodated by category and area.
- Prices of fruit and vegetable products in Mercatenerife.
- Public transport Origin Destination Matrix, which shows the relationships between the places of origin and destination of journeys made on the island.
The Cabildo's open data project is clearly oriented towards compliance with the Technical Interoperability Standard (NTI) for the Reuse of Information Resources and the DCAT-AP model, areas in which it continues to make progress.
Use cases and applications to incentivise reuse
Beyond publishing open datasets, the Cabildo de Tenerife actively promotes their use through the development of use cases and applications. Examples of this work include:
- Development of a urban development plan in the municipality of Santiago del Teide through the reuse of various datasets.
- Project on meteorological data forecasting.
- Tourist places and traffic scorecards. Specifically:
- On tourism: tourist accommodation places by municipality and type, and the occupancy rate by type of accommodation from 1978 to 2023.
- About traffic: dashboard from Power BI showing the average daily traffic intensity at a station on the island for the years 2021, 2022 and 2023.
On the other hand, in order to promote the reuse of the data on its portal, the Cabildo of Tenerife organises various activities, such as the I Ideas Reuse Competition, in which 25 proposals were received. This competition will be followed by a second edition that will reward the development of applications.
In addition, there are talks and webinars, such as the one organised in collaboration with the Chair of Big Data, Open Data and Blockchain of the University of La Laguna on how to use open data from Tenerife to drive innovation , which you can see again here.
Next steps AI and community building
In order to measure the impact of open data, the Cabildo de Tenerife uses tools such as Google Analytics which allows the analysis of user interaction with the available data. The next big step, as reported by the organisation, will be to implement a virtual assistant with generative AI that will allow:
- Analysing data with natural language.
- Discover trends and correlations.
- Bringing information closer to any citizen.
In parallel, the Cabildo of Tenerife will also continue working on new avenues of collaboration with the island's municipalities and other entities, with the aim of expanding the amount and variety of open data available to citizens.
From datos.gob.es, we encourage development and research professionals, students and citizens to explore, reuse and create value with Tenerife's data.
Bizidata is a platform that visualises, analyses and allows you to download data on bicycle use in Vitoria-Gasteiz and explore how external factors, such as weather and traffic, influence bicycle use.
This application collects and combines the following data:
- Bicycle use: data from the Vitoria-Gasteiz City Council.
- Temperature between 2015 and 2021: Aemet data.
- Temperatures between 2022 and today: Euskalmet (station C040).
- Precipitation: Euskalmet (station C076).
- Traffic: Traffic Directorate of the Basque Government.
Bizidata was the winning application in the web services category of the Euskadi 2024 Open Data Competition.
Open data can transform how we interact with our cities, offering opportunities to improve quality of life. When made publicly available, they enable the development of innovative applications and tools that address urban challenges, from accessibility to road safety and participation.
Real-time information can have positive impacts on citizens. For example, applications that use open data can suggest the most efficient routes, considering factors such as traffic and ongoing construction; information on the accessibility of public spaces can improve mobility for people with disabilities; data on cycling or pedestrian routes encourages greener and healthier modes of transport; and access to urban data can empower citizens to participate in decision-making about their city. In other words, citizen use of open data not only improves the efficiency of the city and its services, but also promotes a more inclusive, sustainable and participatory city.
To illustrate these ideas, this article discusses maps for "navigating" cities, made with open data. In other words, initiatives are shown that improve the relationship between citizens and their urban environment from different aspects such as accessibility, school safety and citizen participation. The first project is Mapcesible, which allows users to map and assess the accessibility of different locations in Spain. The second, Eskola BideApp, a mobile application designed to support safe school routes. And finally, maps that promote transparency and citizen participation in urban management. The first identifies noise pollution, the second locates available services in various areas within 15 minutes and the third displays banks in the city. These maps use a variety of public data sources to provide a detailed overview of different aspects of urban life.
The first initiative is a project of a large foundation, the second a collaborative and local proposal, and the third a personal project. Although they are based on very different approaches, all three have in common the use of public and open data and the vocation to help people understand and experience the city. The variety of origins of these projects indicates that the use of public and open data is not limited to large organisations.
Below is a summary of each project, followed by a comparison and reflection on the use of public and open data in urban environments.
Mapcesible, map for people with reduced mobility
Mapcessible was launched in 2019 to assess the accessibility of various spaces such as shops, public toilets, car parks, accommodation, restaurants, cultural spaces and natural environments.
Figure 1. Mapcesible. Source: https://mapcesible.fundaciontelefonica.com/intro
This project is supported by organizations such as the NGO Spanish Confederation of People with Physical and Organic Disabilities (COCEMFE) and the company ILUNION. It currently has more than 40,000 evaluated accessible spaces and thousands of users.
Figure 2. Mapcesible. Source: https://mapcesible.fundaciontelefonica.com/filters
Mapcesible uses open data as part of its operation. Specifically, the application incorporates fourteen datasets from official bodies, including the Ministry of Agriculture and Environment, city councils of different cities (including Madrid and Barcelona) and regional governments. This open data is combined with information provided by the users of the application, who can map and evaluate the accessibility of the places they visit. This combination of official data and citizen collaboration allows Mapcesible to provide up-to-date and detailed information on the accessibility of various spaces throughout Spain, thus benefiting people with reduced mobility.
Eskola BideAPP, application to define safe school routes.
Eskola BideAPP is an application developed by Montera34 - a team dedicated to data visualisation and the development of collaborative projects - in alliance with the Solasgune Association to support school pathways. Eskola BideAPP has served to ensure that boys and girls can access their schools safely and efficiently. The project mainly uses public data from the OpenStreetMap, e.g. geographical and cartographic data on streets, pavements, crossings, as well as data collected during the process of creating safe routes for children to walk to school in order to promote their autonomy and sustainable mobility.
The application offers an interactive dashboard to visualise the collected data, the generation of paper maps for sessions with students, and the creation of reports for municipal technicians. It uses technologies such as QGIS (a free and open source geographic information system) and a development environment for the R programming language, dedicated to statistical computing and graphics.
The project is divided into three main stages:
- Data collection through questionnaires in classrooms.
- Analysis and discussion of results with the children to co-design personalised routes.
- Testing of the designed routes.
Figure 3. Eskola BideaAPP. Photo by Julián Maguna (Solasgune). Source: https://montera34.com/project/eskola-bideapp/
Pablo Rey, one of the promoters of Montera34 together with Alfonso Sánchez, reports for this article that Eskola BideAPP, since 2019, has been used in eight municipalities, including Derio, Erandio, Galdakao, Gatika, Plentzia, Leioa, Sopela and Bilbao. However, it is currently only operational in the latter two. "The idea is to implement it in Portugalete at the beginning of 2025," he adds.
It''s worth noting the maps from Montera34 that illustrated the effect of Airbnb in San Sebastián and other cities, as well as the data analyses and maps published during the COVID-19 pandemic, which also visualized public data.In addition, Montera34 has used public data to analyse abstention, school segregation, minor contracts or make open data available to the public. For this last project, Montera34 has started with the ordinances of the Bilbao City Council and the minutes of its plenary sessions, so that they are not only available in a PDF document but also in the form of open and accessible data.
Mapas de Madrid sobre contaminación acústica, servicios y ubicación de bancos
Abel Vázquez Montoro has made several maps with open data that are very interesting, for example, the one made with data from the Strategic Noise Map (MER) offered by the Madrid City Council and land registry data. The map shows the noise affecting each building, facade and floor in Madrid.
Figure 4. Noise maps in Madrid. Source: https://madb.netlify.app/.
This map is organised as a dashboard with three sections: general data of the area visible on the map, dynamic 2D and 3D map with configurable options and detailed information on specific buildings. It is an open, free, non-commercial platform that uses free and open source software such as GitLab - a web-based Git repository management platform - and QGIS. The map allows the assessment of compliance with noise regulations and the impact on quality of life, as it also calculates the health risk associated with noise levels, using the attributable risk ratio (AR%).
15-minCity is another interactive map that visualises the concept of the "15-minute city" applied to different urban areas, i.e. it calculates how accessible different services are within a 15-minute walking or cycling radius from any point in the selected city.
Figure 5. 15-minCity. Source: https://whatif.sonycsl.it/15mincity/15min.php?idcity=9166
Finally, "Where to sit in Madrid" is another interactive map that shows the location of benches and other places to sit in public spaces in Madrid, highlighting the differences between rich (generally with more public seating) and poor (with less) neighbourhoods. This map uses the map-making tool, Felt, to visualise and share geospatial information in an accessible way. The map presents different types of seating, including traditional benches, individual seats, bleachers and other types of seating structures.
Figure 6. Where to sit in Madrid. Source: https://felt.com/map/Donde-sentarse-en-Madrid-TJx8NGCpRICRuiAR3R1WKC?loc=40.39689,-3.66392,13.97z
Its maps visualise public data on demographic information (e.g. population data by age, gender and nationality), urban information on land use, buildings and public spaces, socio-economic data (e.g. income, employment and other economic indicators for different districts and neighbourhoods), environmental data, including air quality, green spaces and other related aspects, and mobility data.
What do they have in common?
Name | Promoter | Type of data used | Profit motive | Users | Characteristics |
---|---|---|---|---|---|
Mapcesible | Telefónica Foundation | Combines user-generated and public data (14 open data sets from government agencies) | Non-profit | More than 5.000 | Collaborative app, available on iOS and Android, more than 40,000 mapped accessible points. |
Eskola BideAPP | Montera34 and Solasgune Association | Combines user-generated and public data (classroom questionnaires) and some public data. | Non-profit. | 4.185 | Focus on safe school routes, uses QGIS and R for data processing |
Mapa Estratégico de Ruido (MER) | Madrid City Council | 2D and 3D geographic and visible area data | Non-profit | No data | It allows the assessment of compliance with noise regulations and the impact on quality of life, as it also calculates the associated health risk. |
15 min-City | Sony GSL | Geographic data and services | Non-profit | No data | Interactive map visualising the concept of the "15-minute city" applied to different urban areas. |
MAdB "Dónde sentarse en Madrid" | Private | Public data (demographic, electoral, urban, socio-economic, etc.) | Non-profit | No data | Interactive maps of Madrid |
Figure 7. Comparative table of solutions
These projects share the approach of using open data to improve access to urban services, although they differ in their specific objectives and in the way information is collected and presented. Mapcesible, Eskola BideApp, MAdB and "Where to sit in Madrid" are of great value.
On the one hand, Mapcesible offers unified and updated information that allows people with disabilities to move around the city and access services. Eskola BideApp involves the community in the design and testing of safe routes for walking to school; this not only improves road safety, but also empowers young people to be active agents in urban planning. In the meantime, 15-min city, MER and the maps developed by Vázquez Montoro visualise complex data about Madrid so that citizens can better understand how their city works and how decisions that affect them are made.
Overall, the value of these projects lies in their ability to create a data culture, teaching how to value, interpret and use information to improve communities.
Content created by Miren Gutiérrez, PhD and researcher at the University of Deusto, expert in data activism, data justice, data literacy, and gender disinformation. The contents and viewpoints reflected in this publication are the sole responsibility of the author.
In this episode we will delve into the importance of three related categories of high-value datasets. These are Earth observation and environmental data, geospatial data and mobility data. To tell us about them, we have interviewed two experts in the field:
- Paloma Abad Power, Deputy Director of the National Centre for Geographic Information (CNIG).
- Rafael Martínez Cebolla, geographer of the Government of Aragón.
With them we have explored how these high-value datasets are transforming our environment, contributing to sustainable development and technological innovation.
Listen to the full podcast (only available in Spanish)
Summary of the interview
1. What are high-value datasets and why are their important?
Paloma Abad Power: According to the regulation, high-value datasets are those that ensure highest socio-economic potential and, for this, they must be easy to find, i.e. they must be accessible, interoperable and usable. And what does this mean? That means that the datasets must have their descriptions, i.e. the online metadata, which report the statistics and their properties, and which can be easily downloaded or used.
In many cases, these data are often reference data, i.e. data that serve to generate other types of data, such as thematic data, or can generate added value.
Rafael Martínez Cebolla: They could be defined as those datasets that represent phenomena that are useful for decision making, for any public policy or for any action that a natural or legal person may undertake.
In this sense, there are already some directives, which are not so recent, such as the Water Framework Directive or the INSPIRE Directive, which motivated this need to provide shared data under standards that drive the sustainable development of our society.
2. These high-value data are defined by a European Directive and an Implementing Regulation which dictated six categories of high-value datasets. On this occasion we will focus on three of them: Earth observation and environmental data, geospatial data and mobility data. What do these three categories of data have in common and what specific datasets do they cover?
Paloma Abad Power: In my opinion, these data have in common the geographical component, i.e. they are data located on the ground and therefore serve to solve problems of different nature and linked to society.
Thus, for example, we have, with national coverage, the National Aerial Orthophotography Plan (PNOA), which are the aerial images, the System of Land Occupation Information (SIOSE), cadastral parcels, boundary lines, geographical names, roads, postal addresses, protected sites - which can be both environmental and also castles, i.e. historical heritage- etc. And these categories cover almost all the themes defined by the annexes of the INSPIRE directive.
Rafael Martínez Cebolla: It is necessary to know what is pure geographic information, with a direct geographic reference, as opposed to other types of phenomena that have indirect geographic references. In today's world, 90% of information can be located, either directly or indirectly. Today more than ever, geographic tagging is mandatory for any corporation that wants to implement a certain activity, be it social, cultural, environmental or economic: the implementation of renewable energies, where I am going to eat today, etc. These high-value datasets enhance these geographical references, especially of an indirect nature, which help us to make a decision.
3. Which agencies publish these high-value datasets? In other words, where could a user locate datasets in these categories?
Paloma Abad Power: It is necessary to highlight the role of the National Cartographic System, which is an action model in which the organisations of the NSA (National State Administration) and the autonomous communities participate. It is coordinating the co-production of many unique products, funded by these organisations.
These products are published through interoperable web services. They are published, in this case, by the National Center for Geographic Information (CNIG), which is also responsible for much of the metadata for these products.
They could be located through the Catalogues of the IDEE (Spatial Data Infrastructure of Spain) or the Official Catalogue of INSPIRE Data and Services, which is also included in datos.gob.es and the European Data Portal.
And who can publish? All bodies that have a legal mandate for a product classified under the Regulation. Examples: all the mapping bodies of the Autonomous Communities, the General Directorate of Cadastre, Historical Heritage, the National Statistics Institute, the Geological and Mining Institute (IGME), the Hydrographic Institute of the Navy, the Ministry of Agriculture, Fisheries and Food (MAPA), the Ministry for Ecological Transition and the Demographic Challenge, etc. There are a multitude of organisations and many of them, as I have mentioned, participate in the National Cartographic System, provide the data and generate a single service for the citizen.
Rafael Martínez Cebolla: The National Cartographic System defines very well the degree of competences assumed by the administrations. In other words, the public administration at all levels provides official data, assisted by private enterprise, sometimes through public procurement.
The General State Administration goes up to scales of 1:25,000 in the case of the National Geographic Institute (IGN) and then the distribution of competencies for the rest of the scales is for the autonomous or local administrations. In addition, there are a number of actors, such as hydrographic confederations, state departments or the Cadastre, which have under their competences the legal obligation to generate these datasets.
For me it is an example of how it should be distributed, although it is true that it is then necessary to coordinate very well, through collegiate bodies, so that the cartographic production is well integrated.
Paloma Abad Power: There are also collaborative projects, such as, for example, a citizen map, technically known as an X, Y, Z map, which consists of capturing the mapping of all organisations at national and local level. That is, from small scales 1:1,000,000 or 1:50,000,000 to very large scales, such as 1:1000, to provide the citizen with a single multi-scale map that can be served through interoperable and standardised web services.
4. Do you have any other examples of direct application of this type of data?
Rafael Martínez Cebolla: A clear example was seen with the pandemic, with the mobility data published by the National Institute of Statistics. These were very useful data for the administration, for decision making, and from which we have to learn much more for the management of future pandemics and crises, including economic crises. We need to learn and develop our early warning systems.
I believe that this is the line of work: data that is useful for the general public. That is why I say that mobility has been a clear example, because it was the citizen himself who was informing the administration about how he was moving.
Paloma Abad Power: I am going to contribute some data. For example, according to statistics from the National Cartographic System services, the most demanded data are aerial images and digital terrain models. In 2022 there were 8 million requests and in 2023 there were 19 million requests for orthoimages alone.
Rafael Martínez Cebolla: I would like to add that this increase is also because things are being done well. On the one hand, discovery systems are improved. My general feeling is that there are many successful example projects, both from the administration itself and from companies that need this basic information to generate their products.
There was an application that was generated very quickly with de-escalation - you went to a website and it told you how far you could walk through your municipality - because people wanted to get out and walk. This example arises from spatial data that have moved out of the public administration. I believe that this is the importance of successful examples, which come from people who see a compelling need.
5. And how do you incentivise such re-use?
Rafael Martínez Cebolla: I have countless examples. Incentivisation also involves promotion and marketing, something that has sometimes failed us in the public administration. You stick to certain competences and it seems that just putting it on a website is enough. And that is not all.
We are incentivising re-use in two ways. On the one hand, internally, within the administration itself, teaching them that geographic information is useful for planning and evaluating public policies. And I give you the example of the Public Health Atlas of the Government of Aragon, awarded by an Iberian society of epidemiology the year before the pandemic. It was useful for them to know what the health of the Aragonese was like and what preventive measures they had to take.
As for the external incentives, in the case of the Geographic Institute of Aragon, it was seen that the profile entering the geoportal was very technical. The formats used were also very technical, which meant that the general public was not reached. To solve this problem, we promoted portals such as the IDE didactica, a portal for teaching geography, which reaches any citizen who wants to learn about the territory of Aragon.
Paloma Abad Power: I would like to highlight the economic benefit of this, as was shown, for example, in the economic study carried out by the National Centre for Graphic Information with the University of Leuven to measure the economic benefit of the Spatial Data Infrastructure of Spain. It measure the benefit of private companies using free and open services, rather than using, for example, Google Maps or other non-open sources..
Rafael Martínez Cebolla: For better and for worse, because the quality of the official data sometimes we wish it were better. Both Paloma in the General State Administration and I in the regional administration sometimes know that there are official data where more money needs to be invested so that the quality of the data would be better and could be reusable.
But it is true that these studies are key to know in which dimension high-value datasets move. That is to say, having studies that report on the real benefit of having a spatial data infrastructure at state or regional level is, for me, key for two things: for the citizen to understand its importance and, above all, for the politician who arrives every N years to understand the evolution of these platforms and the revolution in geospatial information that we have experienced in the last 20 years.
6. The Geographic Institute of Aragon has also produced a report on the advantages of reusing this type of data, is that right?
Rafael Martínez Cebolla: Yes, it was published earlier this year. We have been doing this report internally for three or four years, because we knew we were going to make the leap to a spatial knowledge infrastructure and we wanted to see the impact of implementing a knowledge graph within the data infrastructure. The Geographic Institute of Aragon has made an effort in recent years to analyse the economic benefit of having this infrastructure available for the citizens themselves, not for the administration. In other words, how much money Aragonese citizens save in their taxes by having this infrastructure. Today we know that having a geographic information platform saves approximately 2 million euros a year for the citizens of Aragon.
I would like to see the report for the next January or February, because I think the leap will be significant. The knowledge graph was implemented in April last year and this gap will be felt in the year ahead. We have noticed a significant increase in requests, both for viewing and downloading.
Basically from one year to the next, we have almost doubled both the number of accesses and downloads. This affects the technological component: you have to redesign it. More people are discovering you, more people are accessing your data and, therefore, you have to dedicate more investment to the technological component, because it is being the bottleneck.
7. What do you see as the challenges to be faced in the coming years?
Paloma Abad Power: In my opinion, the first challenge is to get to know the user in order to provide a better service. The technical user, the university students, the users on the street, etc. We are thinking of doing a survey when the user is going to use our geographic information. But of course, such surveys sometimes slow down the use of geographic information. That is the great challenge: to know the user in order to make services more user-friendly, applications, etc. and to know how to get to what they want and give it to them better.
There is also another technical challenge. When the spatial infrastructures began, the technical level was very high, you had to know what a visualisation service was, the metadata, know the parameters, etc. This has to be eliminated, the user can simply say I want, for example, to consult and visualise the length of the Ebro river, in a more user-friendly way. Or for example the word LiDAR, which was the Italian digital model with high accuracy. All these terms need to be made much more user-friendly.
Rafael Martínez Cebolla: Above all, let them be discovered. My perception is that we must continue to promote the discovery of spatial data without having to explain to the untrained user, or even to some technicians, that we must have a data, a metadata, a service..... No, no. Basically it is that generalist search engines know how to find high-value datasets without knowing that there is such a thing as spatial data infrastructure.
It is a matter of publishing the data under friendly standards, under accessible versions and, above all, publishing them in permanent URIs, which are not going to change. In other words, the data will improve in quality, but will never change.
And above all, from a technical point of view, both spatial data infrastructures and geoportals and knowledge infrastructures have to ensure that high-value information nodes are related to each other from a semantic and geographical point of view. I understand that knowledge networks will help in this regard. In other words, mobility has to be related to the observation of the territory, to public health data or to statistical data, which also have a geographical component. This geographical semantic relationship is key for me.
Subscribe to our Soundcloud profile to keep up to date with our podcasts
Interview clips
Clip 1. What are high-value datasets and why are their important?
Clip 2. Where can a user locate geographic data?
Clip 3. How is the reuse of data with a geographic component being encouraged?
Many people use apps to get around in their daily lives. Apps such as Google Maps, Moovit or CityMapper provide the fastest and most efficient route to a destination. However, what many users are unaware of is that behind these platforms lies a valuable source of information: open data. By reusing public datasets, such as those related to air quality, traffic or public transport, these applications can provide a better service.
In this post, we will explore how the reuse of open data by these platforms empowers a smarter and more sustainable urban ecosystem .
Google Maps: aggregates air quality information and transport data into GTFS.
More than a billion people use Google Maps every month around the world. The tech giant offers a free, up-to-date world map that draws its data from a variety of sources, some of them open.
One of the functions provided by the app is information about the air quality in the user's location. The Air Quality Index (AQI) is a parameter that is determined by each country or region. The European benchmark can be consulted on this map which shows air quality by geolocated zones in real time.
To display the air quality of the user's location, Google Maps applies a model based on a multi-layered approach known as the "fusion approach". This method combines data from several input sources and weights the layers with a sophisticated procedure. The input layers are:
- Government reference monitoring stations
- Commercial sensor networks
- Global and regional dispersion models
- Dust and smoke fire models
- Satellite information
- Traffic data
- Ancillary information such as surface area
- Meteorology
In the case of Spain, this information is obtained from open data sources such as the Ministry of Ecological Transition and Demographic Challenge, the Regional Ministry of Environment, Territory and Housing of the Xunta de Galicia or the Community of Madrid. Open data sources used in other countries around the worldcan be found here .
Another functionality offered by Google Maps to plan the best routes to reach a destination is the information on public transport. These data are provided on a voluntary basis by the public companies providing transport services in each city. In order to make this open data available to the user, it is first dumped into Google Transit and must comply with the open public transport standard GTFS (General Transit Feed Specification).
Moovit: reusing open data to deliver real-time information
Moovit is another urban mobility app most used by Spaniards, which uses open and collaborative data to make it easier for users to plan their journeys by public transport.
Since its launch in 2012, the free-to-download app offers real-time information on the different transport options, suggests the best routes to reach the indicated destination, guides users during their journey (how long they have to wait, how many stops are left, when they have to get off, etc.) and provides constant updates in the event of any alteration in the service.
Like other mobility apps , it is also available in offline mode and allows you to save routes and frequent lines in "Favourites". It is also an inclusive solution as it integrates VoiceOver (iOs) or TalkBack (Android) for blind people.
The platform not only leverages open data provided by governments and local authorities, but also collects information from its users, allowing it to offer a dynamic and constantly updated service.
CityMapper: born as a reuser of open mobility data
The CityMapper development team recognises that the application was born with an open DNA that still remains. They reuse open datasets from, for example, OpenStreetMap at global level or RENFE and Cercanías Bilbao at national level. As the application becomes available in more cities, the list of open data reference sources from which it draws information grows.
The platform offers real-time information on public transport routes, including bus, train, metro and bike sharing. It also adds options for walking, cycling and ridesharing. It is designed to provide the most efficient and fastest route to a destinationby integrating data from different modes of transport into a single interface.
As we published in the monographic report "Municipal Innovation through Open Data" CityMapper mainly uses open data from local transport authorities, typically using the GTFS (General Transit Feed Specification) standard . However, when this data is not sufficient or accurate enough, CityMapper combines it with datasets generated by the application's own users who voluntarily collaborate. It also uses data enhanced and managed by the work of the company's own local employees. All this data is combined with artificial intelligence algorithms developed to optimise routes and provide recommendations tailored to users' needs.
In conclusion, the use of open data in transport is driving a significant transformation in the mobility sector in cities. Through their contribution to applications, users can access up-to-date and accurate data, plan their journeys efficiently and make informed decisions. Governments, for their part, have taken on the role of facilitators by enabling the dissemination of data through open platforms, optimising resources and fostering collaboration across sectors. In addition, open data has created new opportunities for developers and the private sector, who have contributed with technological solutions such as Google Maps, Moovit or CityMapper. Ultimately, the potential of open data to transform the future of urban mobility is undeniable.
The digital revolution is transforming municipal services, driven by the increasing adoption of artificial intelligence (AI) technologies that also benefit from open data. These developments have the potential to redefine the way municipalities deliver services to their citizens, providing tools to improve efficiency, accessibility and sustainability. This report looks at success stories in the deployment of applications and platforms that seek to improve various aspects of life in municipalities, highlighting their potential to unlock more of the vast untapped potential of open data and associated artificial intelligence technologies.
The applications and platforms described in this report have a high potential for replicability in different municipal contexts, as they address common problems. Replication of these solutions can take place through collaboration between municipalities, companies and developers, as well as through the release and standardisation of open data.
Despite the benefits, the adoption of open data for municipal innovation also presents significant challenges. The quality, updating and standardisation of data published by local authorities, as well as interoperability between different platforms and systems, must be ensured. In addition, the open data culture needs to be reinforced among all actors involved, including citizens, developers, businesses and public administrations themselves.
The use cases analysed are divided into four sections. Each of these sections is described below and some examples of the solutions included in the report are shown.
Transport and Mobility
One of the most significant challenges in urban areas is transport and mobility management. Applications using open data have proven to be effective in improving these services. For example, applications such as Park4Dis make it easy to locate parking spaces for people with reduced mobility, using data from multiple municipalities and contributions from volunteers. CityMapper, which has gone global, on the other hand, offers optimised public transport routes in real time, integrating data from various transport modes to provide the most efficient route. These applications not only improve mobility, but also contribute to sustainability by reducing congestion and carbon emissions.
Environment and Sustainability
Growing awareness of sustainability has spurred the development of applications that promote environmentally friendly practices. CleanSpot, for example, facilitates the location of recycling points and the management of municipal waste. The application encourages citizen participation in cleaning and recycling, contributing to the reduction of the ecological footprint. Liight gamifies sustainable behaviour by rewarding users for actions such as recycling or using public transport. These applications not only improve environmental management, but also educate and motivate citizens to adopt more sustainable habits.
Optimisation of Basic Public Services
Urban service management platforms, such as Gestdropper, use open data to monitor and control urban infrastructure in real time. These tools enable more efficient management of resources such as street lighting, water networks and street furniture, optimising maintenance, incident response and reducing operating costs. Moreover, the deployment of appointment management systems, such as CitaME, helps to reduce waiting times and improve efficiency in customer service.
Citizen Services Aggregators
Applications that centralise public information and services, such as Badajoz Es Más and AppValencia, improve accessibility and communication between administrations and citizens. These platforms provide real-time data on public transport, cultural events, tourism and administrative procedures, making life in the municipality easier for residents and tourists alike. For example, integrating multiple services into a single application improves efficiency and reduces the need for unnecessary travel. These tools also support local economies by promoting cultural events and commercial services.
Conclusions
The use of open data and artificial intelligence technologies is transforming municipal management, improving the efficiency, accessibility and sustainability of public services. The success stories presented in this report describe how these tools can benefit both citizens and public administrations by making cities smarter, more inclusive and sustainable environments, and more responsive to the needs and well-being of their inhabitants and visitors.
Download here the accesible version of the report.