Blog

How many times have you had a dataset in hand that you needed to analyze, only to run into errors, inconsistencies or formatting issues that cost you hours of work? The reality is that, although more data is available every day, we do not always have the tools or knowledge needed to work with it efficiently.

There are several options for tackling this problem. One of them is Open Data Editor, a free and open-source tool that the Open Knowledge Foundation (OKFN) has designed with the aim of democratizing access to and exploitation of data.

Key features and functionalities

As indicated by OKFN, this application is designed for people who work with tabular data (Excel, Google Sheets, CSV) and who do not know how to program or do not have access to specialized technical tools. Its no-code approach makes it an accessible alternative that focuses specifically on tabular data cleansing and validation.

The tool implements a process known as "data validation", which involves finding errors in datasets and correcting them efficiently. It also verifies that spreadsheets or datasets contain all the information necessary for others to use them. In this way it takes interoperability into account, a very relevant value when it comes to dataset reuse.

Beyond guaranteeing reuse, Open Data Editor also ensures privacy and security thanks to its local architecture, i.e. the data remains on the user's device.

Pilot projects: global impact and tangible results

Although it is a very intuitive tool, the organization makes a free online course available to learn how to get the most out of it. The course is currently in English, but a Spanish translation will be available soon.

In addition to the main course, the Open Knowledge Foundation has implemented a "train the trainer" programme that prepares people to teach the course locally in different regions of the world. Within the framework of this training programme, pilot projects are being implemented in different sectors and communities. These pilot projects have focused especially on encouraging access to basic training in quality data analysis tools, something that OKFN believes should not be limited by economic or technological barriers.

The documented use cases show diverse applications ranging from human rights organizations to local government institutions, all leveraging the data validation and cleansing capabilities offered by the tool. The educational approach of Open Data Editor goes beyond the simple use of the tool: it is about training in open data and promoting open and accessible knowledge.

Next steps: Integrating artificial intelligence

The results of this first phase have been so promising that the Open Knowledge Foundation has decided to move towards a second stage, this time incorporating artificial intelligence technologies to further expand the capabilities of the tool. The new version, which offers validation-focused AI support and trust-building features, has just been announced and released.

The philosophy behind this AI integration is to maintain the educational character of the tool. Rather than creating a "black box" that simply provides results, the new functionality will explain every step that artificial intelligence takes, allowing users to understand not only what is being done with their data, but also why certain decisions are being made.

This transparent approach to AI is especially important in the context of open and government data, as we explain in this episode of the datos.gob.es podcast. Open Data Editor users will be able to see how AI identifies potential problems, suggests corrections, and validates data quality, becoming a learning tool as well as a practical utility.

Impact on the open data ecosystem

This new functionality will add to the purpose of offering a sustainable and open tool. It is precisely this commitment to open source that allows Open Data Editor to be adapted and improved by the global developer community. The tool uses the Frictionless Framework as its technological basis, which ensures that the standards used are open and widely adopted in the open data ecosystem.
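For readers who want to see what this kind of validation looks like under the hood, below is a minimal sketch using the frictionless Python library that underpins the tool; the file name data.csv is a hypothetical example path.

```python
# pip install frictionless
from frictionless import validate

# Validate a tabular file against basic structural and type checks.
# "data.csv" is a hypothetical example path.
report = validate("data.csv")

if report.valid:
    print("No errors found: the table is ready to share.")
else:
    # Each entry lists the row, the field and the kind of error found.
    for row_number, field_number, error_type in report.flatten(
        ["rowNumber", "fieldNumber", "type"]
    ):
        print(f"Row {row_number}, field {field_number}: {error_type}")
```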

There is no doubt that the tool is especially aligned with government open data principles, providing public administrations with a way to improve the quality of their data publications without requiring significant investments in technical infrastructure or specialized training. For data journalists and civil society organizations, Open Data Editor offers the ability to work with complex datasets more efficiently, allowing them to focus on analysis and interpretation rather than technical data cleansing.

In short, more than a technical tool, Open Data Editor represents a paradigm shift towards the democratization of data analysis. Its impact extends beyond its immediate functionalities, contributing to a broader ecosystem of open and accessible data.

Blog

In the usual search for tricks to make our prompts more effective, one of the most popular is activating the chain of thought. It consists of posing a multi-step problem and asking the AI system to solve it, not by giving us the solution all at once, but by making visible, step by step, the logical sequence needed to reach it. This feature is available in both paid and free AI systems; it is all a matter of knowing how to activate it.

Originally, chained reasoning was one of many semantic logic tests that developers put language models through. However, in 2022, Google Brain researchers demonstrated for the first time that providing examples of chained reasoning in the prompt could unlock greater problem-solving capabilities in models.

Since then, it has gradually established itself as a useful technique for obtaining better results from these models, while at the same time being heavily questioned from a technical point of view. What is really striking about this process is that language models do not actually think in a chain: they are only simulating human reasoning before our eyes.

How to activate the reasoning chain

There are two possible ways to activate this process in the models: from a button provided by the tool itself, as in the case of DeepSeek with the "DeepThink" button that activates the R1 model:

 


Figure 1. DeepSeek with the "DeepThink" button that activates the R1 model.

Or, and this is the simplest and most common option, from the prompt itself. If we opt for this option, we can do it in two ways: only with the instruction (zero-shot prompting) or by providing solved examples (few-shot prompting).

  • Zero-shot prompting: as simple as adding an instruction at the end of the prompt, such as "Reason step by step" or "Think before answering". This ensures that the chain of reasoning is activated and that the logical process behind the solution is made visible.


Figure 2. Example of Zero-shot prompting.

  • Few-shot prompting: if we want a very precise response pattern, it may be worth providing one or more solved question-answer examples. The model sees this demonstration and imitates it as a pattern for the new question (a code sketch showing both options follows Figure 3).


Figure 3. Example of Few-shot prompting.
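As a complement, this is what the two options can look like when calling a model programmatically. Below is a minimal sketch using the OpenAI Python client; the model name and the solved example in the few-shot messages are illustrative assumptions, not part of the original article.

```python
# pip install openai
from openai import OpenAI

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set
MODEL = "gpt-4o"   # illustrative choice; any chat model works

question = (
    "In a bag there are 3 red balls, 2 green balls and 1 blue ball. "
    "If you draw a ball at random, what is the probability that it is neither red nor blue?"
)

# Zero-shot prompting: just append the trigger instruction to the prompt.
zero_shot = client.chat.completions.create(
    model=MODEL,
    messages=[{"role": "user", "content": question + " Reason step by step."}],
)

# Few-shot prompting: show a solved example first so the model imitates the pattern.
few_shot = client.chat.completions.create(
    model=MODEL,
    messages=[
        {"role": "user", "content": "A train leaves at 9:00 and the trip takes 2.5 hours. When does it arrive? Reason step by step."},
        {"role": "assistant", "content": "The trip lasts 2.5 hours. 9:00 plus 2 hours is 11:00; plus 30 minutes is 11:30. It arrives at 11:30."},
        {"role": "user", "content": question + " Reason step by step."},
    ],
)

print(zero_shot.choices[0].message.content)
print(few_shot.choices[0].message.content)
```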

Benefits and three practical examples

When we activate the chain of reasoning, we are asking the system to "show" its work before our eyes, as if it were solving the problem on a blackboard. Although errors are not completely eliminated, forcing the language model to spell out the logical steps reduces their likelihood, because the model focuses its attention on one step at a time. In addition, in the event of an error, it is much easier for the user of the system to detect it with the naked eye.

When is the chain of reasoning useful? Especially in mathematical calculations, logical problems, puzzles, ethical dilemmas and questions with several stages and jumps (so-called multi-hop questions). In the latter, it is especially practical when the model has to handle knowledge about the world that is not directly included in the question.

Let's see some examples in which we apply this technique to a chronological problem, a spatial problem and a probabilistic problem.

  • Chronological reasoning

Let's think about the following prompt:

If Juan was born in October and is 15 years old, how old was he in June of last year?


Figure 5. Example of chronological reasoning.

For this example we have used the o3 model, available in the Plus version of ChatGPT and specialized in reasoning, so the chain of thought is activated by default and does not need to be requested in the prompt. This model is programmed to report the time it took to solve the problem, in this case 6 seconds. Both the answer and the explanation are correct. To arrive at them, the model had to incorporate external information: the order of the months of the year, knowledge of the current date to anchor the timeline, and the fact that age changes in the month of the birthday, not at the beginning of the year.

  • Spatial reasoning

  • A person is facing north. They turn 90 degrees to the right, then 180 degrees to the left. In what direction are they looking now?


Figure 6. Example of spatial reasoning.

This time we have used the free version of ChatGPT, which uses the GPT-4o model by default (although with limitations), so it is advisable to explicitly activate the chain of reasoning with an instruction at the end of the prompt: "Reason step by step". To solve this problem, the model needs general world knowledge learned during training, such as the spatial orientation of the cardinal points, degrees of rotation, laterality and the basic logic of movement.

  • Probabilistic reasoning

  • In a bag there are 3 red balls, 2 green balls and 1 blue ball. If you draw a ball at random without looking, what's the probability that it's neither red nor blue?


Figure 7. Example of probabilistic reasoning.

To run this prompt we used Gemini 2.5 Flash, in Google's Gemini Pro version. The training of this model certainly included the fundamentals of both basic arithmetic and probability, but what is most effective in teaching the model to solve this type of exercise are the millions of solved examples it has seen. Probability problems and their step-by-step solutions are the pattern it imitates when reconstructing this reasoning.
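For reference, the arithmetic the chain of thought must reproduce here is short: only the green balls are neither red nor blue, so

$$P(\text{neither red nor blue}) = P(\text{green}) = \frac{2}{3 + 2 + 1} = \frac{2}{6} = \frac{1}{3}.$$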

The Great Simulation

And now, let's turn to the criticism. In recent months, the debate about whether or not we can trust these simulated explanations has grown. Ideally, the chain of thought should faithfully reflect the internal process by which the model arrives at its answer, but there is no practical guarantee that this is the case.

The Anthropic team (creators of Claude, another large language model) carried out a trap experiment with Claude Sonnet in 2025, in which they slipped the model a key clue to the solution before activating the reasoned response.

Think of it like passing a student a note that says "the answer is [A]" before an exam. If they write on their exam that they chose [A] at least in part because of the note, that's good news: they are being honest and faithful. But if they write down what claims to be their reasoning process without mentioning the note, we might have a problem.

The percentage of times Claude Sonnet mentioned the clue in its deductions was only 25%. This shows that models sometimes generate explanations that sound convincing but do not correspond to the true internal logic by which they arrived at the solution; they are a posteriori rationalizations: first the model finds the solution, then it invents a coherent process for the user. This exposes the risk that the model may be hiding steps or information relevant to solving the problem.

Closing

Despite the limitations exposed by the study mentioned above, we cannot forget that the original Google Brain research documented that, when applying the chain of reasoning, the PaLM model improved its accuracy on mathematical problems from 17.9% to 58.1%. If, in addition, we combine this technique with searches over open data to bring in information external to the model, the reasoning becomes more verifiable, up to date and robust.

In any case, by making language models "think out loud", what we are always improving is the user experience in complex tasks. As long as we do not fall into excessive delegation of thought to AI, our own cognitive process can also benefit. It is also a technique that greatly facilitates our new role as supervisors of automatic processes.


Content prepared by Carmen Torrijos, expert in AI applied to language and communication. The contents and points of view reflected in this publication are the sole responsibility of the author.

Blog

Generative artificial intelligence is beginning to find its way into everyday applications ranging from virtual agents (or teams of virtual agents) that resolve queries when we call a customer service centre to intelligent assistants that automatically draft meeting summaries or report proposals in office environments.

These applications, often powered by large language models (LLMs), promise to revolutionise entire industries on the basis of huge productivity gains. However, their adoption brings new challenges because, unlike traditional software, a generative AI model does not follow fixed rules written by humans: its responses are based on statistical patterns learned from processing large volumes of data. This makes its behaviour less predictable and more difficult to explain, and sometimes leads to unexpected results, errors that are difficult to foresee, or responses that do not always align with the original intentions of the system's creator.

Therefore, the validation of these applications from multiple perspectives such as ethics, security or consistency is essential to ensure confidence in the results of the systems we are creating in this new stage of digital transformation.

What needs to be validated in generative AI-based systems?

Validating generative AI-based systems means rigorously checking that they meet certain quality and accountability criteria before relying on them to solve sensitive tasks.

It is not only about verifying that they ‘work’, but also about making sure that they behave as expected, avoiding biases, protecting users, maintaining their stability over time, and complying with applicable ethical and legal standards. The need for comprehensive validation is a growing consensus among experts, researchers, regulators and industry: deploying AI reliably requires explicit standards, assessments and controls.

We summarize four key dimensions that need to be checked in generative AI-based systems to align their results with human expectations:

  • Ethics and fairness: a model must respect basic ethical principles and avoid harming individuals or groups. This involves detecting and mitigating biases in their responses so as not to perpetuate stereotypes or discrimination. It also requires filtering toxic or offensive content that could harm users. Equity is assessed by ensuring that the system offers consistent treatment to different demographics, without unduly favouring or excluding anyone.
  • Security and robustness: here we refer to both user safety (that the system does not generate dangerous recommendations or facilitate illicit activities) and technical robustness against errors and manipulations. A safe model must avoid instructions that lead, for example, to illegal behavior, reliably rejecting those requests. In addition, robustness means that the system can withstand adversarial attacks (such as requests designed to deceive you) and that it operates stably under different conditions.
  • Consistency and reliability: generative AI results must be coherent, consistent and correct. In applications such as medical diagnosis or legal assistance, it is not enough for the answer to sound convincing; it must be true and accurate. For this reason, aspects such as the logical coherence of the answers, their relevance to the question asked and the factual accuracy of the information are validated. Stability over time is also checked (two similar requests should produce equivalent results under the same conditions), as is resilience (small changes in the input should not cause substantially different outputs); a toy stability check is sketched after this list.
  • Transparency and explainability: To trust the decisions of an AI-based system, it is desirable to understand how and why it produces them. Transparency includes providing information about training data, known limitations, and model performance across different tests. Many companies are adopting the practice of publishing "model cards," which summarize how a system was designed and evaluated, including bias metrics, common errors, and recommended use cases. Explainability goes a step further and seeks to ensure that the model offers, when possible, understandable explanations of its results (for example, highlighting which data influenced a certain recommendation). Greater transparency and explainability increase accountability, allowing developers and third parties to audit the behavior of the system.
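As an illustration of what checking the consistency dimension can look like in practice, here is a minimal sketch of a paraphrase-stability test. The ask_model function is a hypothetical stand-in for whatever client the system under test exposes, and the similarity threshold is an assumed tolerance, not a standard value.

```python
import difflib

def ask_model(prompt: str) -> str:
    """Hypothetical stand-in for a call to the generative system under test.
    Replace the body with a real client call."""
    return "To renew your ID card you need an appointment, a photo and the old card."

# Pairs of paraphrased prompts that should produce equivalent answers.
PARAPHRASE_PAIRS = [
    ("What documents do I need to renew my ID card?",
     "Which paperwork is required to renew an ID card?"),
]

SIMILARITY_THRESHOLD = 0.8  # assumed tolerance; tune it for each use case

def stability_score(a: str, b: str) -> float:
    """Crude lexical similarity between two answers, between 0 and 1."""
    return difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()

for p1, p2 in PARAPHRASE_PAIRS:
    score = stability_score(ask_model(p1), ask_model(p2))
    status = "OK" if score >= SIMILARITY_THRESHOLD else "UNSTABLE"
    print(f"{status} (similarity {score:.2f}): {p1!r} vs {p2!r}")
```

A production harness would use semantic rather than lexical similarity, but the structure of the check is the same.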

Open data: transparency and more diverse evidence

Properly validating AI models and systems, particularly in terms of fairness and robustness, requires representative and diverse datasets that reflect the reality of different populations and scenarios.

On the other hand, if only the companies that own a system have data to test it, we have to rely on their own internal evaluations. However, when open datasets and public testing standards exist, the community (universities, regulators, independent developers, etc.) can test the systems autonomously, thus functioning as an independent counterweight that serves the interests of society.

A concrete example was given by Meta (Facebook) when it released its Casual Conversations v2 dataset in 2023. It is an open dataset, obtained with informed consent, that collects videos from people from 7 countries (Brazil, India, Indonesia, Mexico, Vietnam, the Philippines and the USA), with 5,567 participants who provided attributes such as age, gender, language and skin tone.

Meta's objective with the publication was precisely to make it easier for researchers to evaluate the impartiality and robustness of AI systems in vision and voice recognition. By expanding the geographic provenance of the data beyond the U.S., this resource allows researchers to check whether, for example, a facial recognition model works equally well with faces of different ethnicities, or whether a voice assistant understands accents from different regions.

The diversity that open data brings also helps to uncover neglected areas in AI assessment. Researchers from the Stanford Institute for Human-Centered Artificial Intelligence (HAI) showed in the HELM (Holistic Evaluation of Language Models) project that many language models are not evaluated on minority dialects of English or on underrepresented languages, simply because there is no quality data for them in the most well-known benchmarks.

The community can identify these gaps and create new test sets to fill them (e.g., an open dataset of FAQs in Swahili to validate the behavior of a multilingual chatbot). In this sense, HELM has incorporated broader evaluations precisely thanks to the availability of open data, making it possible to measure not only the performance of the models on common tasks, but also their behavior in other linguistic, cultural and social contexts. This has contributed to making visible the current limitations of the models and to promoting the development of systems that are more inclusive and representative of the real world, or models better adapted to the specific needs of local contexts, as in the case of the ALIA foundational model developed in Spain.

In short, open data contributes to democratizing the ability to audit AI systems, preventing the power of validation from residing in only a few hands. It also reduces costs and barriers: a small development team can test its model with open datasets without having to invest heavily in collecting its own data. This not only fosters innovation, but also ensures that local AI solutions from small businesses are subject to rigorous validation standards.

The validation of applications based on generative AI is today an unquestionable necessity to ensure that these tools operate in tune with our values and expectations. It is not a trivial process: it requires new methodologies, innovative metrics and, above all, a culture of responsibility around AI. But the benefits are clear: a rigorously validated AI system will be more trustworthy, both for the individual user who, for example, interacts with a chatbot without fear of receiving a toxic response, and for society as a whole, which can accept decisions based on these technologies knowing that they have been properly audited. And open data helps to cement this trust by fostering transparency, enriching evidence with diversity, and involving the entire community in the validation of AI systems.


Content prepared by Jose Luis Marín, Senior Consultant in Data, Strategy, Innovation & Digitalization. The contents and views reflected in this publication are the sole responsibility of the author.

Blog

Imagine you want to know how many terraces there are in your neighbourhood, how the pollen levels in the air you breathe every day are evolving or whether recycling in your city is working well. All this information exists in your municipality's databases, but it sits in spreadsheets and technical documents that only experts know how to interpret.

This is where open data visualisation initiatives come in: they transform those seemingly cold numbers into stories that anyone can understand at a glance. A colourful graph showing the evolution of traffic on your street, an interactive map showing the green areas of your city, or an infographic explaining how the municipal budget is spent. These tools make public information accessible, useful and, moreover, comprehensible to all citizens.

Moreover, the advantages of this type of solution are not only for the citizens, but also benefit the Administration that carries out the exercise, because it allows:

  • Detect and correct data errors.
  • Add new sets to the portal.
  • Reduce the number of questions from citizens.
  • Generate more trust on the part of society.

Therefore, visualising open data brings government closer to citizens, facilitates informed decision-making, helps public administrations to improve their open data offer and creates a more participatory society where we can all better understand how the public sector works. In this post, we present some examples of open data visualisation initiatives in regional and municipal open data portals.

Visualiza Madrid: bringing data closer to the public

Madrid City Council's open data portal has developed the initiative "Visualiza Madrid", a project born with the specific objective of making open data and its potential reach the general public, transcending specialised technical profiles. As Ascensión Hidalgo Bellota, Deputy Director General for Transparency of Madrid City Council, explained during the IV National Meeting on Open Data, "this initiative responds to the need to democratise access to public information".

Visualiza Madrid currently has 29 visualisations that cover different topics of interest to citizens, from information on hotel and restaurant terraces to waste management and urban traffic analysis. This thematic diversity demonstrates the versatility of visualisations as a tool for communicating information from very diverse sectors of public administration.

In addition, the initiative has received external recognition this year through the Audaz 2025 Awards, an initiative of the Spanish chapter of the Open Government Academic Network (RAGA Spain).

Castilla y León: comprehensive analysis of regional data

The Junta de Castilla y León has also developed a portal specialised in analysis and visualisations that stands out for its comprehensive approach to the presentation of regional data. Its visualisation platform offers a systematic approach to the analysis of regional information, allowing users to explore different dimensions of the reality of Castilla y León through interactive and dynamic tools.

This initiative allows complex information to be presented in a structured and understandable way, facilitating both academic analysis and citizen use of the data. The platform integrates different sources of regional information, creating a coherent ecosystem of visualisations that provides a panoramic view of different aspects of regional management. Among the topics it covers are tourism, the labour market and budget execution. All the visualisations are made with open datasets from the regional portal of Castilla y León.

The Castilla y León approach demonstrates how visualisations can serve as a tool for territorial analysis, providing valuable insights on economic, social and demographic dynamics that are fundamental for the planning and evaluation of regional public policies.

Canary Islands: technological integration with interactive widgets

On the other hand, the Government of the Canary Islands has opted for an innovative strategy through the implementation of widgets that allow the integration of open data visualisations of the Instituto Canario de Estadística (ISTAC) in different platforms and contexts. This technological approach represents a qualitative leap in the distribution and reuse of public data visualisations.

The widgets developed by the Canary Islands make it easier for third parties to embed official visualisations in their own applications, websites or analyses, exponentially expanding the scope and usefulness of Canary Islands open data. This strategy not only multiplies the points of access to public information, but also fosters the creation of a collaborative ecosystem where different actors can benefit from and contribute to the value of open data.

The Canarian initiative illustrates how technology can be used to create scalable and flexible solutions that maximise the impact of investments in open data visualisation, establishing a replicable model for other administrations seeking to amplify the reach of their transparency initiatives.

Lessons learned and best practices

By way of example, the cases analysed reveal common patterns that can serve as a guide for future initiatives. Orientation towards the general public, beyond specialised technical users, emerges as a key success factor for these platforms. To maintain the interest and relevance of the visualisations, it is important to offer thematic diversity and to update the data regularly.

Technological integration and interoperability, as demonstrated in the case of the Canary Islands, open up new possibilities to maximise the impact of public investments in data visualisation. Likewise, external recognition and participation in professional networks, as evidenced in the case of Madrid, contribute to continuous improvement and the exchange of best practices between administrations.

In general terms, open data visualisation initiatives represent a very valuable opportunity in the transparency and open government strategy of Spanish public administrations. The cases of Madrid, Castilla y León, as well as the Canary Islands, are examples of the enormous potential for transforming public data into tools for citizen empowerment and improved public management.

The success of these initiatives lies in their ability to connect government information with the real needs of citizens, creating bridges of understanding that strengthen the relationship between administration and society. As these experiences mature and consolidate, it will be crucial to keep the focus on the usability, accessibility and relevance of visualisations, ensuring that open data truly delivers on its promise to contribute to a more informed, participatory and democratic society.

Open data visualisation is not just a technical issue, but a strategic opportunity to redefine public communication and strengthen the foundations of a truly open and transparent administration.

Application

More Cancer Data is an initiative that brings together 24 scientific organisations and patient associations with the aim of developing an integrated model of knowledge about cancer. This project makes 69 key indicators available to the public, organised into six fundamental dimensions of the cancer journey: health promotion and primary prevention, secondary prevention and early detection, diagnosis, health care, follow-up, and end of life.

The main purpose of the application is to promote equity in access to cancer information throughout the country. Through the use of open data, it seeks to improve the accessibility, homogeneity and quality of the information available, thus facilitating more informed decision-making for both professionals and patients.

The indicators have been compiled from primary and official sources, which guarantees the reliability and quality of the data. The sources used include the National Statistics Institute (INE), the information system of the Ministry of Health, the Spanish Society of Haematology and Haemotherapy (SEHH) and the Spanish Society of Palliative Care (SECPAL), as well as other data from the Observatory of the Spanish Association Against Cancer and grey literature. All these resources make it possible to generate a broad and detailed overview of the cancer situation in Spain, accessible through this open data portal.

See the project results report here and the methodological information here.

News

Promoting data culture is a key objective at the national level, one that is also shared by the regional administrations. One way to achieve this is to reward solutions developed with open datasets, an initiative that enhances their reuse and impact on society.

With this mission in mind, the Junta de Castilla y León and the Basque Government have been organising open data competitions for years, a subject we discussed in the first episode of the datos.gob.es podcast, which you can listen to here.

In this post, we take a look at the winning projects in the latest editions of the open data competitions in the Basque Country and Castilla y León.

Winners of the 8th Castilla y León Open Data Competition

In the eighth edition of this annual competition, which usually opens at the end of summer, 35 entries were submitted, from which 8 winners were chosen in different categories.

Ideas category: participants had to describe an idea for creating studies, services, websites or applications for mobile devices. A first prize of €1,500 and a second prize of €500 were awarded.

  • First prize: Green Guardians of Castilla y León presented by Sergio José Ruiz Sainz. This is a proposal to develop a mobile application to guide visitors to the natural parks of Castilla y León. Users can access information (such as interactive maps with points of interest) as well as contribute useful data from their visit, which enriches the application.
  • Second prize: ParkNature: intelligent parking management system in natural spaces presented by Víctor Manuel Gutiérrez Martín. It consists of an idea to create an application that optimises the experience of visitors to the natural areas of Castilla y León, by integrating real-time data on parking and connecting with nearby cultural and tourist events.

Products and Services Category: Awarded studies, services, websites or applications for mobile devices, which must be accessible to all citizens via the web through a URL. In this category, first, second and third prizes of €2,500, €1,500 and €500 respectively were awarded, as well as a specific prize of €1,500 for students.

  • First prize: AquaCyL from Pablo Varela Vázquez. It is an application that provides information about the bathing areas in the autonomous community.
  • Second prize: ConquistaCyL presented by Markel Juaristi Mendarozketa and Maite del Corte Sanz. It is an interactive game designed for tourism in Castilla y León and learning through a gamified process.
  • Third prize: All the sport of Castilla y León presented by Laura Folgado Galache. It is an app that presents all the information of interest associated with a sport according to the province.
  • Student prize: Otto Wunderlich en Segovia by Jorge Martín Arévalo. It is a photographic repository of Otto Wunderlich's photographs, sorted by type of monument and location.

Didactic Resource Category: consisted of the creation of new and innovative open didactic resources to support classroom teaching. These resources were to be published under Creative Commons licences. A single first prize of €1,500 was awarded in this category.

  • First prize: StartUp CyL: Business creation through Artificial Intelligence and Open Data presented by José María Pérez Ramos. It is a chatbot that uses the ChatGPT API to assist in setting up a business using open data.

Data Journalism category: awarded to journalistic pieces, published or substantially updated, in written or audiovisual media, with a prize of €1,500.

Winners of the 5th edition of the Open Data Euskadi Open Data Competition

As in previous editions, the Basque open data portal opened two prize modalities: an ideas competition and an applications competition, each divided into several categories. On this occasion, 41 entries were submitted to the ideas competition and 30 to the applications competition.

Ideas competition: two prizes, of €3,000 and €1,500, were awarded in each category.

Health and Social Category

Environment and Sustainability Category

  • First prize: Baratzapp by Leire Zubizarreta Barrenetxea. The idea consists of developing software that facilitates and assists in planning a vegetable garden, using algorithms that enhance knowledge related to self-consumption gardens while integrating, among other things, climatological, environmental and plot information personalised for the user.
  • Second prize: Euskal Advice by Javier Carpintero Ordoñez. The aim of this proposal is to define a tourism recommender based on artificial intelligence.

General Category

  • First prize: Lanbila by Hodei Gonçalves Barkaiztegi. It is a proposed app that uses generative AI and open data to match curricula vitae with job offers semantically. It provides personalised recommendations, proactive employment and training alerts, and enables informed decisions through labour and territorial indicators.
  • Second prize: Development of an LLM for the interactive consultation of Open Data of the Basque Government by Ibai Alberdi Martín. The proposal consists of developing a Large Language Model (LLM) similar to ChatGPT, specifically trained with open data, focused on providing a conversational and graphical interface that gives users accurate answers and dynamic visualisations.

Applications competition: this modality has selected one project in the web services category, awarded with €8,000, and two more in the General Category, which have received a first prize of €8,000 and a second prize of €5,000.

Web Services Category

General Category

  • First prize: Garbiñe AI by Beatriz Arenal Redondo. It is an intelligent assistant that combines Artificial Intelligence (AI) with open data from Open Data Euskadi to promote the circular economy and improve recycling rates in the Basque Country.
  • Second prize: Vitoria-Gasteiz Businessmap by Zaira Gil Ozaeta. It is an interactive visualisation tool based on open data, designed to improve strategic decisions in the field of entrepreneurship and economic activity in Vitoria-Gasteiz.

All these award-winning solutions reuse open datasets from the regional portal of Castilla y León or Euskadi, as the case may be. We encourage you to take a look at the proposals that may inspire you to participate in the next edition of these competitions. Follow us on social media so you don't miss out on this year's calls!

Blog

Open data can transform how we interact with our cities, offering opportunities to improve quality of life. When made publicly available, they enable the development of innovative applications and tools that address urban challenges, from accessibility to road safety and participation.

Real-time information can have positive impacts on citizens. For example, applications that use open data can suggest the most efficient routes, considering factors such as traffic and ongoing construction; information on the accessibility of public spaces can improve mobility for people with disabilities; data on cycling or pedestrian routes encourages greener and healthier modes of transport; and access to urban data can empower citizens to participate in decision-making about their city. In other words, citizen use of open data not only improves the efficiency of the city and its services, but also promotes a more inclusive, sustainable and participatory city.

To illustrate these ideas, this article discusses maps for "navigating" cities, made with open data. In other words, it presents initiatives that improve the relationship between citizens and their urban environment in areas such as accessibility, school safety and citizen participation. The first project is Mapcesible, which allows users to map and assess the accessibility of different locations in Spain. The second is Eskola BideApp, a mobile application designed to support safe school routes. And finally, a set of maps that promote transparency and citizen participation in urban management: the first identifies noise pollution, the second locates the services available within 15 minutes of any point, and the third displays benches in the city. These maps use a variety of public data sources to provide a detailed overview of different aspects of urban life.

The first initiative is a project of a large foundation, the second a collaborative and local proposal, and the third a personal project. Although they are based on very different approaches, all three have in common the use of public and open data and the vocation to help people understand and experience the city. The variety of origins of these projects indicates that the use of public and open data is not limited to large organisations.

Below is a summary of each project, followed by a comparison and reflection on the use of public and open data in urban environments.

Mapcesible, map for people with reduced mobility

Mapcesible was launched in 2019 to assess the accessibility of various spaces such as shops, public toilets, car parks, accommodation, restaurants, cultural spaces and natural environments.

Figure 1. Mapcesible. Source: https://mapcesible.fundaciontelefonica.com/intro

This project is supported by organizations such as the NGO Spanish Confederation of People with Physical and Organic Disabilities (COCEMFE) and the company ILUNION. It currently has more than 40,000 evaluated accessible spaces and thousands of users.

Figure 2. Mapcesible. Source: https://mapcesible.fundaciontelefonica.com/filters

Mapcesible uses open data as part of its operation. Specifically, the application incorporates fourteen datasets from official bodies, including the Ministry of Agriculture and Environment, city councils of different cities (including Madrid and Barcelona) and regional governments. This open data is combined with information provided by the users of the application, who can map and evaluate the accessibility of the places they visit. This combination of official data and citizen collaboration allows Mapcesible to provide up-to-date and detailed information on the accessibility of various spaces throughout Spain, thus benefiting people with reduced mobility.

Eskola BideAPP, application to define safe school routes.

Eskola BideAPP is an application developed by Montera34 (a team dedicated to data visualisation and the development of collaborative projects) in alliance with the Solasgune Association to support school routes, helping ensure that children can get to their schools safely and efficiently. The project mainly uses public data from OpenStreetMap (geographical and cartographic data on streets, pavements and crossings), together with data collected during the process of creating safe walking routes to school, in order to promote children's autonomy and sustainable mobility.
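To give an idea of the kind of OpenStreetMap data involved, here is a minimal sketch that queries the public Overpass API for pedestrian crossings around a point; the coordinates (central Bilbao) and the 500-metre radius are illustrative assumptions, not values from the project.

```python
# pip install requests
import requests

OVERPASS_URL = "https://overpass-api.de/api/interpreter"

# Find pedestrian crossings within 500 m of a point in central Bilbao
# (coordinates and radius are illustrative).
query = """
[out:json][timeout:25];
node["highway"="crossing"](around:500,43.2630,-2.9350);
out;
"""

response = requests.post(OVERPASS_URL, data={"data": query})
response.raise_for_status()

for element in response.json()["elements"]:
    print(f"Crossing at lat={element['lat']}, lon={element['lon']}")
```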

The application offers an interactive dashboard to visualise the collected data, the generation of paper maps for sessions with students, and the creation of reports for municipal technicians. It uses technologies such as QGIS (a free and open source geographic information system) and a development environment for the R programming language, dedicated to statistical computing and graphics.

The project is divided into three main stages:

  1. Data collection through questionnaires in classrooms.
  2. Analysis and discussion of results with the children to co-design personalised routes.
  3. Testing of the designed routes.

Figure 3. Eskola BideaAPP. Photo by Julián Maguna (Solasgune). Source: https://montera34.com/project/eskola-bideapp/

Pablo Rey, one of the promoters of Montera34 together with Alfonso Sánchez, reports for this article that Eskola BideAPP, since 2019, has been used in eight municipalities, including Derio, Erandio, Galdakao, Gatika, Plentzia, Leioa, Sopela and Bilbao. However, it is currently only operational in the latter two. "The idea is to implement it in Portugalete at the beginning of 2025," he adds.

It's worth noting the maps from Montera34 that illustrated the effect of Airbnb in San Sebastián and other cities, as well as the data analyses and maps published during the COVID-19 pandemic, which also visualized public data. In addition, Montera34 has used public data to analyse abstention, school segregation and minor contracts, and to make open data available to the public. For this last project, Montera34 started with the ordinances of the Bilbao City Council and the minutes of its plenary sessions, so that they are available not only as PDF documents but also as open, accessible data.

Madrid maps of noise pollution, services and bench locations

Abel Vázquez Montoro has made several very interesting maps with open data, for example one built with data from the Strategic Noise Map (MER) published by Madrid City Council together with land registry data. The map shows the noise affecting each building, façade and floor in Madrid.

Figure 4. Noise maps in Madrid. Source: https://madb.netlify.app/.

This map is organised as a dashboard with three sections: general data of the area visible on the map, dynamic 2D and 3D map with configurable options and detailed information on specific buildings. It is an open, free, non-commercial platform that uses free and open source software such as GitLab - a web-based Git repository management platform - and QGIS. The map allows the assessment of compliance with noise regulations and the impact on quality of life, as it also calculates the health risk associated with noise levels, using the attributable risk ratio (AR%).
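Assuming the standard epidemiological definition of attributable risk among the exposed (the source does not spell out the exact formula the platform uses), for an exposed group with relative risk $RR$ the measure is

$$AR\% = \frac{RR - 1}{RR} \times 100$$

that is, the share of cases among those exposed to high noise levels that can be attributed to the exposure itself.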

15-minCity is another interactive map that visualises the concept of the "15-minute city" applied to different urban areas, i.e. it calculates how accessible different services are within a 15-minute walking or cycling radius from any point in the selected city.

Figure 5. 15-minCity. Source: https://whatif.sonycsl.it/15mincity/15min.php?idcity=9166
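The underlying check is easy to approximate: at a walking speed of about 4.8 km/h, 15 minutes covers roughly 1.2 km. Below is a minimal sketch using straight-line (haversine) distance; real tools like 15-minCity compute travel time over the actual street network, and all coordinates here are illustrative.

```python
import math

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

WALKING_SPEED_KMH = 4.8
RADIUS_KM = WALKING_SPEED_KMH * 15 / 60  # ~1.2 km reachable in 15 minutes

home = (40.4168, -3.7038)  # illustrative point in central Madrid
services = {
    "pharmacy": (40.4200, -3.7005),
    "library": (40.4310, -3.6900),
}

for name, (lat, lon) in services.items():
    distance = haversine_km(home[0], home[1], lat, lon)
    verdict = "within" if distance <= RADIUS_KM else "outside"
    print(f"{name}: {distance:.2f} km -> {verdict} a 15-minute walk")
```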

Finally, "Where to sit in Madrid" is another interactive map that shows the location of benches and other places to sit in public spaces in Madrid, highlighting the differences between rich (generally with more public seating) and poor (with less) neighbourhoods. This map uses the map-making tool, Felt, to visualise and share geospatial information in an accessible way. The map presents different types of seating, including traditional benches, individual seats, bleachers and other types of seating structures.

Figure 6. Where to sit in Madrid. Source: https://felt.com/map/Donde-sentarse-en-Madrid-TJx8NGCpRICRuiAR3R1WKC?loc=40.39689,-3.66392,13.97z

Vázquez Montoro's maps visualise public data on demographics (population by age, gender and nationality), urban information on land use, buildings and public spaces, socio-economic data (income, employment and other economic indicators by district and neighbourhood), environmental data (air quality, green spaces and related aspects) and mobility.

What do they have in common?

| Name | Promoter | Type of data used | Profit motive | Users | Characteristics |
| --- | --- | --- | --- | --- | --- |
| Mapcesible | Telefónica Foundation | Combines user-generated data and public data (14 open datasets from government agencies) | Non-profit | More than 5,000 | Collaborative app, available on iOS and Android; more than 40,000 mapped accessible points |
| Eskola BideAPP | Montera34 and Solasgune Association | Combines user-generated data (classroom questionnaires) and some public data | Non-profit | 4,185 | Focused on safe school routes; uses QGIS and R for data processing |
| Mapa Estratégico de Ruido (MER) | Madrid City Council | 2D and 3D geographic and visible-area data | Non-profit | No data | Allows assessment of compliance with noise regulations and impact on quality of life; also calculates the associated health risk |
| 15-minCity | Sony CSL | Geographic and services data | Non-profit | No data | Interactive map visualising the "15-minute city" concept applied to different urban areas |
| MAdB "Dónde sentarse en Madrid" | Private | Public data (demographic, electoral, urban, socio-economic, etc.) | Non-profit | No data | Interactive maps of Madrid |

Figure 7. Comparative table of solutions 

These projects share the approach of using open data to improve access to urban services, although they differ in their specific objectives and in the way information is collected and presented. Each contributes value in its own way.

On the one hand, Mapcesible offers unified and updated information that allows people with disabilities to move around the city and access services. Eskola BideApp involves the community in the design and testing of safe routes for walking to school; this not only improves road safety, but also empowers young people to be active agents in urban planning. In the meantime, 15-min city, MER and the maps developed by Vázquez Montoro visualise complex data about Madrid so that citizens can better understand how their city works and how decisions that affect them are made.

Overall, the value of these projects lies in their ability to create a data culture, teaching how to value, interpret and use information to improve communities.


Content created by Miren Gutiérrez, PhD and researcher at the University of Deusto, expert in data activism, data justice, data literacy, and gender disinformation. The contents and viewpoints reflected in this publication are the sole responsibility of the author.

News

The Basque Government announces the fifth edition of the awards for the best open data reuse projects in the Basque Country. An initiative that was created with the aim of rewarding the best ideas and applications/services created from the open data catalogue of the Basque Country (Open Data Euskadi) to show its potential and promote open data culture.

As in previous editions, there are two types of prizes: an ideas competition and an applications competition. In the first, €13,500 in prize money will be awarded; in the second, the prize money amounts to €21,000.

Below are the details of the call for proposals for each of the modalities:

Competition of ideas

Proposals for services, studies, visualisations and applications (web and mobile) that reuse open datasets from the Open Data Euskadi portal to provide value to society will be evaluated. Ideas may be of general utility or focus on one of two sectors: health and social or environment and sustainability.

  • Who is it aimed at? To all those people or companies from inside and outside the Basque Country who wish to present ideas and projects for the reuse of open data from the Basque Country. This modality does not require technical knowledge of programming or computer development.
  • How can you take part? It will be necessary to explain the idea in a text document and attach it when registering. Registration can be done either online or in person.
  • What prizes are on offer? Two winning projects will be chosen in each category, with a first prize of €3,000 and a second prize of €1,500. In summary, the awards are:
  • Health and social category
    • First prize: €3,000
    • Second prize: €1,500
  • Environment and sustainability category
    • First prize: €3,000
    • Second prize: €1,500
  • General category
    • First prize: €3,000
    • Second prize: €1,500

Here you can read the rules of the Open Data Euskadi ideas competition: https://www.euskadi.eus/servicios/1028505

Application competition

This modality does require some technical knowledge of programming or software development, as already developed solutions using Open Data Euskadi open datasets must be presented. Applications may be submitted in the general category or in the specific category of web services.

  • Who is it aimed at? To those people or companies capable of creating services, studies, visualisations, web applications or applications for mobile devices that use at least one set of open data from one of the Basque Country's open data catalogues.
  • How can you take part? The project must be explained in a text document, and the developed project (service, study, visualisation, web or mobile application) must be accessible via a URL. When registering, both the explanatory document and the project URL must be attached.
  • What prizes are on offer? This category offers a single prize of 8,000 euros for the web services category and two prizes for the general category of 8,000 and 5,000 euros.
    • Web services category

      • Only one prize: €8,000
    • General category
      • First prize: €8,000
      • Second prize: €5,000

Check here the rules of the competition in application development mode: https://www.euskadi.eus/servicios/1028605

Deadline for registration:

The competition has been accepting proposals since 31 July and closes on 10 October. Follow us on social media so you don't miss any news about open data reuse events and competitions: @datosgob

Take part!

Documentation

The digital revolution is transforming municipal services, driven by the increasing adoption of artificial intelligence (AI) technologies that also benefit from open data. These developments have the potential to redefine the way municipalities deliver services to their citizens, providing tools to improve efficiency, accessibility and sustainability. This report looks at success stories in the deployment of applications and platforms that seek to improve various aspects of life in municipalities, highlighting their potential to unlock more of the vast untapped potential of open data and associated artificial intelligence technologies.

The applications and platforms described in this report have a high potential for replicability in different municipal contexts, as they address common problems. Replication of these solutions can take place through collaboration between municipalities, companies and developers, as well as through the release and standardisation of open data.

Despite the benefits, the adoption of open data for municipal innovation also presents significant challenges. The quality, updating and standardisation of data published by local authorities, as well as interoperability between different platforms and systems, must be ensured. In addition, the open data culture needs to be reinforced among all actors involved, including citizens, developers, businesses and public administrations themselves.

The use cases analysed are divided into four sections. Each of these sections is described below and some examples of the solutions included in the report are shown.

Transport and Mobility

One of the most significant challenges in urban areas is transport and mobility management. Applications using open data have proven to be effective in improving these services. For example, applications such as Park4Dis make it easy to locate parking spaces for people with reduced mobility, using data from multiple municipalities and contributions from volunteers. CityMapper, which has gone global, offers optimised public transport routes in real time, integrating data from various transport modes to provide the most efficient route. These applications not only improve mobility, but also contribute to sustainability by reducing congestion and carbon emissions.

Environment and Sustainability

Growing awareness of sustainability has spurred the development of applications that promote environmentally friendly practices. CleanSpot, for example, facilitates the location of recycling points and the management of municipal waste. The application encourages citizen participation in cleaning and recycling, contributing to the reduction of the ecological footprint. Liight gamifies sustainable behaviour by rewarding users for actions such as recycling or using public transport. These applications not only improve environmental management, but also educate and motivate citizens to adopt more sustainable habits.

Optimisation of Basic Public Services

Urban service management platforms, such as Gestdropper, use open data to monitor and control urban infrastructure in real time. These tools enable more efficient management of resources such as street lighting, water networks and street furniture, optimising maintenance, incident response and reducing operating costs. Moreover, the deployment of appointment management systems, such as CitaME, helps to reduce waiting times and improve efficiency in customer service.

Citizen Services Aggregators

Applications that centralise public information and services, such as Badajoz Es Más and AppValencia, improve accessibility and communication between administrations and citizens. These platforms provide real-time data on public transport, cultural events, tourism and administrative procedures, making life in the municipality easier for residents and tourists alike. For example, integrating multiple services into a single application improves efficiency and reduces the need for unnecessary travel. These tools also support local economies by promoting cultural events and commercial services.

Conclusions

The use of open data and artificial intelligence technologies is transforming municipal management, improving the efficiency, accessibility and sustainability of public services. The success stories presented in this report describe how these tools can benefit both citizens and public administrations by making cities smarter, more inclusive and sustainable environments, and more responsive to the needs and well-being of their inhabitants and visitors.

Download the accessible version of the report here.

Listen to the podcast (only available in Spanish).


Application

ELISA: The Plan in figures is a tool launched by the Spanish government to visualise updated data on the implementation of the investments of the Recovery, Transformation and Resilience Plan (PRTR). Through intuitive visualisations, this tool provides information on the number of companies and households that have received funding, the size of the beneficiary companies and the investments made in the different levers of action defined in the Plan.

The tool also provides details of the funds managed and executed in each Autonomous Community. In this way, the territorial distribution of the projects can be seen. In addition, the tool is accompanied by territorial sheets, which show a more qualitative detail of the impact of the Recovery Plan in each Autonomous Community.
