AI to improve public tenders and sculptures using open data: we spoke with the Spanish semi-finalists of the EU Datathon 2022

News date: 27-12-2022

Image: EU Datathon 2022 participants

After several months of competition, the open data contest organised by the EU came to an end on 20 October. The EU Datathon gives developers and data scientists the chance to demonstrate, through their creativity, the potential of open data.

Although the winning projects are covered in detail in a separate post, here we would like to highlight the participation of two Spanish developers whose initiatives were chosen as semi-finalists from among the 156 proposals initially submitted.

In an edition that broke participation records, both in the number of contestants and in the number of countries of origin, Antonio Moneo and Manuel José García represented Spain with two projects that stood out for their innovative reuse of open data.

Using Artificial Intelligence to optimally solve public tenders

Manuel José García has a PhD in Telecommunications Engineering from the University of Oviedo and currently works as a data scientist at the technology consultancy NTT Data. After winning first prize in the Euskadi Open Data contest in 2020, García decided to take part in the European datathon, drawing on the research from his PhD thesis, which gave rise to the project 'Detection of irregular tenders in Spain through big data analytics and artificial intelligence'.

“It is an initiative that uses Big Data and Artificial Intelligence (AI) to analyse the data from public tenders and to automatically recommend the companies that can best undertake a tender. With this in mind, a search engine has been created for companies that can carry out a tender: a form is filled out describing the details that characterise the public tender and, from that point onwards, the programme seeks the most suitable companies to carry out the project”, describes Manuel José García. He adds that the list of companies recommended for each tender is possible because the AI model has been trained on a history of hundreds of thousands of past tenders and their winning companies, learning what type of companies win tenders and what characteristics they have.

An essential requirement in order to participate in the European datathon is to use information from the data catalogues that both Europe and Spain make available to the public at a national, regional and local level. In the specific case of Manuel José García, his project has been developed using the public tender data available at the Public Procurement Platforms.
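
As a rough illustration of the recommendation idea García describes, the sketch below ranks companies by how often they won past tenders similar to a new one. It is not García's model: the tiny `HISTORY` list, the company names and the word-overlap (Jaccard) similarity are all hypothetical assumptions chosen for brevity, standing in for a model trained on hundreds of thousands of real tenders.

```python
from collections import defaultdict

# Hypothetical history of (tender description, winning company) pairs.
HISTORY = [
    ("road maintenance and asphalt repair", "Company A"),
    ("asphalt resurfacing of municipal road", "Company A"),
    ("school cleaning services", "Company B"),
    ("cleaning services for municipal offices", "Company B"),
    ("road signage installation", "Company C"),
]


def jaccard(a: set, b: set) -> float:
    """Word-overlap similarity between two token sets (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(description: str, history=HISTORY, top_n: int = 3):
    """Rank companies by the summed similarity of the tenders they won."""
    query = set(description.lower().split())
    scores = defaultdict(float)
    for past_description, winner in history:
        past_tokens = set(past_description.lower().split())
        scores[winner] += jaccard(query, past_tokens)
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    # Drop companies with no overlap at all with the new tender.
    return [(company, score) for company, score in ranked[:top_n] if score > 0]


for company, score in recommend("asphalt repair of a road"):
    print(f"{company}: {score:.2f}")
```

Summing similarities means a company that has won several comparable tenders ranks above one that has won a single close match. That is one possible design choice among many, and a real system would also use the business data mentioned in the article.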

“The project has been developed using two types of data sources. On the one hand, the public and free data of the tenders and, on the other, the business data required to search for and characterise the companies present in the search engine. In particular, the annual accounts that companies must submit to the Registrar of Companies have been used. These data are public but paid, and there are growing calls to make them free, as they are managed by a public entity”, comments the data scientist.

In fact, it is precisely the data that posed a challenge in taking the project forward: “Getting structured data from public tenders is complicated, since the open data format of the Spanish Procurement Platform is difficult to handle. In addition, the data need thorough cleaning because their quality is low”, he points out.

In his opinion, if public administrations wish to promote the reuse of open data, “they must promote a data culture. In other words, they must be aware of the importance of the data they handle and store and, in turn, be proactive in exploiting those data and making them available to third parties”.

Architecture and open data to make the Sustainable Development Goals visible

In addition to being the Director of Change Management and Advanced Analysis at BBVA, Antonio Moneo was also a semi-finalist in the latest edition of the European datathon thanks to a project that merges art with the dissemination of open data.

“Tangible Data is an initiative whose goal is to convert emblematic data series into physical sculptures and thereby lend visibility to issues such as climate change, inequality or the transparency of our governments. Against a backdrop of excess information and a growing digital divide, it is essential to rely on the physical environment to explain what is happening in the world”, explains Antonio Moneo, who stresses that “representing data in a sculpture allows us to present a challenge from an objective, respectful perspective”.

When selecting the open data sets, Moneo was clear that he wanted to make visible the realities and statistics related to the Sustainable Development Goals, so that his project would fulfil the social purpose that led him to participate in the datathon: “We use open data from reliable sources that are properly licensed under Creative Commons or MIT licences. Sometimes we have used a private data source, but our objective is to enhance the information that is already available. In addition, we usually use the data as they are published and only apply some transformations, such as smoothing the curves with moving averages, which makes the sculptures more pleasant to the touch, and, of course, techniques to create three-dimensional volumes, which are the basis for the sculptures”, he observed.
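
The moving-average smoothing Moneo mentions can be sketched as follows. This is a minimal illustration, not Tangible Data's actual pipeline: the function name, the centred window that shrinks at the edges, and the sample values are all assumptions made for the example.

```python
def moving_average(values, window=3):
    """Centred moving average; `window` is assumed to be odd.

    Near the ends of the series the window shrinks so that the output
    has the same length as the input.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - half)
        end = min(len(values), i + half + 1)
        chunk = values[start:end]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical jagged series, not real data.
series = [1.0, 5.0, 2.0, 6.0, 3.0]
print(moving_average(series, window=3))
```

A wider window gives a gentler curve, which would suit a tactile object, but it also flattens peaks, so the choice trades smoothness against fidelity to the published figures.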

Hence, carrying out Tangible Data has meant, on the one hand, building a physical structure and, on the other, making that structure invite the viewer into the digital realm, which is ultimately where the information it seeks to make visible can be found. “The first step is to design a 3D model in virtual format, which we send off to be produced locally using the FabLabs network. Later, we include a QR code in the sculpture that allows the audience to explore in depth the meaning of the data it reflects”, explained the promoter of the project.

Developing such an ambitious project, both physically and in terms of the information it conveys, is no easy task. It is not only a matter of working out the design of the sculpture itself, but also of finding the data needed to convey the reality it seeks to represent: “Comparability is one of the biggest challenges we have encountered because, in many cases, the most relevant data for measuring the environment are not comparable. Sometimes we find data at a regional level, sometimes at a national or local level, but it is not always possible to find all the information you need. This is why, in order to solve this challenge, we have invested more time in searching for data and, in many cases, the initial idea for a sculpture has been modified because we could not find data of sufficient quality”, he went on to say.

Moneo also commented that the other major difficulty in developing the project has been access to up-to-date data. “Keeping data up to date is always a critical issue, but right now it is even more so. The consequences of COVID, the war in Ukraine and the current energy crisis paint a very different world from the one we encountered in 2015, when the Sustainable Development Goals were adopted. For example, it is estimated that, as a consequence of the pandemic, between 70 and 150 million people will fall into extreme poverty (living on less than 1.9 dollars a day). This change, which bucks the trend of the last three decades, is not yet reflected in the World Bank statistics, as they only go up to 2019. It would thus seem that very relevant data reflect a distorted reality”, he concluded.

A positive review of their time in the EU Datathon

Despite not having reached the final, which would have allowed them to compete for a share of the total prize of 200,000 euros, the two participants agreed that their experience in the datathon was more than positive. Manuel José García believes that “the European Commission must continue to commit to these initiatives so that people are aware of the value of data and the challenges that they can solve”, while Antonio Moneo points out that “this type of action motivates the agencies that drive data and the people who develop with it to improve the impact of data on society”.

What's more, both participants have managed to stimulate their professional curiosity thanks to this challenge, whilst simultaneously testing the potential and quality of their respective work vis-à-vis European data experts.