Tupreciodeluz.com offers daily information on the price of electricity, showing the cheapest price of the day, as well as the average price of the last 24 hours and the most expensive time slot. The aim is to allow consumers adhering to the regulated market tariff (PVPC) to modulate their consumption in order to save on their electricity bill.
The website also features an artificial intelligence assistant for energy advice, and measures to promote efficiency and responsible energy consumption.
In addition, consumers can use a solar sizer to find out the feasibility of installing solar energy in their home or business.
The website also has a blog where the most relevant news for consumers is published in a summarised and entertaining way.
In this episode we will delve into the importance of three related categories of high-value datasets. These are Earth observation and environmental data, geospatial data and mobility data. To tell us about them, we have interviewed two experts in the field:
- Paloma Abad Power, Deputy Director of the National Centre for Geographic Information (CNIG).
- Rafael Martínez Cebolla, geographer of the Government of Aragón.
With them we have explored how these high-value datasets are transforming our environment, contributing to sustainable development and technological innovation.
Listen to the full podcast (only available in Spanish)
Summary of the interview
1. What are high-value datasets and why are they important?
Paloma Abad Power: According to the regulation, high-value datasets are those with the highest socio-economic potential and, for this, they must be easy to find, i.e. they must be accessible, interoperable and usable. And what does this mean? It means that the datasets must have their descriptions available online, i.e. metadata, which report their characteristics and properties, and that the data can be easily downloaded or used.
In many cases, these data are often reference data, i.e. data that serve to generate other types of data, such as thematic data, or can generate added value.
Rafael Martínez Cebolla: They could be defined as those datasets that represent phenomena that are useful for decision making, for any public policy or for any action that a natural or legal person may undertake.
In this sense, there are already some directives, which are not so recent, such as the Water Framework Directive or the INSPIRE Directive, which motivated this need to provide shared data under standards that drive the sustainable development of our society.
2. These high-value data are defined by a European Directive and an Implementing Regulation which established six categories of high-value datasets. On this occasion we will focus on three of them: Earth observation and environmental data, geospatial data and mobility data. What do these three categories of data have in common and what specific datasets do they cover?
Paloma Abad Power: In my opinion, these data have in common the geographical component, i.e. they are data located on the ground and therefore serve to solve problems of different nature and linked to society.
Thus, for example, we have, with national coverage, the National Aerial Orthophotography Plan (PNOA), which are the aerial images, the System of Land Occupation Information (SIOSE), cadastral parcels, boundary lines, geographical names, roads, postal addresses, protected sites - which can be both environmental and also castles, i.e. historical heritage- etc. And these categories cover almost all the themes defined by the annexes of the INSPIRE directive.
Rafael Martínez Cebolla: It is necessary to distinguish pure geographic information, with a direct geographic reference, from other types of phenomena that have indirect geographic references. In today's world, 90% of information can be located, either directly or indirectly. Today more than ever, geographic tagging is mandatory for any corporation that wants to undertake a certain activity, be it social, cultural, environmental or economic: the implementation of renewable energies, where I am going to eat today, etc. These high-value datasets enhance these geographical references, especially those of an indirect nature, which help us to make a decision.
3. Which agencies publish these high-value datasets? In other words, where could a user locate datasets in these categories?
Paloma Abad Power: It is necessary to highlight the role of the National Cartographic System, which is a model of action in which the organisations of the General State Administration (AGE) and the autonomous communities participate. It coordinates the co-production of many unique products, funded by these organisations.
These products are published through interoperable web services. They are published, in this case, by the National Center for Geographic Information (CNIG), which is also responsible for much of the metadata for these products.
They could be located through the Catalogues of the IDEE (Spatial Data Infrastructure of Spain) or the Official Catalogue of INSPIRE Data and Services, which is also included in datos.gob.es and the European Data Portal.
And who can publish? All bodies that have a legal mandate for a product classified under the Regulation. Examples: all the mapping bodies of the Autonomous Communities, the General Directorate of Cadastre, Historical Heritage, the National Statistics Institute, the Geological and Mining Institute (IGME), the Hydrographic Institute of the Navy, the Ministry of Agriculture, Fisheries and Food (MAPA), the Ministry for Ecological Transition and the Demographic Challenge, etc. There are a multitude of organisations and many of them, as I have mentioned, participate in the National Cartographic System, provide the data and generate a single service for the citizen.
Rafael Martínez Cebolla: The National Cartographic System defines very well the degree of competences assumed by the administrations. In other words, the public administration at all levels provides official data, assisted by private enterprise, sometimes through public procurement.
The General State Administration goes up to scales of 1:25,000 in the case of the National Geographic Institute (IGN) and then the distribution of competencies for the rest of the scales is for the autonomous or local administrations. In addition, there are a number of actors, such as hydrographic confederations, state departments or the Cadastre, which have under their competences the legal obligation to generate these datasets.
For me it is an example of how it should be distributed, although it is true that it is then necessary to coordinate very well, through collegiate bodies, so that the cartographic production is well integrated.
Paloma Abad Power: There are also collaborative projects, such as, for example, a citizen map, technically known as an X, Y, Z map, which consists of capturing the mapping of all organisations at national and local level. That is, from small scales 1:1,000,000 or 1:50,000,000 to very large scales, such as 1:1000, to provide the citizen with a single multi-scale map that can be served through interoperable and standardised web services.
4. Do you have any other examples of direct application of this type of data?
Rafael Martínez Cebolla: A clear example was seen with the pandemic, with the mobility data published by the National Institute of Statistics. These were very useful data for the administration, for decision making, and from which we have to learn much more for the management of future pandemics and crises, including economic crises. We need to learn and develop our early warning systems.
I believe that this is the line of work: data that is useful for the general public. That is why I say that mobility has been a clear example, because it was the citizen himself who was informing the administration about how he was moving.
Paloma Abad Power: I am going to contribute some data. For example, according to statistics from the National Cartographic System services, the most demanded data are aerial images and digital terrain models. In 2022 there were 8 million requests and in 2023 there were 19 million requests for orthoimages alone.
Rafael Martínez Cebolla: I would like to add that this increase is also because things are being done well. On the one hand, discovery systems are improved. My general feeling is that there are many successful example projects, both from the administration itself and from companies that need this basic information to generate their products.
There was an application that was generated very quickly during the COVID de-escalation, when people wanted to get out and walk: you went to a website and it told you how far you could go within your municipality. This example arose from spatial data that had moved out of the public administration. I believe that this is the importance of successful examples, which come from people who see a compelling need.
5. And how do you incentivise such re-use?
Rafael Martínez Cebolla: I have countless examples. Incentivisation also involves promotion and marketing, something that has sometimes failed us in the public administration. You stick to certain competences and it seems that just putting the data on a website is enough. And it is not enough.
We are incentivising re-use in two ways. On the one hand, internally, within the administration itself, showing people that geographic information is useful for planning and evaluating public policies. One example is the Public Health Atlas of the Government of Aragon, which received an award from an Iberian epidemiology society the year before the pandemic. It was useful for them to know what the health of the Aragonese was like and what preventive measures they had to take.
As for the external incentives, in the case of the Geographic Institute of Aragon, it was seen that the profile entering the geoportal was very technical. The formats used were also very technical, which meant that the general public was not reached. To solve this problem, we promoted portals such as the IDE didactica, a portal for teaching geography, which reaches any citizen who wants to learn about the territory of Aragon.
Paloma Abad Power: I would like to highlight the economic benefit of this, as was shown, for example, in the economic study carried out by the National Centre for Geographic Information with the University of Leuven to measure the economic benefit of the Spatial Data Infrastructure of Spain. It measured the benefit for private companies of using free and open services rather than, for example, Google Maps or other non-open sources.
Rafael Martínez Cebolla: For better and for worse, because sometimes we wish the quality of the official data were better. Both Paloma in the General State Administration and I in the regional administration know that there are official datasets in which more money needs to be invested so that their quality improves and they become more reusable.
But it is true that these studies are key to understanding the dimension in which high-value datasets move. That is to say, having studies that report on the real benefit of having a spatial data infrastructure at state or regional level is, for me, key for two things: for citizens to understand its importance and, above all, for the politicians who arrive every N years to understand the evolution of these platforms and the revolution in geospatial information that we have experienced in the last 20 years.
6. The Geographic Institute of Aragon has also produced a report on the advantages of reusing this type of data, is that right?
Rafael Martínez Cebolla: Yes, it was published earlier this year. We have been doing this report internally for three or four years, because we knew we were going to make the leap to a spatial knowledge infrastructure and we wanted to see the impact of implementing a knowledge graph within the data infrastructure. The Geographic Institute of Aragon has made an effort in recent years to analyse the economic benefit of having this infrastructure available for the citizens themselves, not for the administration. In other words, how much money Aragonese citizens save in their taxes by having this infrastructure. Today we know that having a geographic information platform saves approximately 2 million euros a year for the citizens of Aragon.
I am looking forward to the report next January or February, because I think the leap will be significant. The knowledge graph was implemented in April last year and this leap will be felt in the year ahead. We have noticed a significant increase in requests, both for viewing and for downloading.
Basically, from one year to the next we have almost doubled both the number of accesses and the number of downloads. This affects the technological component: you have to redesign it. More people are discovering you, more people are accessing your data and, therefore, you have to dedicate more investment to the technological component, because it is becoming the bottleneck.
7. What do you see as the challenges to be faced in the coming years?
Paloma Abad Power: In my opinion, the first challenge is to get to know the user in order to provide a better service: the technical user, university students, users on the street, etc. We are thinking of doing a survey when the user is about to use our geographic information. But of course, such surveys sometimes slow down the use of geographic information. That is the great challenge: to know the user in order to make services and applications more user-friendly, and to understand what they want so as to deliver it to them better.
There is also another technical challenge. When the spatial data infrastructures began, the technical level required was very high: you had to know what a visualisation service was, what metadata were, know the parameters, etc. This has to be eliminated; the user should simply be able to say, for example, "I want to consult and visualise the length of the Ebro river" in a more user-friendly way. Or take the word LiDAR, which refers to the high-accuracy digital terrain model. All these terms need to be made much more user-friendly.
Rafael Martínez Cebolla: Above all, let the data be discovered. My perception is that we must continue to promote the discovery of spatial data without having to explain to the untrained user, or even to some technicians, that there must be a dataset, metadata, a service... No, no. Basically, generalist search engines should know how to find high-value datasets without anyone needing to know that there is such a thing as a spatial data infrastructure.
It is a matter of publishing the data under friendly standards, in accessible versions and, above all, at permanent URIs, which are not going to change. In other words, the data will improve in quality, but its identifiers will never change.
And above all, from a technical point of view, both spatial data infrastructures and geoportals and knowledge infrastructures have to ensure that high-value information nodes are related to each other from a semantic and geographical point of view. I understand that knowledge networks will help in this regard. In other words, mobility has to be related to the observation of the territory, to public health data or to statistical data, which also have a geographical component. This geographical semantic relationship is key for me.
Subscribe to our Soundcloud profile to keep up to date with our podcasts
Interview clips
Clip 1. What are high-value datasets and why are they important?
Clip 2. Where can a user locate geographic data?
Clip 3. How is the reuse of data with a geographic component being encouraged?
A digital twin is a virtual, interactive representation of a real-world object, system or process. We are talking, for example, about a digital replica of a factory, a city or even a human body. These virtual models allow simulating, analysing and predicting the behaviour of the original element, which is key for optimisation and maintenance in real time.
Due to their functionalities, digital twins are being used in various sectors such as health, transport or agriculture. In this article, we review the benefits of their use and show two examples related to open data.
Advantages of digital twins
Digital twins use real data sources from the environment, obtained through sensors and open platforms, among others. As a result, the digital twins are updated in real time to reflect reality, which brings a number of advantages:
- Increased performance: one of the main differences with traditional simulations is that digital twins use real-time data for modelling, allowing better decisions to be made to optimise equipment and system performance according to the needs of the moment.
- Improved planning: using technologies based on artificial intelligence (AI) and machine learning, the digital twin can analyse performance issues or perform virtual "what-if" simulations. In this way, failures and problems can be predicted before they occur, enabling proactive maintenance (a minimal code sketch of this idea follows this list).
- Cost reduction: improved data management thanks to a digital twin generates benefits equivalent to 25% of total infrastructure expenditure. In addition, by avoiding costly failures and optimizing processes, operating costs can be significantly reduced. They also enable remote monitoring and control of systems from anywhere, improving efficiency by centralizing operations.
- Customization and flexibility: by creating detailed virtual models of products or processes, organizations can quickly adapt their operations to meet changing environmental demands and individual customer/citizen preferences. For example, in manufacturing, digital twins enable customized mass production, adjusting production lines in real time to create unique products according to customer specifications. On the other hand, in healthcare, digital twins can model the human body to customize medical treatments, thereby improving efficacy and reducing side effects.
- Boosting experimentation and innovation: digital twins provide a safe and controlled environment for testing new ideas and solutions, without the risks and costs associated with physical experiments. Among other issues, they allow experimentation with large objects or projects that, due to their size, do not usually lend themselves to real-life experimentation.
- Improved sustainability: by enabling simulation and detailed analysis of processes and systems, organizations can identify areas of inefficiency and waste, thus optimizing the use of resources. For example, digital twins can model energy consumption and production in real time, enabling precise adjustments that reduce consumption and carbon emissions.
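To make the "what-if" idea concrete, here is a minimal, illustrative Python sketch of a digital twin that mirrors a pump's temperature sensor and projects whether an extra thermal load would breach a safety limit. Every name, number and the simple linear trend model are invented for illustration; real twins rely on much richer physics and data.

```python
class PumpTwin:
    """Minimal digital twin of a pump: mirrors sensor state and answers what-if queries."""

    def __init__(self, temp_limit_c: float = 80.0):
        self.temp_limit_c = temp_limit_c
        self.temperatures: list[float] = []  # rolling history of real-time readings

    def ingest(self, temp_c: float) -> None:
        """Update the twin with a new sensor reading."""
        self.temperatures.append(temp_c)

    def what_if(self, extra_load_c: float, horizon: int = 10) -> bool:
        """Project `horizon` steps ahead with a naive linear trend plus an extra
        thermal load, and report whether the temperature limit would be exceeded."""
        if len(self.temperatures) < 2:
            return False  # not enough history to estimate a trend
        trend = (self.temperatures[-1] - self.temperatures[0]) / (len(self.temperatures) - 1)
        projected = self.temperatures[-1] + horizon * trend + extra_load_c
        return projected > self.temp_limit_c


# Feed the twin simulated real-time data, then ask a what-if question.
twin = PumpTwin()
for reading in [61.0, 62.5, 64.1, 65.8]:
    twin.ingest(reading)
print("Would +10 °C of load breach the limit?", twin.what_if(extra_load_c=10.0))
```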
Examples of digital twins in Spain
The following two examples illustrate these advantages.
GeDIA project: artificial intelligence to predict changes in territories
GeDIA is a tool for strategic planning of smart cities, which allows scenario simulations. It uses artificial intelligence models based on existing data sources and tools in the territory.
The scope of the tool is very broad, but its creators highlight two use cases:
- Future infrastructure needs: the platform performs detailed analyses considering trends, thanks to artificial intelligence models. In this way, growth projections can be made and the needs for infrastructures and services, such as energy and water, can be planned in specific areas of a territory, guaranteeing their availability.
- Growth and tourism: GeDIA is also used to study and analyse urban and tourism growth in specific areas. The tool identifies patterns of gentrification and assesses their impact on the local population, using census data. In this way, demographic changes and their impact, such as housing needs, can be better understood and decisions can be made to facilitate equitable and sustainable growth.
This initiative has the participation of various companies and the University of Malaga (UMA), as well as the financial backing of Red.es and the European Union.
Digital twin of the Mar Menor: data to protect the environment
The Mar Menor, the salt lagoon of the Region of Murcia, has suffered serious ecological problems in recent years, influenced by agricultural pressure, tourism and urbanisation.
To better understand the causes and assess possible solutions, TRAGSATEC, a state-owned engineering and environmental services company, developed a digital twin. It mapped a surrounding area of more than 1,600 square kilometres, known as the Campo de Cartagena Region. In total, 51,000 nadir images, 200,000 oblique images and more than four terabytes of LiDAR data were obtained.
Thanks to this digital twin, TRAGSATEC has been able to simulate various flooding scenarios and the impact of installing containment elements or obstacles, such as a wall, to redirect the flow of water. They have also been able to study the distance between the soil and the groundwater, to determine the impact of fertiliser seepage, among other issues.
Challenges and the way forward
These are just two examples, but they highlight the potential of an increasingly popular technology. However, for its implementation to become even more widespread, some challenges need to be addressed, such as the initial costs, both in technology and in training, or security, since connecting physical systems to digital replicas increases the attack surface. Another challenge is the interoperability problems that arise when different public administrations establish digital twins and local data spaces. To address this issue, the European Commission has published a guide that helps to identify the main organisational and cultural challenges to interoperability, offering good practices to overcome them.
In short, digital twins offer numerous advantages, such as improved performance or cost reduction. These benefits are driving their adoption in various industries and it is likely that, as current challenges are overcome, digital twins will become an essential tool for optimising processes and improving operational efficiency in an increasingly digitised world.
Many people use apps to get around in their daily lives. Apps such as Google Maps, Moovit or CityMapper provide the fastest and most efficient route to a destination. However, what many users are unaware of is that behind these platforms lies a valuable source of information: open data. By reusing public datasets, such as those related to air quality, traffic or public transport, these applications can provide a better service.
In this post, we will explore how the reuse of open data by these platforms empowers a smarter and more sustainable urban ecosystem.
Google Maps: aggregating air quality information and transport data in GTFS
More than a billion people use Google Maps every month around the world. The tech giant offers a free, up-to-date world map that draws its data from a variety of sources, some of them open.
One of the functions provided by the app is information about the air quality in the user's location. The Air Quality Index (AQI) is a parameter that is determined by each country or region. The European benchmark can be consulted on this map which shows air quality by geolocated zones in real time.
To display the air quality at the user's location, Google Maps applies a model based on a multi-layered approach known as the "fusion approach". This method combines data from several input sources and weights the layers through a sophisticated procedure (a toy sketch of this kind of weighting follows the list). The input layers are:
- Government reference monitoring stations
- Commercial sensor networks
- Global and regional dispersion models
- Dust and smoke fire models
- Satellite information
- Traffic data
- Ancillary information such as surface area
- Meteorology
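As a toy illustration of what weighted layer fusion can look like, the following Python snippet combines air quality index (AQI) estimates from several hypothetical layers using fixed weights. This is not Google's actual model, whose layers and weighting procedure are proprietary; all values and weights here are invented.

```python
# Hypothetical AQI estimates from four input layers, each with a fixed weight.
layers = {
    "reference_stations": {"aqi": 42, "weight": 0.45},  # government monitoring stations
    "commercial_sensors": {"aqi": 50, "weight": 0.20},
    "dispersion_model":   {"aqi": 38, "weight": 0.20},
    "satellite":          {"aqi": 55, "weight": 0.15},
}

# Weighted average: each layer contributes in proportion to its assigned weight.
fused = sum(layer["aqi"] * layer["weight"] for layer in layers.values())
total_weight = sum(layer["weight"] for layer in layers.values())
print(f"Fused AQI estimate: {fused / total_weight:.1f}")  # -> 44.8
```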
In the case of Spain, this information is obtained from open data sources such as the Ministry of Ecological Transition and Demographic Challenge, the Regional Ministry of Environment, Territory and Housing of the Xunta de Galicia or the Community of Madrid. Open data sources used in other countries around the world can be found here.
Another functionality offered by Google Maps to plan the best routes to reach a destination is the information on public transport. These data are provided on a voluntary basis by the public companies providing transport services in each city. In order to make this open data available to the user, it is first dumped into Google Transit and must comply with the open public transport standard GTFS (General Transit Feed Specification).
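Because a GTFS feed is simply a zip archive of CSV text files (stops.txt, routes.txt, trips.txt, stop_times.txt, etc.), it is easy to explore programmatically. The sketch below lists the stops and routes of a feed using only the Python standard library; the file path is a hypothetical placeholder for any feed published by a transport operator.

```python
import csv
import io
import zipfile

GTFS_PATH = "transit_feed.zip"  # hypothetical local copy of an operator's GTFS feed

with zipfile.ZipFile(GTFS_PATH) as feed:
    # stops.txt and routes.txt are required files in every GTFS feed.
    with feed.open("stops.txt") as f:
        stops = list(csv.DictReader(io.TextIOWrapper(f, encoding="utf-8-sig")))
    with feed.open("routes.txt") as f:
        routes = list(csv.DictReader(io.TextIOWrapper(f, encoding="utf-8-sig")))

print(f"{len(stops)} stops, {len(routes)} routes")
print("First stop:", stops[0]["stop_name"], stops[0]["stop_lat"], stops[0]["stop_lon"])
```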
Moovit: reusing open data to deliver real-time information
Moovit is another of the urban mobility apps most used in Spain. It draws on open and collaborative data to make it easier for users to plan their journeys by public transport.
Since its launch in 2012, the free-to-download app offers real-time information on the different transport options, suggests the best routes to reach the indicated destination, guides users during their journey (how long they have to wait, how many stops are left, when they have to get off, etc.) and provides constant updates in the event of any alteration in the service.
Like other mobility apps, it is also available in offline mode and allows you to save routes and frequent lines in "Favourites". It is also an inclusive solution, as it integrates VoiceOver (iOS) and TalkBack (Android) for blind users.
The platform not only leverages open data provided by governments and local authorities, but also collects information from its users, allowing it to offer a dynamic and constantly updated service.
CityMapper: born as a reuser of open mobility data
The CityMapper development team recognises that the application was born with an open DNA that still remains. They reuse open datasets from, for example, OpenStreetMap at global level or RENFE and Cercanías Bilbao at national level. As the application becomes available in more cities, the list of open data reference sources from which it draws information grows.
The platform offers real-time information on public transport routes, including bus, train, metro and bike sharing. It also adds options for walking, cycling and ridesharing. It is designed to provide the most efficient and fastest route to a destination by integrating data from different modes of transport into a single interface.
As we published in the monographic report "Municipal Innovation through Open Data", CityMapper mainly uses open data from local transport authorities, typically in the GTFS (General Transit Feed Specification) standard. However, when this data is not sufficient or accurate enough, CityMapper combines it with datasets generated by the application's own users, who collaborate voluntarily. It also uses data curated and managed by the company's own local employees. All this data is combined with artificial intelligence algorithms developed to optimise routes and provide recommendations tailored to users' needs.
In conclusion, the use of open data in transport is driving a significant transformation in the mobility sector in cities. Through their contribution to applications, users can access up-to-date and accurate data, plan their journeys efficiently and make informed decisions. Governments, for their part, have taken on the role of facilitators by enabling the dissemination of data through open platforms, optimising resources and fostering collaboration across sectors. In addition, open data has created new opportunities for developers and the private sector, who have contributed with technological solutions such as Google Maps, Moovit or CityMapper. Ultimately, the potential of open data to transform the future of urban mobility is undeniable.
Citizen science is consolidating itself as one of the most relevant sources of reference in contemporary research. This is recognised by the Consejo Superior de Investigaciones Científicas (CSIC), which defines citizen science as a methodology and a means for the promotion of scientific culture in which science and citizen participation strategies converge.
We talked some time ago about the importance of citizen science in society. Today, citizen science projects have not only increased in number, diversity and complexity, but have also driven a significant process of reflection on how citizens can actively contribute to the generation of data and knowledge.
To reach this point, programmes such as Horizon 2020, which explicitly recognised citizen participation in science, have played a key role. More specifically, the chapter "Science with and for society" gave an important boost to this type of initiative in Europe and also in Spain. In fact, as a result of Spanish participation in this programme, as well as in parallel initiatives, Spanish projects have been growing in size and in their connections with international initiatives.
This growing interest in citizen science also translates into concrete policies. An example of this is the current Spanish Strategy for Science, Technology and Innovation (EECTI) for the period 2021-2027, which includes "the social and economic responsibility of R&D&I through the incorporation of citizen science".
In short, as we commented some time ago, citizen science initiatives seek to encourage a more democratic science that responds to the interests of all citizens and generates information that can be reused for the benefit of society. Here are some examples of citizen science projects that help collect data whose reuse can have a positive impact on society:
AtmOOs Academic Project: Education and citizen science on air pollution and mobility.
In this programme, Thigis developed a citizen science pilot on mobility and the environment with pupils from a school in Barcelona's Eixample district. This project, which is already replicable in other schools, consists of collecting data on student mobility patterns in order to analyse issues related to sustainability.
On the website of AtmOOs Academic you can visualise the results of all the editions that have been carried out annually since the 2017-2018 academic year, showing information on the vehicles students use to get to class and the emissions generated according to school stage.
WildINTEL: Research project on life monitoring in Huelva
The University of Huelva and the Consejo Superior de Investigaciones Científicas (CSIC) are collaborating to build a wildlife monitoring system to obtain essential biodiversity variables. To do this, photo-trapping cameras with remote data capture and artificial intelligence are used.
The wildINTEL project focuses on the development of a monitoring system that is scalable and replicable, thus facilitating the efficient collection and management of biodiversity data. This system will incorporate innovative technologies to provide accurate and objective demographic estimates of populations and communities.
This project, which started in December 2023 and will continue until December 2026, is expected to provide tools and products to improve the management of biodiversity, not only in the province of Huelva but throughout Europe.
IncluScience-Me: Citizen science in the classroom to promote scientific culture and biodiversity conservation.
This citizen science project combining education and biodiversity arises from the need to address scientific research in schools. To do this, students take on the role of a researcher to tackle a real challenge: to track and identify the mammals that live in their immediate environment to help update a distribution map and, therefore, their conservation.
IncluScience-Me was born at the University of Cordoba and, specifically, in the Research Group on Education and Biodiversity Management (Gesbio), and has been made possible thanks to the participation of the University of Castilla-La Mancha and the Research Institute for Hunting Resources of Ciudad Real (IREC), with the collaboration of the Spanish Foundation for Science and Technology - Ministry of Science, Innovation and Universities.
The Memory of the Herd: Documentary corpus of pastoral life.
This citizen science project, which has been active since July 2023, aims to gather knowledge and experiences from shepherds, both active and retired, about herd management and livestock farming.
The entity responsible for the programme is the Institut Català de Paleoecologia Humana i Evolució Social, although the Museu Etnogràfic de Ripoll, Institució Milà i Fontanals-CSIC, Universitat Autònoma de Barcelona and Universitat Rovira i Virgili also collaborate.
The programme helps to interpret the archaeological record and contributes to preserving knowledge of pastoral practice. In addition, it values the experience and knowledge of older people, work that helps to end the negative connotation of "old age" in a society that gives priority to "youth", so that older people are no longer considered passive subjects but active social subjects.
Plastic Pirates Spain: Study of plastic pollution in European rivers.
This citizen science project, carried out over the last year with young people between 12 and 18 years of age in the communities of Castilla y León and Catalonia, aims to contribute to generating scientific evidence and environmental awareness about plastic waste in rivers.
To this end, groups of young people from different educational centres, associations and youth groups have taken part in sampling campaigns to collect data on the presence of waste and rubbish, mainly plastics and microplastics, on riverbanks and in the water.
In Spain, this project has been coordinated by the BETA Technology Centre of the University of Vic - Central University of Catalonia together with the University of Burgos and the Oxygen Foundation. You can access more information on their website.
These are just some examples of citizen science projects. You can find out more at the Observatory of Citizen Science in Spain, an initiative that brings together a wide range of educational resources, reports and other interesting information on citizen science and its impact in Spain. Do you know of any other projects? Send them to us at dinamizacion@datos.gob.es and we can publicise them through our dissemination channels.
Digital transformation has reached almost every aspect and sector of our lives, and the world of products and services is no exception. In this context, the Digital Product Passport (DPP) concept is emerging as a revolutionary tool to foster sustainability and the circular economy. Accompanied by initiatives such as CIRPASS (Circular Product Information System for Sustainability), the DPP promises to change the way we interact with products throughout their life cycle. In this article, we will explore what DPP is, its origins, applications, risks and how it can affect our daily lives and the protection of our personal data.
What is the Digital Product Passport (DPP)? Origin and importance
The Digital Product Passport is a digital collection of key information about a product, from manufacturing to recycling. This passport allows products to be tracked and managed more efficiently, improving transparency and facilitating sustainable practices. The information contained in a DPP may include details on the materials used, the manufacturing process, the supply chain, instructions for use and how to recycle the product at the end of its life.
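There is not yet a single mandated schema for the DPP, so the following Python sketch is purely illustrative: it assembles a hypothetical passport record for a garment, with every field name invented to mirror the kinds of information listed above. Serialised as JSON, such a record could sit behind the QR code or NFC tag attached to the product.

```python
import json

# Hypothetical DPP record for a garment; field names are illustrative, not normative.
product_passport = {
    "product_id": "EU-DPP-000123",
    "name": "Cotton T-shirt",
    "materials": [
        {"name": "organic cotton", "share_pct": 95},
        {"name": "elastane", "share_pct": 5},
    ],
    "manufacturing": {"country": "PT", "facility": "Factory A"},
    "supply_chain": ["spinning: TR", "weaving: PT", "assembly: PT"],
    "use_instructions": "Machine wash cold; line dry.",
    "end_of_life": {"recyclable": True, "instructions": "Deposit in a textile container."},
}

print(json.dumps(product_passport, indent=2))  # the payload a scanner app might retrieve
```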
The DPP has been developed in response to the growing need to promote the circular economy and reduce the environmental impact of products. The European Union (EU) has been a pioneer in promoting policies and regulations that support sustainability. Initiatives such as the EU's Circular Economy Action Plan have been instrumental in driving the DPP forward. The objectives of this plan are as follows:
- Greater Transparency: Consumers no longer have to guess about the origin of their products and how to dispose of them correctly. With a machine-readable DPP (e.g. QR code or NFC tag) attached to end products, consumers can make informed purchasing decisions and brands can eliminate greenwashing with confidence.
- Simplified Compliance: By creating an audit of events and transactions in a product's value chain, the DPP provides the brand and its suppliers with the necessary data to address compliance demands efficiently.
- Sustainable Production: By tracking and reporting the social and environmental impacts of a product from source to disposal, brands can make data-driven decisions to optimise sustainability in product development.
- Circular Economy: The DPP facilitates a circular economy by promoting eco-design and the responsible production of durable products that can be reused, remanufactured and disposed of correctly.
The following image summarises the main advantages of the digital passport at each stage of the digital product manufacturing process:

CIRPASS as a facilitator of DPP implementation
CIRPASS is a platform that supports the implementation of the DPP. This European initiative aims to standardise the collection and exchange of data on products, facilitating their traceability and management throughout their life cycle. CIRPASS plays a crucial role in creating an interoperable digital framework that connects manufacturers, consumers and recyclers.
DPP applications in various sectors
On 5 March 2024, CIRPASS, in collaboration with the European Commission, organised an event on the future development of the Digital Product Passport. The event brought together various stakeholders from different industries and organisations who, with an eminently practical approach, presented and discussed various aspects of the upcoming regulation and its requirements, possible solutions, examples of use cases, and the obstacles and opportunities for the affected industries and businesses.
The following are the applications of DPP in various sectors as explained at the event:
- Textile industry: It allows consumers to know the origin of the garments, the materials used and the working conditions in the factories.
- Electronics: Facilitates recycling and reuse of components, reducing electronic waste.
- Automotive: It assists in tracking parts and materials, promoting the repair and recycling of vehicles.
- Food: It provides information on food traceability, ensuring safety and sustainability in the supply chain.
The impact of the DPP on citizens' lives
But what impact will this novel paradigm have on our daily lives? And how does it affect us as end users of the many products and services mentioned above? We will focus on four cases: informed consumption in any field, ease of product repair, trust and transparency, and efficient recycling.
The DPP provides consumers with access to detailed information about the products they buy, such as their origin, materials and production practices. This allows consumers to make more informed choices and opt for products that are sustainable and ethical. For example, a consumer can choose a garment made from organic materials and produced under fair labour conditions, thus promoting responsible and conscious consumption.
Similarly, one of the great benefits of the DPP is the inclusion of repair guides within the digital passport. This means that consumers can easily access detailed instructions on how to repair a product instead of discarding it when it breaks down. For example, if an appliance stops working, the DPP can provide a step-by-step repair manual, allowing the user to fix it himself or take it to a technician with the necessary information. This not only extends the lifetime of products, but also reduces e-waste and promotes sustainability.
Also, access to detailed and transparent product information through the DPP can increase consumers' trust in brands. Companies that provide a complete and accurate DPP demonstrate their commitment to transparency and accountability, which can enhance their reputation and build customer loyalty. In addition, consumers who have access to this information are better able to make responsible purchasing decisions, thus encouraging more ethical and sustainable consumption habits.
Finally, the DPP facilitates effective recycling by providing clear information on how to break down and reuse the materials in a product. For example, a citizen who wishes to recycle an electronic device can consult the DPP to find out which parts can be recycled and how to separate them properly. This improves the efficiency of the recycling process and ensures that more materials are recovered and reused instead of ending up in landfill, contributing to a circular economy.
Risks and challenges of the DPP
Similarly, as a novel technology and part of the digital transformation taking place across product sectors, the DPP also presents certain risks and challenges, such as:
- Data Protection: The collection and storage of large amounts of data can put consumers' privacy at risk if not properly managed.
- Security: Digital data is vulnerable to cyber-attacks, which requires robust security measures.
- Interoperability: Standardisation of data across different industries and countries can be complex, making it difficult to implement the DPP on a large scale.
- Costs: Creating and maintaining digital passports can be costly, especially for small and medium-sized enterprises.
Data protection implications
The implementation of the DPP and systems such as CIRPASS implies careful management of personal data. It is essential that companies and digital platforms comply with data protection regulations, such as the EU's General Data Protection Regulation (GDPR). Organisations must ensure that the data collected is used in a transparent manner and with the explicit consent of consumers. In addition, advanced security measures must be implemented to protect the integrity and confidentiality of the data.
Relationship with European Data Spaces
The European Data Spaces are an EU initiative to create a single market for data, promoting innovation and the digital economy. The DPP and CIRPASS are aligned with this vision, as they encourage the exchange of information between different actors in the economy. Data interoperability is essential for the success of the European Data Spaces, and the DPP can contribute significantly to this goal by providing structured and accessible product data.
Conclusion
In conclusion, the Digital Product Passport and the CIRPASS initiative represent a significant step towards a more circular and sustainable economy. Through the collection and exchange of detailed product data, these systems can improve transparency, encourage responsible consumption practices and reduce environmental impact. However, their implementation requires overcoming challenges related to data protection, security and interoperability. As we move towards a more digitised future, the DPP and CIRPASS have the potential to transform the way we interact with products and contribute to a more sustainable world.
Content prepared by Dr. Fernando Gualo, Professor at UCLM and Data Governance and Quality Consultant. The content and the point of view reflected in this publication are the sole responsibility of its author.
The digital revolution is transforming municipal services, driven by the increasing adoption of artificial intelligence (AI) technologies that also benefit from open data. These developments have the potential to redefine the way municipalities deliver services to their citizens, providing tools to improve efficiency, accessibility and sustainability. This report looks at success stories in the deployment of applications and platforms that seek to improve various aspects of life in municipalities, highlighting how they can unlock more of the vast untapped potential of open data and associated artificial intelligence technologies.
The applications and platforms described in this report have a high potential for replicability in different municipal contexts, as they address common problems. Replication of these solutions can take place through collaboration between municipalities, companies and developers, as well as through the release and standardisation of open data.
Despite the benefits, the adoption of open data for municipal innovation also presents significant challenges. The quality, updating and standardisation of data published by local authorities, as well as interoperability between different platforms and systems, must be ensured. In addition, the open data culture needs to be reinforced among all actors involved, including citizens, developers, businesses and public administrations themselves.
The use cases analysed are divided into four sections. Each of these sections is described below and some examples of the solutions included in the report are shown.
Transport and Mobility
One of the most significant challenges in urban areas is transport and mobility management. Applications using open data have proven to be effective in improving these services. For example, applications such as Park4Dis make it easy to locate parking spaces for people with reduced mobility, using data from multiple municipalities and contributions from volunteers. CityMapper, which has gone global, on the other hand, offers optimised public transport routes in real time, integrating data from various transport modes to provide the most efficient route. These applications not only improve mobility, but also contribute to sustainability by reducing congestion and carbon emissions.
Environment and Sustainability
Growing awareness of sustainability has spurred the development of applications that promote environmentally friendly practices. CleanSpot, for example, facilitates the location of recycling points and the management of municipal waste. The application encourages citizen participation in cleaning and recycling, contributing to the reduction of the ecological footprint. Liight gamifies sustainable behaviour by rewarding users for actions such as recycling or using public transport. These applications not only improve environmental management, but also educate and motivate citizens to adopt more sustainable habits.
Optimisation of Basic Public Services
Urban service management platforms, such as Gestdropper, use open data to monitor and control urban infrastructure in real time. These tools enable more efficient management of resources such as street lighting, water networks and street furniture, optimising maintenance and incident response while reducing operating costs. Moreover, the deployment of appointment management systems, such as CitaME, helps to reduce waiting times and improve efficiency in customer service.
Citizen Services Aggregators
Applications that centralise public information and services, such as Badajoz Es Más and AppValencia, improve accessibility and communication between administrations and citizens. These platforms provide real-time data on public transport, cultural events, tourism and administrative procedures, making life in the municipality easier for residents and tourists alike. For example, integrating multiple services into a single application improves efficiency and reduces the need for unnecessary travel. These tools also support local economies by promoting cultural events and commercial services.
Conclusions
The use of open data and artificial intelligence technologies is transforming municipal management, improving the efficiency, accessibility and sustainability of public services. The success stories presented in this report describe how these tools can benefit both citizens and public administrations by making cities smarter, more inclusive and sustainable environments, and more responsive to the needs and well-being of their inhabitants and visitors.
Download here the accessible version of the report
The cross-cutting nature of open weather and climate data has favoured its use in areas as diverse as precision agriculture, fire prevention or precision forestry. But the relevance of these datasets lies not only in their direct applicability across multiple industries, but also in their contribution to the challenges related to climate change and environmental sustainability, which the different action lines of the European Green Deal seek to address.
Meteorological data are considered high-value data by the European Commission, in accordance with the annex to Implementing Regulation (EU) 2023/138. In this post we explain which specific datasets are considered to be of high value and the level of availability of this type of data in Spain.
The State Meteorological Agency
In Spain, the State Meteorological Agency (AEMET) has the mission of providing meteorological and climatological services at national level. As part of the Ministry for the Ecological Transition and the Demographic Challenge, AEMET leads the related activities of observation, prediction and study of meteorological and climatic conditions, as well as research in these fields. Its mission includes the provision and dissemination of essential information and forecasts of general interest. This information can also support relevant areas such as civil protection, air navigation, national defence and other sectors of activity.
In order to fulfil this mission, AEMET manages an open data portal that enables the reuse, by natural or legal persons and for commercial or non-commercial purposes, of part of the data it generates, prepares and safeguards in the performance of its functions. This portal, known as AEMET OpenData, currently offers two modalities for accessing and downloading data in reusable formats:
- General access, which consists of graphical access for the general public through human-friendly interfaces.
- AEMET OpenData API, designed for periodic or scheduled interactions in any programming language, which allows developers to include AEMET data in their own information systems and applications.
In addition, in accordance with Regulation 2023/138, it is envisaged to enable a third access route that would allow re-users to obtain packaged datasets for mass downloading where possible.
In order to access any of the datasets, an access key (API key) is required. It can be obtained through a simple request in which only an e-mail address is needed, without any additional data from the applicant, to which the access key is then sent. This is a control measure to ensure that the service is provided with adequate quality and in a non-discriminatory manner for all users.
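As a minimal sketch of how this works in practice, the snippet below requests conventional observations through the AEMET OpenData API. At the time of writing, the API replies with a small JSON envelope whose "datos" field points to the actual data file; the endpoint shown is just one example, and both the endpoint catalogue and the field names of the returned records should be checked against the official AEMET OpenData documentation.

```python
import requests

API_KEY = "YOUR_API_KEY"  # obtained free of charge by providing an e-mail address
BASE = "https://opendata.aemet.es/opendata/api"

# Example endpoint: conventional observations for all stations.
resp = requests.get(
    f"{BASE}/observacion/convencional/todas",
    params={"api_key": API_KEY},
    timeout=30,
)
resp.raise_for_status()
envelope = resp.json()  # e.g. {"descripcion": "exito", "estado": 200, "datos": "...", ...}

# Follow the "datos" URL to download the actual observation records.
records = requests.get(envelope["datos"], timeout=30).json()
print(f"{len(records)} observation records retrieved")
print("First record:", records[0])
```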
AEMET OpenData also pioneered the availability of open meteorological data in Europe, reflecting AEMET's commitment to the continuous improvement of meteorological services, support to the scientific and technological community, and the promotion of a more informed and resilient society in the face of climate challenges.
High-value meteorological datasets
The Annex to Regulation (EU) 2023/138 details five high-value meteorological datasets: observation data measured by weather stations, validated climate observations, weather warnings, radar data and numerical weather prediction (NWP) model data. For each of the sets, the regulation specifies the granularity and the main attributes to be published.
If we analyse the correspondence between the datasets currently available on the AEMET OpenData portal, grouped into 14 categories, and the five datasets that will become mandatory in the coming months, we reach the conclusions summarised in the following table:
| High-value meteorological datasets | Equivalence in the AEMET OpenData datasets |
|---|---|
| Observation data measured by meteorological stations | The "Conventional Observation" dataset, generated by the Observing Service, provides a large number of hourly variables on liquid and solid precipitation, wind speed and direction, humidity, pressure, air, soil and subsoil temperature, visibility, etc. It is updated twice an hour. In accordance with the Regulation, ten-minute data shall be included with continuous updating. |
| Climate data: validated observations | Within the category "Climatological Values", four datasets on climate data observations are provided: "Daily climatologies", "Monthly/annual climatologies", "Normal values" and "Recorded extremes". The validated dataset provided by the National Climatological Data Bank Service is normally updated once a day with a delay of four days due to validation processes. Attributes available include daily mean temperature, daily precipitation in its standard 07:00 to 07:00 measurement form, daily mean relative humidity, maximum gust direction, etc. In accordance with the Regulation, the inclusion of hourly climatology is planned. |
| Weather warnings | "Adverse weather warnings" are provided for the whole of Spain, or segmented by province or Autonomous Community, covering both the latest issued and the historical ones since 2018. They provide data on observed and/or forecast severe weather events, from the present time until the next 72 hours. These warnings refer to each meteorological parameter by warning level, for each weather zone defined in the Meteoalert Plan. The dataset is generated by the Adverse Events Functional Groups and the information is available any time an adverse weather warning is issued, in line with the Regulation, which requires the dataset to be published "as issued or hourly". In this case, AEMET announces preferential broadcasting times: 09:00, 11:30, 23:00 and 23:50. |
| Radar data | There are two sets of data: "Regional radar graphic image" and "National radar composition image", which provide reflectivity images, but not the other products described in the Regulation (backscatter, polarisation, precipitation, wind and echotop). The dataset is generated by the Land Remote Sensing group and the information is available with a periodicity of 10 minutes instead of the 5 minutes recommended in the Regulation. However, according to AEMET's Strategic Plan 2022-2025, the updating of the 15 weather radars and the incorporation of new radars with higher resolution is foreseen, so that, in addition to strengthening the early warning system, the obligations of the Regulation can be fulfilled. |
| NWP model data | There are several datasets with forecast information, some available for download and some available on the web: weather forecast, normalised text forecast, specific forecasts, maritime forecast and maps of weather variables from the HARMONIE-AROME numerical models for different geographical areas and time periods. However, according to its frequently asked questions document, AEMET does not currently consider numerical model outputs as open data. AEMET offers the possibility of requesting this or any other dataset through the general register or through its electronic office, but this is not an option provided for in the Regulation. In line with this, the inclusion of numerical atmospheric and wave model outputs is foreseen. |
Figure 1: Table showing the equivalence between high value datasets and AEMET OpenData datasets.
The regulation also sets out a number of requirements for publication in terms of format, licence granted, frequency of updating and timeliness, means of access and metadata provided.
In the case of metadata, AEMET publishes, in machine-readable format, the main characteristics of the downloaded file: who prepares it, how often it is prepared, what it contains and its format, as well as information on the data fields (meteorological variable, unit of measurement, etc.). The copyright and terms of use are also specified by means of the legal notice. In this regard, it is foreseen that the current licences will be reviewed to make the datasets available under a licensing scheme compliant with the Regulation, possibly following the recommendation to adopt the CC BY-SA 4.0 licence.
All in all, it seems that the long track record of the State Meteorological Agency (AEMET) in providing quality open data has put it in a good position to comply with the requirements of the new regulation, making some adjustments to the datasets it already offers through AEMET OpenData to align them with the new obligations. AEMET plans to include in this service the datasets required by the Regulation that are currently not available, as it adapts its regulations on public prices, as well as the infrastructure and systems that make this possible. The additional datasets to be made available will include ten-minute observation data, hourly climatologies and some data parameters from regional radars and from numerical wave and forecast models.
Content prepared by Jose Luis Marín, Senior Consultant in Data, Strategy, Innovation & Digitalization. The contents and views reflected in this publication are the sole responsibility of the author.
1. Introduction
In the information age, artificial intelligence has proven to be an invaluable tool for a wide variety of applications. One of the most notable manifestations of this technology is GPT (Generative Pre-trained Transformer), developed by OpenAI. GPT is a natural language model that can understand and generate text, providing coherent and contextually relevant responses. With the recent introduction of GPT-4, the capabilities of this model have been further expanded, allowing for greater customisation and adaptability to different themes.
In this post, we will show you how to set up and customise a specialised critical minerals assistant using GPT-4 and open data sources. As we have shown in previous publications, critical minerals are fundamental to numerous industries, including technology, energy and defence, due to their unique properties and strategic importance. However, information on these materials can be complex and scattered, making a specialised assistant particularly useful.
The aim of this post is to guide you step by step from the initial configuration to the deployment of a GPT assistant that can help you resolve doubts and provide valuable information about critical minerals in your day-to-day life. In addition, we will explore how to customise aspects of the assistant, such as the tone and style of its responses, to suit your needs perfectly. At the end of this journey, you will have a powerful, customised tool that will transform the way you access and use open information on critical minerals.
Access the data lab repository on GitHub.
2. Context
The transition to a sustainable future involves not only changes in energy sources, but also in the material resources we use. The success of sectors such as energy storage batteries, wind turbines, solar panels, electrolysers, drones, robots, data transmission networks, electronic devices or space satellites depends heavily on access to the raw materials critical to their development. We understand that a mineral is critical when the following factors are met:
- Its global reserves are scarce
- There are no alternative materials that can perform their function (their properties are unique or very difficult to replicate)
- They are indispensable materials for key economic sectors of the future, and/or their supply chain is high risk
You can learn more about critical minerals in the post mentioned above.
3. Objective
This exercise focuses on showing the reader how to customise a specialised GPT model for a specific use case. We will adopt a "learning-by-doing" approach, so that the reader can understand how to set up and adjust the model to solve a real and relevant problem, such as critical mineral expert advice. This hands-on approach not only improves understanding of language model customisation techniques, but also prepares readers to apply this knowledge to real-world problem solving, providing a rich learning experience directly applicable to their own projects.
The GPT assistant specialised in critical minerals will be designed to become an essential tool for professionals, researchers and students. Its main objective will be to facilitate access to accurate and up-to-date information on these materials, to support strategic decision-making and to promote education in this field. The following are the specific objectives we seek to achieve with this assistant:
- Provide accurate and up-to-date information:
- The assistant should provide detailed and accurate information on various critical minerals, including their composition, properties, industrial uses and availability.
- Keep up to date with the latest research and market trends in the field of critical minerals.
- Assist in decision-making:
- To provide data and analysis that can assist strategic decision making in industry and critical minerals research.
- Provide comparisons and evaluations of different minerals in terms of performance, cost and availability.
- Promote education and awareness of the issue:
- Act as an educational tool for students, researchers and practitioners, helping to improve their knowledge of critical minerals.
- Raise awareness of the importance of these materials and the challenges related to their supply and sustainability.
4. Resources
To configure and customise our GPT assistant specialising in critical minerals, it is essential to have a number of resources that facilitate implementation and ensure the accuracy and relevance of the model's responses. In this section, we will detail the necessary resources, which include both the technological tools and the sources of information that will be integrated into the assistant's knowledge base.
Tools and Technologies
The key tools and technologies to develop this exercise are:
- OpenAI account: required to access the platform and use the GPT-4 model. In this post, we will use ChatGPT's Plus subscription to show you how to create and publish a custom GPT. However, you can develop this exercise in a similar way by using a free OpenAI account and performing the same set of instructions through a standard ChatGPT conversation.
- Microsoft Excel: we have designed this exercise so that anyone without technical knowledge can work through it from start to finish. We will only use office tools such as Microsoft Excel to make some adjustments to the downloaded data.
In a complementary way, we will use another set of tools that will allow us to automate some actions without their use being strictly necessary:
- Google Colab: is a Python Notebooks environment that runs in the cloud, allowing users to write and run Python code directly in the browser. Google Colab is particularly useful for machine learning, data analysis and experimentation with language models, offering free access to powerful computational resources and facilitating collaboration and project sharing.
- Markmap: is a tool that visualises Markdown mind maps in real time. Users write ideas in Markdown and the tool renders them as an interactive mind map in the browser. Markmap is useful for project planning, note taking and organising complex information visually. It facilitates understanding and the exchange of ideas in teams and presentations.
Sources of information
- Raw Materials Information System (RMIS): raw materials information system maintained by the Joint Research Center of the European Union. It provides detailed and up-to-date data on the availability, production and consumption of raw materials in Europe.
- International Energy Agency (IEA) Catalogue of Reports and Data: the International Energy Agency (IEA) offers a comprehensive catalogue of energy-related reports and data, including statistics on production, consumption and reserves of energy and critical minerals.
- Mineral Database of the Spanish Geological and Mining Institute (BDMIN, its acronym in Spanish): contains detailed information on minerals and mineral deposits in Spain, useful for obtaining specific data on the production and reserves of critical minerals in the country.
With these resources, you will be well equipped to develop a specialised GPT assistant that can provide accurate and relevant answers on critical minerals, facilitating informed decision-making in the field.
5. Development of the exercise
5.1. Building the knowledge base
For our specialised critical minerals GPT assistant to be truly useful and accurate, it is essential to build a solid and structured knowledge base. This knowledge base will be the set of data and information that the assistant will use to answer queries. The quality and relevance of this information will determine the effectiveness of the assistant in providing accurate and useful answers.
We start with the collection of information sources that will feed our knowledge base. Not all sources of information are equally reliable. It is essential to assess the quality of the sources identified, ensuring that:
- Information is up to date: the relevance of data can change rapidly, especially in dynamic fields such as critical minerals.
- The source is reliable and recognised: it is necessary to use sources from recognised and respected academic and professional institutions.
- Data is complete and accessible: it is crucial that data is detailed and accessible for integration into our wizard.
In our case, we carried out an online search across different platforms and information repositories, aiming to select information published by recognised entities of different kinds:
- Research centres and universities:
- They publish detailed studies and reports on the research and development of critical minerals.
- Example: RMIS of the Joint Research Center of the European Union.
- Governmental institutions and international organisations:
- These entities usually provide comprehensive and up-to-date data on the availability and use of critical minerals.
- Example: International Energy Agency (IEA).
- Specialised databases:
- They contain technical and specific data on deposits and production of critical minerals.
- Example: Minerals Database of the Spanish Geological and Mining Institute (BDMIN).
Selection and preparation of information
We will now focus on the selection and preparation of existing information from these sources to ensure that our GPT assistant can access accurate and useful data.
RMIS of the Joint Research Center of the European Union:
- Selected information:
We selected the report "Supply chain analysis and material demand forecast in strategic technologies and sectors in the EU - A foresight study". This is an analysis of the supply chain and demand for minerals in strategic technologies and sectors in the EU. It presents a detailed study of the supply chains of critical raw materials and forecasts the demand for minerals up to 2050.
- Necessary preparation:
The format of the document, PDF, allows the direct ingestion of the information by our assistant. However, as can be seen in Figure 1, there is a particularly relevant table on pages 238-240 which analyses, for each mineral, its supply risk, typology (strategic, critical or non-critical) and the key technologies that employ it. We therefore decided to extract this table into a structured format (CSV), so that we have two pieces of information that will become part of our knowledge base.

Figure 1: Table of minerals contained in the JRC PDF
To programmatically extract the data contained in this table and transform it into a more easily processable format, such as CSV (comma-separated values), we will use a Python script run on the Google Colab platform (Figure 2).

Figure 2: Python script for extracting data from the JRC PDF, developed on the Google Colab platform.
To summarise, this script (a minimal sketch of which is included after this list):
- It is based on the open-source library PyPDF2, which is capable of interpreting information contained in PDF files.
- First, it extracts in text format (string) the content of the pages of the PDF where the mineral table is located, removing all the content that does not correspond to the table itself.
- It then goes through the string line by line, converting the values into columns of a data table. We know that a mineral is used in a key technology if the corresponding column for that mineral contains a 1 (otherwise it contains a 0).
- Finally, it exports the table to a CSV file for further use.
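The following is a minimal sketch of such a script, under stated assumptions: the PDF file name, the exact page range and the row-detection heuristic are illustrative, and the actual script can be found in the project repository.

```python
# Minimal sketch of the table-extraction script (illustrative only).
import csv
from PyPDF2 import PdfReader

reader = PdfReader("jrc_foresight_study.pdf")  # assumed local file name

# Pages 238-240 of the report hold the minerals table (0-indexed here).
raw_text = "\n".join(reader.pages[i].extract_text() for i in range(237, 240))

rows = []
for line in raw_text.splitlines():
    parts = line.split()
    # Heuristic: a table row ends in a run of 0/1 flags, one per key
    # technology column, preceded by the mineral name and its typology.
    if len(parts) > 3 and all(p in ("0", "1") for p in parts[-3:]):
        rows.append(parts)

# Export the reconstructed table to CSV for the knowledge base.
with open("minerals_table.csv", "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows(rows)
```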
International Energy Agency (IEA):
- Selected information:
We selected the report "Global Critical Minerals Outlook 2024". It provides an overview of industrial developments in 2023 and early 2024, and offers medium- and long-term prospects for the demand and supply of key minerals for the energy transition. It also assesses risks to the reliability, sustainability and diversity of critical mineral supply chains.
- Necessary preparation:
The format of the document, PDF, allows us to ingest the information directly into our virtual assistant. In this case, we will not make any adjustments to the selected information.
Spanish Geological and Mining Institute's Minerals Database (BDMIN)
- Selected information:
In this case, we use the database's query form to select the existing data on indications and deposits in the field of metallogeny, in particular those with lithium content.

Figure 3: Dataset selection in BDMIN.
- Necessary preparation:
The web tool allows both online visualisation and the export of this data in various formats. Select all the data to be exported and click on the export option to download an Excel file with the desired information.

Figure 4: Visualization and download tool in BDMIN

Figure 5: BDMIN Downloaded Data.
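For readers following the complementary Google Colab route, the downloaded file can be inspected quickly with pandas before adding it to the knowledge base; the file name below is an assumption, so adjust it to match your download.

```python
# Optional quick inspection of the BDMIN export (file name assumed).
import pandas as pd

deposits = pd.read_excel("bdmin_lithium_deposits.xlsx")
print(deposits.shape)   # number of records and fields
print(deposits.head())  # first few indications/deposits
```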
All the files that make up our knowledge base can be found on GitHub, so the reader can skip the downloading and preparation phase of the information.
5.2. GPT configuration and customisation for critical minerals
When we talk about "creating a GPT", we are actually referring to the configuration and customisation of a GPT (Generative Pre-trained Transformer) based language model to suit a specific use case. In this context, we are not creating the model from scratch, but adjusting how a pre-existing model (such as OpenAI's GPT-4) interacts and responds within a specific domain, in this case, critical minerals.
First of all, we access the application through our browser and, if we do not have an account, we follow the registration and login process on the ChatGPT platform. As mentioned above, in order to create a GPT step-by-step, you will need to have a Plus account. However, readers who do not have such an account can work with a free account by interacting with ChatGPT through a standard conversation.

Figure 6: ChatGPT login and registration page.
Once logged in, select the "Explore GPT" option, and then click on "Create" to begin the process of creating your GPT.

Figure 7: Creation of new GPT.
The screen will display the split view for creating a new GPT: on the left, we can talk to the system to indicate the characteristics our GPT should have, while on the right we can interact with our GPT to validate that its behaviour is adequate as we progress through the configuration process.

Figure 8: Screen of creating new GPT.
In this project's GitHub repository, we can find all the prompts or instructions that we will use to configure and customise our GPT, and which we will have to introduce sequentially in the "Create" tab, located on the left-hand side of our screens, to complete the steps detailed below.
The steps we will follow for the creation of the GPT are as follows:
1. First, we will outline the purpose and basic considerations for our GPT, so that it understands how it should be used.

Figure 9: Basic instructions for new GPT.
2. We will then create a name and an image to represent our GPT and make it easily identifiable. In our case, we will call it MateriaGuru.

Figure 10: Name selection for new GPT.

Figure 11: Image creation for GPT.
3. We will then build the knowledge base from the information previously selected and prepared to feed the knowledge of our GPT.


Figure 12: Uploading of information to the new GPT knowledge base.
4. Now, we can customise conversational aspects such as its tone, the level of technical complexity of its responses, and whether we expect brief or elaborate answers.
5. Lastly, from the "Configure" tab, we can indicate the desired conversation starters, so that users interacting with our GPT have some ideas for starting the conversation in a predefined way.

Figure 13: Configure GPT tab.
In Figure 13 we can also see the final result of our configuration, showing key elements such as its image, name, instructions, conversation starters and the documents that form part of its knowledge base.
5.3. Validation and publication of GPT
Before we sign off our new GPT-based assistant, we will briefly validate that it has been correctly configured and has assimilated the subject matter on which we have trained it. For this purpose, we prepared a battery of questions to put to MateriaGuru, checking that it responds appropriately in a realistic usage scenario.
| # | Question | Answer |
|---|---|---|
| 1 | Which critical minerals have experienced a significant drop in prices in 2023? | Battery mineral prices saw particularly large drops with lithium prices falling by 75% and cobalt, nickel and graphite prices falling by between 30% and 45%. |
| 2 | What percentage of global solar photovoltaic (PV) capacity was added by China in 2023? | China accounted for 62% of the increase in global solar PV capacity in 2023. |
| 3 | What is the scenario that projects electric car (EV) sales to reach 65% by 2030? | The Net Zero Emissions (NZE) scenario for 2050 projects that electric car sales will reach 65% by 2030. |
| 4 | What was the growth in lithium demand in 2023? | Lithium demand increased by 30% in 2023. |
| 5 | Which country was the largest electric car market in 2023? | China was the largest electric car market in 2023 with 8.1 million electric car sales representing 60% of the global total. |
| 6 | What is the main risk associated with market concentration in the battery graphite supply chain? | More than 90% of battery-grade graphite and 77% of refined rare earths in 2030 originate in China, posing a significant risk to market concentration. |
| 7 | What proportion of global battery cell production capacity was in China in 2023? | China owned 85% of battery cell production capacity in 2023. |
| 8 | How much did investment in critical minerals mining increase in 2023? | Investment in critical minerals mining grew by 10% in 2023. |
| 9 | What percentage of battery storage capacity in 2023 was composed of lithium iron phosphate (LFP) batteries? | By 2023, LFP batteries would constitute approximately 80% of the total battery storage market. |
| 10 | What is the forecast for copper demand in a net zero emissions (NZE) scenario for 2040? | In the net zero emissions (NZE) scenario for 2040, copper demand is expected to have the largest increase in terms of production volume. |
Figure 14: Table with battery of questions for the validation of our GPT.
Using the preview section on the right-hand side of our screens, we launch the battery of questions and validate that the answers correspond to those expected.

Figure 15: Validation of GPT responses.
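For readers who prefer to automate this check, the sketch below runs the same battery of questions through the OpenAI Python API. Note that custom GPTs created in the ChatGPT interface are not callable through the API, so this sketch sends equivalent system instructions to a base model; the model name, instructions and question list are illustrative assumptions.

```python
# Hedged sketch: programmatic validation against a base model.
# Custom GPTs from ChatGPT are not exposed via the API, so we emulate
# MateriaGuru with system instructions (illustrative, not the real ones).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_INSTRUCTIONS = (
    "You are MateriaGuru, an expert assistant on critical minerals."
)
questions = [
    "Which critical minerals experienced a significant drop in prices in 2023?",
    "What was the growth in lithium demand in 2023?",
]

for question in questions:
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed model name; use any GPT-4-class model
        messages=[
            {"role": "system", "content": SYSTEM_INSTRUCTIONS},
            {"role": "user", "content": question},
        ],
    )
    print(question, "->", response.choices[0].message.content)
```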
Finally, click on the "Create" button to finalise the process. We will be able to choose among different options to restrict its use by other users.

Figure 16: Publication of our GPT.
6. Scenarios of use
In this section we show several scenarios in which we can take advantage of MateriaGuru in our daily life. On the GitHub of the project you can find the prompts used to replicate each of them.
6.1. Consultation of critical minerals information
The most typical scenario for the use of this type of GPT is assistance in resolving doubts related to the topic in question, in this case, critical minerals. As an example, we have prepared a set of questions that the reader can pose to the GPT created, to understand in more detail the relevance and current status of a critical material such as graphite, based on the reports provided to our GPT.
Figure 17: Resolution of critical mineral queries.
We can also ask it specific questions about the tabulated information provided on existing indications and deposits in Spanish territory.

Figure 18: Lithium reserves in Extremadura.
6.2. Representation of quantitative data visualisations
Another common scenario is the need to consult quantitative information and make visual representations for better understanding. In this scenario, we can see how MateriaGuru is able to generate an interactive visualisation of graphite production in tonnes for the main producing countries.

Figure 19: Interactive visualisation generation with our GPT.
6.3. Generating mind maps to facilitate understanding
Finally, in the search for alternatives that give better access to and understanding of the knowledge held by our GPT, we will ask MateriaGuru to build a mind map that allows us to understand key concepts of critical minerals in a visual way. For this purpose, we use the open Markmap (Markdown Mindmap) notation, which allows mind maps to be defined using Markdown syntax.

Figure 20: Generation of mind maps from our GPT
We will need to copy the generated code and enter it in a Markmap viewer in order to generate the desired mind map. We provide here a version of this code generated by MateriaGuru.
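As a purely illustrative example of the notation (invented content, not MateriaGuru's actual output), the snippet below builds a tiny mind map in Markmap's Markdown syntax and saves it to a file whose contents can be pasted into a Markmap viewer.

```python
# Illustrative only: a tiny mind map in Markmap (Markdown) notation.
# When rendered, headings become branches and bullets become leaves.
mindmap_md = """\
# Critical minerals
## Why they matter
- Scarce global reserves
- Hard to substitute
## Key technologies
- Batteries (lithium, cobalt, graphite)
- Wind turbines (rare earths)
"""

# Save to a file; paste its contents into a Markmap viewer to render.
with open("critical_minerals_mindmap.md", "w", encoding="utf-8") as f:
    f.write(mindmap_md)
```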

Figure 21: Visualisation of mind maps.
7. Results and conclusions
In this exercise we have built an expert assistant using GPT-4: a model specialised in critical minerals. This assistant provides detailed and up-to-date information on critical minerals, supporting strategic decision-making and promoting education in this field. We first gathered information from reliable sources such as the RMIS, the International Energy Agency (IEA) and the Spanish Geological and Mining Institute's BDMIN. We then processed and structured the data appropriately for integration into the model. The validations showed that the assistant answers domain-relevant questions accurately, facilitating access to the underlying information.
In this way, the development of the specialised critical minerals assistant has proven to be an effective solution for centralising and facilitating access to complex and dispersed information.
The use of tools such as Google Colab and Markmap has enabled better organisation and visualisation of data, increasing efficiency in knowledge management. This approach not only improves the understanding and use of critical mineral information, but also prepares users to apply this knowledge in real-world contexts.
The practical experience gained in this exercise is directly applicable to other projects that require customisation of language models for specific use cases.
8. Do you want to do the exercise?
If you want to replicate this exercise, access this repository, where you will find more information (the prompts used, the code generated by MateriaGuru, etc.).
Also, remember that you have at your disposal more exercises in the section "Step-by-step visualisations".
Content elaborated by Juan Benavente, industrial engineer and expert in technologies linked to the data economy. The contents and points of view reflected in this publication are the sole responsibility of the author.
The National Centre for Geographic Information publishes open geospatial data from the National Cartographic System, the National Geographic Institute and other organisations through web applications and mobile applications to facilitate access to and consultation of geographic data by citizens.
Geospatial data is published via web services and APIs for reuse. The high-value datasets among them, such as geographic names, hydrography or addresses, had to be made available to the public by June 2024, as required by the EU, because they are associated with major benefits for society, the environment and the economy.
In the applications listed below, the geographic data are visualised and consulted through web services; to download the data, those web services and APIs can also be used directly, giving any user access to a wide range of geographic information, from topographic maps to satellite images.
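As an illustration of this kind of direct reuse, the sketch below queries an OGC WMS service with the owslib library. The IGN base-map endpoint and layer handling shown are assumptions for illustration; check the CNIG/IGN service catalogue for the current URLs and layer names.

```python
# Hedged sketch: consuming IGN/CNIG geographic data via OGC WMS.
# The service URL is an assumed example; verify it in the catalogue.
from owslib.wms import WebMapService

wms = WebMapService("https://www.ign.es/wms-inspire/ign-base", version="1.1.1")

# List the layers the service advertises.
for name, layer in wms.contents.items():
    print(name, "-", layer.title)

# Request a map image of peninsular Spain as a PNG.
img = wms.getmap(
    layers=[list(wms.contents)[0]],  # first advertised layer
    srs="EPSG:4326",
    bbox=(-9.5, 35.9, 3.4, 43.9),    # approximate lon/lat extent of Spain
    size=(800, 600),
    format="image/png",
)
with open("spain_base_map.png", "wb") as f:
    f.write(img.read())
```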
But it is not only data that can be reused; application software is also reusable. For example, the Solar Energy Potential of Buildings visualiser is based on a visualiser API, named API-CNIG, which allows the same tool to be used for different thematic areas.
Some examples of applications are:

Solar Energy Potential of Buildings
Provides the photovoltaic capacity of a building according to its location and characteristics. It also provides the average over the year and a point grid to identify the best location for solar panels.
National Geographic Gazetteer
It is a place-name search engine that collects the names, official or standardised by the corresponding competent bodies, together with their geographical references.
Unified postal address calculator
It is a converter that provides the geographical coordinates (latitude and longitude in WGS84) of the postal addresses of a place, and vice versa. In both cases, the input is a CSV file, which may contain either coordinates or postal addresses.
Basic Maps of Spain
It facilitates connection to IGN services and to the CNIG download centre to obtain maps and routes. With this mobile application you can follow the routes of the National Parks or the stages of the Camino de Santiago. It allows you to plan excursions using maps, navigate and take guided tours, without the need for an internet connection after downloading data.
Map a la carte
It allows you to create a customised map based on the printed series of the National Topographic Map at scales 1:25,000 and 1:50,000. It offers the possibility of defining the map area, incorporating content, personalising the cover, obtaining a PDF file and even acquiring paper copies by post.
IGN Earthquakes
It allows the reception and visualisation of all seismic events in Spain and its surroundings. It provides the distance to the epicentre of the seismic event and epicentral parameters, as well as the geolocation of the user's position and the epicentre.
Maps of Spain
It is a free mobile viewer, ideal for hiking, cycling, running, skiing, etc., which uses as background cartography the services of the National Geographic Institute together with services from other ministries, such as the cadastral parcel information provided by the Directorate General for Cadastre.
Camino de Santiago
It includes information of a cultural and practical nature on each of the stages (hostels, monuments, etc.), as well as a complete Pilgrim's Guide detailing what you should know before starting out on any of the routes. This application is based on ESRI software.
National Parks
Displays information on the history, fauna, flora and excursions in Spain's National Parks. It includes hundreds of points of interest such as information centres, accommodation, viewpoints, refuges and even routes through the parks, indicating their duration and difficulty. The app is available for download on Android and iOS. This application is based on ESRI software.
GeoSapiens IGN
It presents interactive maps, free to use and free of charge, for studying the physical and political geography of Spain and the world. It consists of different games covering the whole of Spain or individual autonomous communities, and the whole world or individual continents.
In addition to the applications developed by the CNIG, which are also presented in this video, there are many other digital solutions developed by third parties that reuse open geospatial data to offer a service to society. For example, in the list of reusing applications on datos.gob.es you can find everything from a map showing the fires active in Spain in real time to an app showing where the parking spaces for people with reduced mobility are located in each town.
In short, anyone can make use of the open geographic data of the National Cartographic System, the National Geographic Institute and other bodies published by the CNIG, thus extending the advantages offered by the availability of open geographic data. Do you know of any other application resulting from the reuse of open data? You can send it to us at dinamizacion@datos.gob.es