Blog

Data literacy has become a crucial issue in the digital age. This concept refers to the ability of people to understand how data is accessed, created, analysed, used or reused, and communicated.

We live in a world where data and algorithms influence everyday decisions and the opportunities people have to live well. Their effect can be felt in areas ranging from advertising and employment provision to criminal justice and social welfare. It is therefore essential to understand how data is generated and used.

Data literacy can involve many areas, but we will focus on its relationship with digital rights on the one hand and Artificial Intelligence (AI) on the other. This article proposes to explore the importance of data literacy for citizenship, addressing its implications for the protection of individual and collective rights and the promotion of a more informed and critical society in a technological context where artificial intelligence is becoming increasingly important.

The context of digital rights

More and more studies indicate that effective participation in today's data-driven, algorithm-driven society requires data literacy. Civil rights are increasingly translating into digital rights as our society becomes more dependent on digital technologies and environments. This transformation manifests itself in various ways:

  • On the one hand, rights recognised in constitutions and human rights declarations are being explicitly adapted to the digital context. For example, freedom of expression now includes freedom of expression online, and the right to privacy extends to the protection of personal data in digital environments. Moreover, some traditional civil rights are being reinterpreted in the digital context. One example of this is the right to equality and non-discrimination, which now includes protection against algorithmic discrimination and against bias in artificial intelligence systems. Another example is the right to education, which now also extends to the right to digital education. The importance of digital skills in society is recognised in several legal frameworks and documents, both at national and international level, such as the Organic Law 3/2018 on Personal Data Protection and Guarantee of Digital Rights (LOPDGDD) in Spain. Finally, the right of access to the internet is increasingly seen as a fundamental right, similar to access to other basic services.
  • On the other hand, rights are emerging that address challenges unique to the digital world, such as the right to be forgotten (in force in the European Union and some other countries that have adopted similar legislation [1]), which allows individuals to request the removal of personal information available online, under certain conditions. Another example is the right to digital disconnection (in force in several countries, mainly in Europe [2]), which ensures that workers can disconnect from work devices and communications outside working hours. Similarly, there is a right to net neutrality to ensure equal access to online content without discrimination by service providers, a right that is also established in several countries and regions, although its implementation and scope may vary. The EU has regulations that protect net neutrality, including Regulation 2015/2120, which establishes rules to safeguard open internet access. The Spanish Data Protection Act provides for the obligation of Internet providers to provide a transparent offer of services without discrimination on technical or economic grounds. Furthermore, the right of access to the internet - related to net neutrality - is recognised as a human right by the United Nations (UN).

This transformation of rights reflects the growing importance of digital technologies in all aspects of our lives.

The context of artificial intelligence

The relationship between AI development and data is fundamental and symbiotic, as data serves as the basis for AI development in a number of ways:

  1. Data is used to train AI algorithms, enabling them to learn, detect patterns, make predictions and improve their performance over time.
  2. The quality and quantity of data directly affect the accuracy and reliability of AI systems. In general, more diverse and complete datasets lead to better performing AI models.
  3. The availability of data in various domains can enable the development of AI systems for different use cases.

Data literacy has therefore become increasingly crucial in the AI era, as it forms the basis for effectively harnessing and understanding AI technologies.

In addition, the rise of big data and algorithms has transformed the mechanisms of participation, presenting both challenges and opportunities. Algorithms, while they may be designed to be fair, often reflect the biases of their creators or the data they are trained on. This can lead to decisions that negatively affect vulnerable groups.
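As a rough illustration of how such bias can be detected in practice, the sketch below compares how often an automated system produces a favourable decision for two groups. The group labels, the sample records and the function names are all hypothetical, and this ratio is only one simple check among many, not a method prescribed by any regulation.

```python
from collections import defaultdict


def selection_rates(records):
    """Return the share of favourable decisions per group.

    `records` is an iterable of (group, selected) pairs, where
    `selected` is True when the decision was favourable.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, selected in records:
        totals[group] += 1
        positives[group] += int(selected)
    return {group: positives[group] / totals[group] for group in totals}


def disparity_ratio(rates):
    """Ratio between the lowest and highest selection rate (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was selected, so no group is favoured
    return min(rates.values()) / highest


# Hypothetical decisions: group A is selected 3 times out of 4, group B once out of 4.
decisions = [
    ("A", True), ("A", True), ("A", True), ("A", False),
    ("B", True), ("B", False), ("B", False), ("B", False),
]

rates = selection_rates(decisions)
ratio = disparity_ratio(rates)
print(rates, round(ratio, 2))
```

A ratio well below 1.0, as in this invented sample, suggests that one group receives favourable decisions far less often and that the system deserves closer scrutiny.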

In this regard, legislative and academic efforts are being made to prevent this from happening. For example, the European Artificial Intelligence Act (AI Act) includes safeguards to avoid harmful biases in algorithmic decision-making. It classifies AI systems according to their level of potential risk and imposes stricter requirements on high-risk systems. In addition, it requires the use of high quality data to train the algorithms, minimising bias, and provides for detailed documentation of the development and operation of the systems, allowing for audits and evaluations with human oversight. It also strengthens the rights of persons affected by AI decisions, including the right to challenge decisions and the right to an explanation, allowing affected persons to understand how a decision was reached.

The importance of digital literacy in both contexts

Data literacy helps citizens make informed decisions and understand the full implications of their digital rights, which are also considered, in many respects, as mentioned above, to be universal civil rights. In this context, data literacy serves as a critical filter for full civic participation that enables citizens to influence political and social decisions. That is, those who have access to data and the skills and tools to navigate the data infrastructure effectively can intervene in and influence political and social processes in a meaningful way, something that the Open Government Partnership promotes.

On the other hand, data literacy enables citizens to question and understand these processes, fostering a culture of accountability and transparency in the use of AI. There are also barriers to participation in data-driven environments. One of these barriers is the digital divide (i.e. deprivation of access to infrastructure, connectivity and training, among others) and, indeed, lack of data literacy. The latter is therefore a crucial concept for overcoming the challenges posed by the datification of human relations and the platformisation of content and services.

Recommendations for implementing a preparedness partnership

Part of the solution to addressing the challenges posed by the development of digital technology is to include data literacy in educational curricula from an early age.

This should cover:

  • Data basics: understanding what data is, how it is collected and used.
  • Critical analysis: acquisition of the skills to evaluate the quality and source of data and to identify biases in the information presented. It seeks to recognise the potential biases that data may contain and that may occur in the processing of such data, and to build capacity to act in favour of open data and its use for the common good.
  • Rights and regulations: information on data protection rights and how European laws affect the use of AI. This area would cover all current and future regulation affecting the use of data and its implication for technology such as AI.
  • Practical applications: the possibility of creating, using and reusing open data available on portals provided by governments and public administrations, thus generating projects and opportunities that allow people to work with real data, promoting active, contextualised and continuous learning.

Educating people about the use and interpretation of data fosters a more critical society that is able to demand accountability in the use of AI. New data protection laws in Europe provide a framework that, together with education, can help mitigate the risks associated with algorithmic abuse and promote ethical use of technology. In a society where data plays a central role, there is a need to foster data literacy in citizens from an early age.

[1] The right to be forgotten was first established in May 2014 following a ruling by the Court of Justice of the European Union. Subsequently, in 2018, it was reinforced with the General Data Protection Regulation (GDPR), which explicitly includes it in its Article 17 as a "right to erasure". In July 2015, Russia passed a law allowing citizens to request the removal of links on Russian search engines if the information "violates Russian law or if it is false or outdated". Turkey has established its own version of the right to be forgotten, following a similar model to that of the EU. Serbia has also implemented a version of the right to be forgotten in its legislation. In Spain, the Ley Orgánica de Protección de Datos Personales (LOPD) regulates the right to be forgotten, especially with regard to debt collection files. In the United States, the right to be forgotten is considered incompatible with the Constitution, mainly because of the strong protection of freedom of expression. However, there are some related regulations, such as the Fair Credit Reporting Act of 1970, which allows in certain situations the deletion of old or outdated information in credit reports.

[2] Some countries where this right has been established include Spain, where it is regulated by Article 88 of Organic Law 3/2018 on Personal Data Protection; France, which in 2017 became the first country to pass a law on the right to digital disconnection; Germany, where it is included in the Working Hours and Rest Time Act (Arbeitszeitgesetz); Italy, under Law 81/201; and Belgium. Outside Europe, it exists, for example, in Chile.


Content prepared by Miren Gutiérrez, PhD and researcher at the University of Deusto, expert in data activism, data justice, data literacy and gender disinformation. The contents and views reflected in this publication are the sole responsibility of the author.

News

Digital transformation has become a fundamental pillar for the economic and social development of countries in the 21st century. In Spain, this process has become particularly relevant in recent years, driven by the need to adapt to an increasingly digitalised and competitive global environment. The COVID-19 pandemic acted as a catalyst, accelerating the adoption of digital technologies in all sectors of the economy and society.

However, digital transformation involves not only the incorporation of new technologies, but also a profound change in the way organisations operate and relate to their customers, employees and partners. In this context, Spain has made significant progress, positioning itself as one of the leading countries in Europe in several aspects of digitisation.

The following are some of the most prominent reports analysing this phenomenon and its implications.

State of the Digital Decade 2024 report

The State of the Digital Decade 2024 report examines the evolution of European policies aimed at achieving the agreed objectives and targets for successful digital transformation. It assesses the degree of compliance on the basis of various indicators, which fall into four groups: digital infrastructure, digital business transformation, digital skills and digital public services.

Assessment of progress towards the Digital Decade objectives set for 2030. European KPIs for 2024.

  1. Digital infrastructure
    1.1. Overall 5G coverage: achieved 89%; target: 100% coverage.
    1.2. 5G coverage at 3.4-3.8 GHz*: achieved 89%; target: 100% coverage.
    1.3. Fiber to the premises (FTTP): achieved 64%; target: 100% coverage.
    1.4. Very high capacity fixed network: achieved 79%; target: 100% coverage.
    1.5. Semiconductors: achieved 55%; target: 20% of global production.
    1.6. Edge nodes: achieved 1,186; target: 10,000.
    1.7. Quantum computing: 1 by 2024; target: 3 quantum computers.
  2. Digital transformation of businesses
    2.1. Digital intensity of SMEs: achieved 64%; target: 90% of SMEs.
    2.2. Adoption of the cloud: achieved 52%; target: 75% of companies.
    2.3. Adoption of Big Data (the former Big Data indicator is now replaced by the adoption of data analytics technologies, so progress is not fully comparable): achieved 44%; target: 75% of companies.
    2.4. Adoption of AI: achieved 11%; target: 75% of companies.
    2.5. Unicorns: achieved 53%; target: 498 (2x the 2022 baseline).
  3. Digital skills
    3.1. Basic digital skills: achieved 64%; target: 80% of individuals.
    3.2. ICT specialists: achieved 48%; target: 20 million employees.
  4. Digital public services
    4.1. Digital public services for citizens: achieved 79%; target: rating of 100.
    4.2. Digital public services for businesses: achieved 85%; target: rating of 100.
    4.3. Access to electronic health records: achieved 79%; target: rating of 100.
    4.4. Electronic identification (eID): achieved 85%; target: 27 million with eID reported.

* Not a KPI, but gives an important indication of high quality 5G coverage.

Source: State of the Digital Decade 2024 Report.

Figure 1. Taking stock of progress towards the Digital Decade goals set for 2030, “State of the Digital Decade 2024 Report”, European Commission.

In recent years, the European Union (EU) has significantly improved its performance by adopting regulatory measures (23 new legislative developments, including the Data Governance Regulation and the Data Regulation) to provide itself with a comprehensive governance framework: the Digital Decade Policy Agenda 2030.

The document includes an assessment of the strategic roadmaps of the various EU countries. In the case of Spain, two main strengths stand out:

  • Progress in the use of artificial intelligence by companies (9.2% compared to 8.0% in Europe), where Spain's annual growth rate (9.3%) is more than three times that of the EU (2.6%).
  • The large number of citizens with basic digital skills (66.2%), compared to the European average (55.6%).

On the other hand, the main challenges to overcome are the adoption of cloud services (27.2% versus 38.9% in the EU) and the number of ICT specialists (4.4% versus 4.8% in Europe).

The following image shows the forecast evolution in Spain of the key indicators analysed for 2024, compared to the targets set by the EU for 2030.

Key performance indicators for Spain. Shows the target set for 2024 (country coverage, % of EU target). Data for 2023 and projections to 2030 can be seen in the source.

  1. Very high capacity fixed network: 97%.
  2. Fiber to the premises (FTTP): 96%.
  3. Overall 5G coverage: 98.9%.
  4. Edge nodes: no data.
  5. Digital intensity of SMEs: 68.3%.
  6. Cloud: 47.3%.
  7. Data analytics: 45.9%.
  8. Artificial intelligence: 14.1%.
  9. Unicorns: 61.5%.
  10. Basic digital skills: 83.6%.
  11. ICT specialists: 50%.
  12. Digital public services for citizens: 88.7%.
  13. Digital public services for businesses: 95%.
  14. Digital health: 87.3%.

Source: State of the Digital Decade 2024 Report.

Figure 2. Key performance indicators for Spain, “State of the Digital Decade 2024 Report”, European Commission.

Spain is expected to reach 100% on virtually all indicators by 2030. To this end, its roadmap foresees an investment of €26.7 billion (1.8% of GDP), without taking into account private investments. This roadmap demonstrates the commitment to achieving the goals and targets of the Digital Decade.

In addition to investment, to achieve the objective, the report recommends focusing efforts in three areas: the adoption of advanced technologies (AI, data analytics, cloud) by SMEs; the digitisation and promotion of the use of public services; and the attraction and retention of ICT specialists through the design of incentive schemes.

European Innovation Scoreboard 2024

The European Innovation Scoreboard carries out an annual benchmarking of research and innovation developments in a number of countries, not only in Europe. The report classifies regions into four innovation groups, ranging from the most innovative to the least innovative: Innovation Leaders, Strong Innovators, Moderate Innovators and Emerging Innovators.

Spain is leading the group of moderate innovators, with a performance of 89.9% of the EU average. This represents an improvement compared to previous years and exceeds the average of other countries in the same category, which is 84.8%. Our country is above the EU average in three indicators: digitisation, human capital and financing and support. On the other hand, the areas in which it needs to improve the most are employment in innovation, business investment and innovation in SMEs. All this is shown in the following graph:

Blocks that make up the synthetic index of innovation in Spain. Score in relation to the EU-27 average in 2024 (=100).

  1. Digitalisation: 145.4%.
  2. Human capital: 124.6%.
  3. Financing and support: 104.4%.
  4. Environmental sustainability: 99.2%.
  5. Collaboration with the system: 96.0%.
  6. Attractive research systems: 90.5%.
  7. Impact of innovation on sales: 90.2%.
  8. Use of ICT: 89.2%.
  9. Products and exports: 82.7%.
  10. Employment in innovation: 62.7%.
  11. Business investment: 62.6%.
  12. Innovation in SMEs: 53.9%.

Source: European Innovation Scoreboard 2024 (adapted from the COTEC Foundation).

Figure 3. Blocks that make up the synthetic index of innovation in Spain, European Innovation Scoreboard 2024 (adapted from the COTEC Foundation).

Spain's Digital Society Report 2023

The Telefónica Foundation also periodically publishes a report which analyses the main changes and trends that our country is experiencing as a result of the technological revolution.

The edition currently available is the 2023 edition. It notes that "Spain continues to deepen its digital transformation process at a good pace and occupies a prominent position in this aspect among European countries", highlighting above all the area of connectivity. However, digital divides remain, mainly due to age.

Progress is also being made in the relationship between citizens and digital administrations: 79.7% of people aged 16-74 used websites or mobile applications of an administration in 2022. On the other hand, the Spanish business fabric is advancing in its digitalisation, incorporating digital tools, especially in the field of marketing. However, there is still room for improvement in aspects of big data analysis and the application of artificial intelligence, activities that are currently implemented, in general, only by large companies.

Artificial Intelligence and Data Talent Report

IndesIA, an association that promotes the use of artificial intelligence and Big Data in Spain, has carried out a quantitative and qualitative analysis of the data and artificial intelligence talent market in 2024 in our country.

According to the report, the data and artificial intelligence talent market represents almost 19% of the total number of ICT professionals in our country. In total, there are 145,000 professionals (+2.8% from 2023), of which only 32% are women. Even so, there is a gap between supply and demand, especially for natural language processing engineers. To address this situation, the report analyses six areas for improvement: workforce strategy and planning, talent identification, talent activation, engagement, training and development, and data-driven culture.

Other reports of interest

The COTEC Foundation also regularly produces various reports on the subject. On its website we can find documents on the budget execution of R&D in the public sector, the social perception of innovation or the regional talent map.

For their part, the Orange Foundation in Spain and the consultancy firm Nae have produced a report to analyse digital evolution over the last 25 years, the same period that the Foundation has been operating in Spain. The report highlights that, between 2013 and 2018, the digital sector contributed around €7.5 billion annually to the country's GDP.

In short, all of them highlight Spain's position among the European leaders in terms of digital transformation, but with the need to make progress in innovation. This requires not only boosting economic investment, but also promoting a cultural change that fosters creativity. A more open and collaborative mindset will allow companies, administrations and society in general to adapt quickly to technological changes and take advantage of the opportunities they bring to ensure a prosperous future for Spain.

Do you know of any other reports on the subject? Leave us a comment or write to us at dinamizacion@datos.gob.es.

Blog

Digital transformation has reached almost every aspect and sector of our lives, and the world of products and services is no exception. In this context, the Digital Product Passport (DPP) concept is emerging as a revolutionary tool to foster sustainability and the circular economy. Accompanied by initiatives such as CIRPASS (Circular Product Information System for Sustainability), the DPP promises to change the way we interact with products throughout their life cycle. In this article, we will explore what DPP is, its origins, applications, risks and how it can affect our daily lives and the protection of our personal data.

What is the Digital Product Passport (DPP)? Origin and importance

The Digital Product Passport is a digital collection of key information about a product, from manufacturing to recycling. This passport allows products to be tracked and managed more efficiently, improving transparency and facilitating sustainable practices. The information contained in a DPP may include details on the materials used, the manufacturing process, the supply chain, instructions for use and how to recycle the product at the end of its life.
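To make the idea more concrete, the following minimal sketch models a passport as a small structured record. The class name, field names and sample values are hypothetical and chosen only to mirror the kinds of information listed above; they do not reflect any official DPP schema or CIRPASS specification.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class DigitalProductPassport:
    """Hypothetical, simplified passport record (not an official schema)."""

    product_id: str
    materials: List[str]
    manufacturing_process: str
    supply_chain: List[str]
    usage_instructions: str
    recycling_instructions: str

    def to_json(self) -> str:
        # Serialise to a machine-readable format so the record can be exchanged.
        return json.dumps(asdict(self), ensure_ascii=False, sort_keys=True)


# Invented sample data for illustration only.
passport = DigitalProductPassport(
    product_id="EXAMPLE-0001",
    materials=["organic cotton", "recycled polyester"],
    manufacturing_process="cut and sew",
    supply_chain=["fibre supplier", "weaving mill", "garment factory"],
    usage_instructions="Machine wash at 30 degrees.",
    recycling_instructions="Remove buttons before textile recycling.",
)

encoded = passport.to_json()
decoded = json.loads(encoded)
print(decoded["materials"])
```

Keeping the record as plain structured data means that any actor in the product's life cycle could, in principle, read the same fields without depending on a particular vendor's software.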

The DPP has been developed in response to the growing need to promote the circular economy and reduce the environmental impact of products. The European Union (EU) has been a pioneer in promoting policies and regulations that support sustainability. Initiatives such as the EU's Circular Economy Action Plan have been instrumental in driving the DPP forward. The objectives of this plan are as follows:

  • Greater Transparency: Consumers no longer have to guess about the origin of their products and how to dispose of them correctly. With a machine-readable DPP (e.g. QR code or NFC tag) attached to end products, consumers can make informed purchasing decisions and brands can eliminate greenwashing with confidence.
  • Simplified Compliance: By creating an audit of events and transactions in a product's value chain, the DPP provides the brand and its suppliers with the necessary data to address compliance demands efficiently.
  • Sustainable Production: By tracking and reporting the social and environmental impacts of a product from source to disposal, brands can make data-driven decisions to optimise sustainability in product development.
  • Circular Economy: The DPP facilitates a circular economy by promoting eco-design and the responsible production of durable products that can be reused, remanufactured and disposed of correctly.

The following image summarises the main advantages of the digital passport at each stage of the digital product manufacturing process:

CIRPASS as a facilitator of DPP implementation

CIRPASS is a platform that supports the implementation of the DPP. This European initiative aims to standardise the collection and exchange of data on products, facilitating their traceability and management throughout their life cycle. CIRPASS plays a crucial role in creating an interoperable digital framework that connects manufacturers, consumers and recyclers.

DPP applications in various sectors

On 5 March 2024, CIRPASS, in collaboration with the European Commission, organised an event on the future development of the Digital Product Passport. The event brought together stakeholders from different industries and organisations, who, with an eminently practical approach, presented and discussed various aspects of the upcoming regulation and its requirements, possible solutions, examples of use cases, and the obstacles and opportunities for the affected industries and businesses.

The following are the applications of DPP in various sectors as explained at the event:

  1. Textile industry: It allows consumers to know the origin of the garments, the materials used and the working conditions in the factories.
  2. Electronics: Facilitates recycling and reuse of components, reducing electronic waste.
  3. Automotive: It assists in tracking parts and materials, promoting the repair and recycling of vehicles.
  4. Food industry: It provides information on food traceability, ensuring safety and sustainability in the supply chain.

The impact of the DPP on citizens' lives

But what impact will the use of this kind of novel paradigm have on our daily lives? And how does this impact on us as end users of multiple products and services such as those mentioned above? We will focus on four base cases: informed consumers in any field, ease of product repair, trust and transparency, and efficient recycling.

The DPP provides consumers with access to detailed information about the products they buy, such as their origin, materials and production practices. This allows consumers to make more informed choices and opt for products that are sustainable and ethical. For example, a consumer can choose a garment made from organic materials and produced under fair labour conditions, thus promoting responsible and conscious consumption.

Similarly, one of the great benefits of the DPP is the inclusion of repair guides within the digital passport. This means that consumers can easily access detailed instructions on how to repair a product instead of discarding it when it breaks down. For example, if an appliance stops working, the DPP can provide a step-by-step repair manual, allowing users to fix it themselves or take it to a technician with the necessary information. This not only extends the lifetime of products, but also reduces e-waste and promotes sustainability.

Also, access to detailed and transparent product information through the DPP can increase consumers' trust in brands. Companies that provide a complete and accurate DPP demonstrate their commitment to transparency and accountability, which can enhance their reputation and build customer loyalty. In addition, consumers who have access to this information are better able to make responsible purchasing decisions, thus encouraging more ethical and sustainable consumption habits.

Finally, the DPP facilitates effective recycling by providing clear information on how to break down and reuse the materials in a product. For example, a citizen who wishes to recycle an electronic device can consult the DPP to find out which parts can be recycled and how to separate them properly. This improves the efficiency of the recycling process and ensures that more materials are recovered and reused instead of ending up in landfill, contributing to a circular economy.

Risks and challenges of the DPP

However, as a novel technology and as part of the digital transformation that is taking place in the product sectors, the DPP also presents certain risks and challenges, such as:

  1. Data Protection: The collection and storage of large amounts of data can put consumers' privacy at risk if not properly managed.
  2. Security: Digital data is vulnerable to cyber-attacks, which requires robust security measures.
  3. Interoperability: Standardisation of data across different industries and countries can be complex, making it difficult to implement the DPP on a large scale.
  4. Costs: Creating and maintaining digital passports can be costly, especially for small and medium-sized enterprises.

Data protection implications

The implementation of the DPP and systems such as CIRPASS implies careful management of personal data. It is essential that companies and digital platforms comply with data protection regulations, such as the EU's General Data Protection Regulation (GDPR). Organisations must ensure that the data collected is used in a transparent manner and with the explicit consent of consumers. In addition, advanced security measures must be implemented to protect the integrity and confidentiality of the data.

Relationship with European Data Spaces

The European Data Spaces are an EU initiative to create a single market for data, promoting innovation and the digital economy. The DPP and CIRPASS are aligned with this vision, as they encourage the exchange of information between different actors in the economy. Data interoperability is essential for the success of the European Data Spaces, and the DPP can contribute significantly to this goal by providing structured and accessible product data.

Conclusion

In conclusion, the Digital Product Passport and the CIRPASS initiative represent a significant step towards a more circular and sustainable economy. Through the collection and exchange of detailed product data, these systems can improve transparency, encourage responsible consumption practices and reduce environmental impact. However, their implementation requires overcoming challenges related to data protection, security and interoperability. As we move towards a more digitised future, the DPP and CIRPASS have the potential to transform the way we interact with products and contribute to a more sustainable world.


Content prepared by Dr. Fernando Gualo, Professor at UCLM and Data Governance and Quality Consultant. The content and the point of view reflected in this publication are the sole responsibility of its author.

News

The European Parliament's tenth parliamentary term started in July, opening a new institutional cycle that will run from 2024 to 2029. The President of the European Commission, Ursula von der Leyen, was elected for a second term, after presenting to the European Parliament her Political Guidelines for the next European Commission 2024-2029.

These guidelines set out the priorities that will guide European policies in the coming years. Among the general objectives, we find that efforts will be invested in:

  1. Facilitate business and strengthen the single market.
  2. Decarbonise and reduce energy prices.
  3. Make research and innovation the engines of the economy.
  4. Boost productivity through the diffusion of digital technology.
  5. Invest massively in sustainable competitiveness.
  6. Close the skills and manpower gap.

In this article, we will explain point 4, which focuses on combating the insufficient diffusion of digital technologies. Ignorance of the technological possibilities available to citizens limits the capacity to develop new services and business models that are competitive on a global level.

Boosting productivity with the spread of digital technology

The previous mandate was marked by the approval of new regulations aimed at fostering a fair and competitive digital economy through a digital single market, where technology is placed at the service of people. Now is the time to focus on the implementation and enforcement of adopted digital laws.

One of the most recently approved regulations is the Artificial Intelligence (AI) Regulation, a reference framework for the development of any AI system. In this standard, the focus was on ensuring the safety and reliability of artificial intelligence, avoiding bias through various measures including robust data governance.

Now that this framework is in place, it is time to push forward the use of this technology for innovation. To this end, the following aspects will be promoted in this new cycle:

  • Artificial intelligence factories. These are open ecosystems that provide an infrastructure for artificial intelligence supercomputing services. In this way, large technological capabilities are made available to start-up companies and research communities.
  • Strategy for the use of artificial intelligence. It seeks to boost industrial uses in a variety of sectors, including the provision of public services in areas such as healthcare. Industry and civil society will be involved in the development of this strategy.
  • European Research Council on Artificial Intelligence. This body will help pool EU resources, facilitating access to them.

But for these measures to be developed, it is first necessary to ensure access to quality data. This data not only supports the training of AI systems and the development of cutting-edge technology products and services, but also helps informed decision-making and the development of more accurate political and economic strategies. As the document itself states: "Access to data is not only a major driver for competitiveness, accounting for almost 4% of EU GDP, but also essential for productivity and societal innovations, from personalised medicine to energy savings".

To improve access to data for European companies and improve their competitiveness vis-à-vis major global technology players, the European Union is committed to "improving open access to data", while ensuring the strictest data protection.

The European data revolution

"Europe needs a data revolution." This is how blunt the President is about the current situation. Therefore, one of the measures that will be worked on is a new EU Data Strategy. It will build on existing standards and is expected to continue the current strategy, whose action lines include the promotion of information exchange through the creation of a single data market where data can flow between countries and economic sectors in the EU.

In this framework, the legislative progress we saw in the last parliamentary term will continue to be very much in evidence.

The aim is to ensure a "simplified, clear and coherent legal framework for businesses and administrations to share data seamlessly and at scale, while respecting high privacy and security standards".

In addition to stepping up investment in cutting-edge technologies, such as supercomputing, the internet of things and quantum computing, the EU plans to continue promoting access to quality data to help create a sustainable and solvent technological ecosystem capable of competing with large global companies. In this space we will keep you informed of the measures taken to this end.

Blog

The European Drug Report provides a current overview of the drug situation in the region, analysing the main trends and emerging threats. It is a valuable publication, with a high number of downloads, which is quoted in many media outlets.

The report is produced annually by the European Union Drugs Agency (EUDA), the current name of the former European Monitoring Centre for Drugs and Drug Addiction. It collects and analyses data from EU Member States, together with other partner countries such as Turkey and Norway, to provide a comprehensive picture of drug use and supply, drug harms and harm reduction interventions. The report contains comprehensive datasets on these issues disaggregated at the national level, and even, in some cases, at the city level (such as Barcelona or Palma de Mallorca).

This study has been carried out since 1993 and is translated into more than 20 official languages of the European Union. However, in the last two years it has introduced a new feature: a change in internal processes to improve the visualisation of the data obtained. This process was explained in the recent webinar "The European Drug Report: using an open data approach to improve data visualisation", organised by the European Open Data Portal (data.europa.eu) on 25 June. The following is a summary of what the Observatory's representatives had to say at this event.

The need for change

The Observatory has always worked with open data, but there were inefficiencies in the process. Until now, the European Drug Report has always been published in PDF format, with the focus on achieving a visually appealing product. The internal process leading up to the publication of the report consisted of several stages involving various teams: 

  1. A team from the Observatory checked the format of the data received from the supplier and, if necessary, adapted it.
  2. A specialised data analysis team created visualisations from the data.
  3. A specialised drafting team drafted the report. The team that had created the visualisations could collaborate in this phase.
  4. An internal team validated the content of the report.
  5. The data provider checked that the Observatory had interpreted the data correctly.

Despite the good reception of the report and its format, in 2022 the Observatory decided to completely change the publication format for the following reasons:

  • Once the various steps of the publication process had been initiated, the data were formatted and were no longer machine-readable. This reduced the accessibility of the data, e.g. for screen readers, and limited its reusability.
  • If errors were detected in the different steps of the process, they were corrected directly on the format the data had at that step. In other words, if an error was detected in a chart during the revision phase, it was corrected directly on that chart. This procedure could introduce errors and reduce the traceability of the data, and it limited efficiency: the same static graph could appear several times in the document and each instance had to be corrected individually.
  • At the end of the process, the format of the source data had to be adjusted due to changes in the publication procedure.
  • Many of the users who consulted the report did so from a mobile device, for which the PDF format was not always suitable.
  • Because they are neither accessible nor mobile-friendly, PDF documents did not usually appear as the first result in search engines. This point is important for the Observatory, as many users find the report through search engines.

A responsive web format was needed, one that automatically adjusts a website to the size and layout of its users' devices. The aims were:

  • Improved accessibility.
  • A more streamlined process for creating visualisations.
  • An easier translation process.
  • An increase in visitors from search engines.
  • Greater modularity.

The process behind the new report

In order to completely transform the publication format of the report, an ad hoc visualisation process has been developed, summarised in the following image:

Process for creating visualizations for the European Drug Report. The user accesses the web page. The web server returns the page in html.  Browser downloads all necessary files, including the data visualization library.  The visualization library inspects the web page for “chart parameters”, downloads the data and creates a JS object that can be understood by HighCharts (or another charting library).  HighCharts creates the charts.  Source:  Webinar “The European Drug Report using an open data approach to improve data visualisation”, organized by data.europa.eu.

Figure 1. Process for creating visualizations for the European Drug Report. Source: Webinar “The European Drug Report using an open data approach to improve data visualisation”, organized by data.europa.eu.

The main new feature is that visualisations are created dynamically from the source data. In this way, if something is changed in these data, it is automatically changed in all visualisations that feed on it. Using the Drupal content management system, on which much of the site is based, administrators can register changes that will automatically be reflected in the HTML and therefore in the displays. In addition, site administrators have a visualisation generator which, based on data and indications - equivalent to simple instructions such as "sort from highest to lowest" expressed in HTML - creates visualisations without the need to touch code.

The same dynamic update procedure applies to the PDF that the user can download. If there are changes in the data, in the visualisations or if typographical errors are corrected, the PDF is generated again through a compilation process that the Observatory has created specifically for this task.
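As a rough illustration of the idea of generating charts from source data plus declarative "chart parameters", the following minimal Python sketch turns CSV text into a dictionary shaped like a Highcharts options object. It is not the Observatory's code: the real pipeline runs in the browser in JavaScript, and the function name, the parameter names ("x", "y", "sort", "type") and the sample data are all assumptions made for this example.

```python
import csv
import io


def build_chart_options(csv_text, params):
    """Turn CSV source data plus declarative chart parameters into a
    Highcharts-style options dictionary (structure simplified)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    x_col, y_col = params["x"], params["y"]
    points = [(row[x_col], float(row[y_col])) for row in rows]

    # An instruction such as "sort from highest to lowest" is expressed
    # as a parameter, so nobody has to edit chart code by hand.
    if params.get("sort") == "desc":
        points.sort(key=lambda point: point[1], reverse=True)

    return {
        "chart": {"type": params.get("type", "column")},
        "title": {"text": params.get("title", "")},
        "xAxis": {"categories": [label for label, _ in points]},
        "series": [{"name": y_col, "data": [value for _, value in points]}],
    }


# Hypothetical source data and parameters, for illustration only.
SAMPLE_CSV = "country,value\nA,3.5\nB,7.25\nC,1.0\n"
SAMPLE_PARAMS = {"x": "country", "y": "value", "sort": "desc", "type": "bar"}

options = build_chart_options(SAMPLE_CSV, SAMPLE_PARAMS)
print(options["xAxis"]["categories"])  # ['B', 'A', 'C']
```

Because the chart is derived from the CSV every time, correcting a value in the source data corrects every chart built from it, which is the traceability benefit described in this section.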

The report after the change

The report is currently published in HTML version, with the possibility to download chapters or the full report in PDF format. It is structured by thematic modules and also allows the consultation of annexes.

Furthermore, the data are always published in CSV format and the licensing conditions of the data (CC-BY-4.0) are indicated on the same page. The reference to the source of the data is always made available to the reader on the same page as a visualisation.

With this change in procedure and format, benefits for all have been achieved. From the readers' point of view, the user experience has been improved. For the organisation, the publication process has been streamlined.

In terms of open data, this new approach allows for greater traceability, as the data can be consulted at any time in its current format. Moreover, according to the Observatory speakers, this new format of the report, together with the fact that the data and visualisations are always up-to-date, has increased the accessibility of the data for the media.

You can access the webinar materials here:

Blog

The publication on Friday 12 July 2024 of the Artificial Intelligence Regulation (AIA) opens a new stage in the European and global regulatory framework. The standard is characterised by an attempt to combine two souls. On the one hand, it is about ensuring that technology does not create systemic risks for democracy, the guarantee of our rights and the socio-economic ecosystem as a whole. On the other hand, a targeted approach to product development is sought in order to meet the high standards of reliability, safety and regulatory compliance defined by the European Union.

Scope of application of the standard

The standard allows differentiation between low- and medium-risk systems, high-risk systems and general-purpose AI models. In order to qualify systems, the AIA defines criteria related to the sector regulated by the European Union (Annex I) and defines the content and scope of those systems which by their nature and purpose could generate risks (Annex III). The models are highly dependent on the volume of data, their capacities and operational load.

AIA only affects the latter two cases: high-risk systems and general-purpose AI models. High-risk systems require conformity assessment through notified bodies, the entities to which evidence is submitted that the development complies with the AIA. General-purpose models, for their part, are subject to control mechanisms by the Commission aimed at preventing systemic risks. However, this is a flexible regulatory framework that favours research by relaxing its application in experimental environments, as well as through the deployment of sandboxes for development.

The standard sets out a series of "requirements for high-risk AI systems" (section two of chapter three) which should constitute a reference framework for the development of any system and inspire codes of good practice, technical standards and certification schemes. In this respect, Article 10 on "data and data governance" plays a central role. It provides very precise indications on the design conditions for AI systems, particularly when they involve the processing of personal data or when they are projected on natural persons.

This governance should be considered by those providing the basic infrastructure and/or datasets, managing data spaces or so-called Digital Innovation Hubs, offering support services. In our ecosystem, characterised by a high prevalence of SMEs and/or research teams, data governance is projected on the quality, security and reliability of their actions and results. It is therefore necessary to ensure the values that AIA imposes on training, validation and test datasets in high-risk systems, and, where appropriate, when techniques involving the training of AI models are employed.

These values can be aligned with the principles of Article 5 of the General Data Protection Regulation (GDPR) and enrich and complement them. To these are added the risk approach and data protection by design and by default. Relating one to the other is certainly an interesting exercise.

Ensure the legitimate origin of the data. Fairness and lawfulness

Alongside the common reference to the value chain associated with data, reference should be made to a 'chain of custody' to ensure the legality of data collection processes. The origin of the data, particularly in the case of personal data, must be lawful, legitimate and its use consistent with the original purpose of its collection. A proper cataloguing of the datasets at source is therefore indispensable to ensure a correct description of their legitimacy and conditions of use.

This is an issue that concerns open data environments, data access bodies and services detailed in the Data Governance Regulation (DGA) or the European Health Data Space (EHDS) and is sure to inspire future regulations. It is usual to combine external data sources with the information managed by the SME.

Data minimisation, accuracy and purpose limitation

AIA mandates, on the one hand, an assessment of the availability, quantity and adequacy of the required datasets. On the other hand, it requires that the training, validation and test datasets are relevant, sufficiently representative and possess adequate statistical properties. In addition, they shall, to the greatest extent possible, be error-free and complete in view of their intended purpose. This task is highly relevant to the rights of individuals or groups affected by the system. The AIA allows these properties to be met by each dataset individually or by a combination of datasets.

In order to achieve these objectives, it is necessary to ensure that appropriate techniques are deployed:

  • Perform appropriate processing operations for data preparation, such as annotation, tagging, cleansing, updating, enrichment and aggregation.
  • Formulate assumptions, in particular with regard to the information that the data are supposed to measure and represent. Or, to put it more colloquially, define use cases.
  • Take into account, to the extent necessary for the intended purpose, the particular characteristics or elements of the specific geographical, contextual, behavioural or functional environment in which the high-risk AI system is intended to be used.
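By way of illustration, two of the properties mentioned above, completeness and representativeness, can be approximated with very simple checks. The sketch below is only one possible approach and is not prescribed by the AIA; the field names, the sample records and the reference shares are hypothetical.

```python
from collections import Counter


def completeness(records, required_fields):
    """Share of records in which every required field is present and non-empty."""
    if not records:
        return 0.0
    complete = sum(
        all(rec.get(field) not in (None, "") for field in required_fields)
        for rec in records
    )
    return complete / len(records)


def representation_gaps(records, field, reference_shares):
    """Difference between each group's share in the dataset and a reference
    share (for example, the share of that group in the target population)."""
    counts = Counter(
        rec[field] for rec in records if rec.get(field) not in (None, "")
    )
    total = sum(counts.values())
    return {
        group: (counts.get(group, 0) / total if total else 0.0) - expected
        for group, expected in reference_shares.items()
    }


# Hypothetical training records and reference population shares.
records = [
    {"age_band": "18-39", "region": "north", "outcome": 1},
    {"age_band": "18-39", "region": "south", "outcome": 0},
    {"age_band": "40-64", "region": "north", "outcome": 1},
    {"age_band": "65+", "region": "", "outcome": 0},
]
reference = {"18-39": 0.25, "40-64": 0.5, "65+": 0.25}

print(completeness(records, ["age_band", "region", "outcome"]))  # 0.75
print(representation_gaps(records, "age_band", reference))
# {'18-39': 0.25, '40-64': -0.25, '65+': 0.0}
```

A positive gap means the group is over-represented with respect to the reference, a negative one that it is under-represented. Which gaps are acceptable depends on the intended purpose and the context of use of the system.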

Managing risk: avoiding bias 

In the area of data governance, a key role is attributed to the avoidance of bias where it may lead to risks to the health and safety of individuals, adversely affect fundamental rights or give rise to discrimination prohibited by Union law, in particular where data outputs influence incoming information for future operations. To this end, appropriate measures should be taken to detect, prevent and mitigate possible biases identified.
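As a hedged example of what detecting a possible bias can look like in practice, the sketch below compares the rate of favourable outcomes between groups. This is just one simple indicator among many, and it is an assumption of this example rather than a method set out in the AIA; the group labels, the data and the 0.2 threshold are purely illustrative.

```python
def positive_rate_by_group(records, group_field, outcome_field):
    """Rate of favourable outcomes (truthy values) for each group."""
    totals, positives = {}, {}
    for rec in records:
        group = rec[group_field]
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + (1 if rec[outcome_field] else 0)
    return {group: positives[group] / totals[group] for group in totals}


def max_rate_gap(rates):
    """Largest difference in favourable-outcome rate between any two groups."""
    return max(rates.values()) - min(rates.values())


# Hypothetical system outputs: 1 = favourable decision, 0 = unfavourable.
decisions = [
    {"group": "a", "decision": 1},
    {"group": "a", "decision": 1},
    {"group": "a", "decision": 0},
    {"group": "a", "decision": 1},
    {"group": "b", "decision": 1},
    {"group": "b", "decision": 0},
    {"group": "b", "decision": 0},
    {"group": "b", "decision": 0},
]

rates = positive_rate_by_group(decisions, "group", "decision")
gap = max_rate_gap(rates)
print(rates)      # {'a': 0.75, 'b': 0.25}
print(gap > 0.2)  # True: worth investigating under this illustrative threshold
```

A large gap does not prove discrimination by itself, but it signals where the dataset and the system deserve closer examination.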

The AIA exceptionally enables the processing of special categories of personal data provided that they offer adequate safeguards in relation to the fundamental rights and freedoms of natural persons. But it imposes additional conditions:

  • that the processing of other data, such as synthetic or anonymised data, does not allow effective detection and correction of biases;
  • that special categories of personal data are subject to technical limitations concerning the re-use of personal data and to state-of-the-art security and privacy protection measures, including pseudonymisation;
  • that special categories of personal data are subject to measures to ensure that the personal data processed are secured, protected and subject to appropriate safeguards, including strict controls and documentation of access, to prevent misuse and to ensure that only authorised persons have access to such personal data with appropriate confidentiality obligations;
  • that special categories of personal data are not transmitted or transferred to third parties and are not otherwise accessible to them;
  • that special categories of personal data are deleted once the bias has been corrected or the personal data have reached the end of their retention period, whichever is the earlier;
  • that the records of processing activities under Regulations (EU) 2016/679 and (EU) 2018/1725 and Directive (EU) 2016/680 include the reasons why the processing of special categories of personal data was strictly necessary for detecting and correcting bias, and why that purpose could not be achieved by processing other data.
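Pseudonymisation, mentioned among the conditions above, can be implemented in many ways. A minimal sketch, assuming a keyed hash (HMAC-SHA256) is an acceptable technique for the case at hand, could look like this. The field names and the key are hypothetical, and in a real deployment the key would have to be stored separately from the data and under strict access control.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256).
    Without the key, the token cannot be recomputed from the identifier."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def pseudonymise_record(record: dict, id_field: str, secret_key: bytes) -> dict:
    """Return a copy of the record with the identifier field pseudonymised."""
    result = dict(record)
    result[id_field] = pseudonymise(str(record[id_field]), secret_key)
    return result


# Hypothetical key and record, for illustration only.
KEY = b"replace-with-a-secret-key-kept-apart-from-the-data"
patient = {"national_id": "12345678Z", "age_band": "40-64"}

safe = pseudonymise_record(patient, "national_id", KEY)
print(safe["national_id"] != patient["national_id"])  # True
```

The same identifier always yields the same token under the same key, so records can still be linked for bias analysis, while re-identification requires access to the key.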

The regulatory provisions are extremely interesting. The GDPR, the DGA and the EHDS favour the processing of anonymised data. The AIA makes an exception for cases where relying on such data would produce datasets that are inadequate or of low quality from a bias point of view.

Individual developers, data spaces and intermediary services providing datasets and/or platforms for development must be particularly diligent in defining their security. This provision is consistent with the requirement to have secure processing spaces in EHDS, implies a commitment to certifiable security standards, whether public or private, and advises a re-reading of the seventeenth additional provision on data processing in our Organic Law on Data Protection in the area of pseudonymisation, insofar as it adds ethical and legal guarantees to the strictly technical ones.  Furthermore, the need to ensure adequate traceability of uses is underlined. In addition, it will be necessary to include in the register of processing activities a specific mention of this type of use and its justification.

Apply lessons learned from data protection, by design and by default

Article 10 of AIA requires the documentation of relevant design decisions and the identification of relevant data gaps or deficiencies that prevent compliance with AIA and how to address them. In short, it is not enough to ensure data governance, it is also necessary to provide documentary evidence and to maintain a proactive and vigilant attitude throughout the lifecycle of information systems.

These two obligations form the keystone of the system. And its reading should even be much broader in the legal dimension. Lessons learned from the GDPR teach that there is a dual condition for proactive accountability and the guarantee of fundamental rights. The first is intrinsic and material: the deployment of privacy engineering in the service of data protection by design and by default ensures compliance with the GDPR. The second is contextual: the processing of personal data does not take place in a vacuum, but in a broad and complex context regulated by other sectors of the law.

Data governance operates structurally from the foundation to the vault of AI-based information systems. Ensuring that it exists, is adequate and functional is essential. This is the understanding of the Spanish Government's Artificial Intelligence Strategy 2024, which seeks to provide the country with the levers to boost our development.

AIA makes a qualitative leap and underlines the functional approach from which data protection principles should be read by stressing the population dimension. This makes it necessary to rethink the conditions under which the GDPR has been complied with in the European Union. There is an urgent need to move away from template-based models that consultancy firms copy and paste. It is clear that checklists and standardisation are indispensable. However, their effectiveness is highly dependent on fine tuning. And this calls particularly on the professionals who support the fulfilment of this objective to dedicate their best efforts to give deep meaning to the fulfilment of the Artificial Intelligence Regulation.

You can see a summary of the regulations in the following infographic:

Screenshot of the infographic

You can access the accessible and interactive version here

Content prepared by Ricard Martínez, Director of the Chair of Privacy and Digital Transformation. Professor, Department of Constitutional Law, Universitat de València. The contents and points of view reflected in this publication are the sole responsibility of its author.

Blog

For some time now we have been hearing about high-value datasets, those datasets whose re-use is associated with considerable benefits for society, the environment and the economy. They were announced in Directive (EU) 2019/1024 of the European Parliament and of the Council of 20 June 2019 on open data and the re-use of public sector information, and subsequently defined in Commission Implementing Regulation (EU) 2023/138 of 21 December 2022 establishing a list of specific high-value datasets and modalities for publication and re-use.

In particular, six categories of dataset are concerned: geospatial, Earth observation and environment, meteorology, statistics, companies and company ownership, and mobility. The detail of these categories and how these datasets should be opened is summarised in the following infographic:

Infographic-summary on high-value datasets. Version accessible by clicking.

Click on the image or here to expand and access the accessible version

For years, even before the publication of Directive (EU) 2019/1024, Spanish organisations have been working to make this type of dataset available to developers, companies and any citizen who wants to use them, with technical characteristics that facilitate their reuse. However, the Regulation has laid down a number of specific requirements to be met. Below is a summary of the progress made in each category.

Geospatial data

For geospatial data, Implementing Regulation (EU) 2023/138 takes into account the categories indicated in Directive 2007/2/EC of the European Parliament and of the Council of 14 March 2007 establishing an Infrastructure for Spatial Information in the European Community (INSPIRE), with the exception of agricultural and reference parcels, for which Regulation (EU) 2021/2116 of the European Parliament and of the Council of 2 December 2021 applies.

Spain has complied with the INSPIRE Directive for years, thanks to Law 14/2010 of 5 July 2010 on geographic information infrastructures and services in Spain (LISIGE), which transposes the Directive. Citizens have at their disposal the Official Catalogue of INSPIRE Data and Services of Spain, as well as the catalogues of the Spatial Data Infrastructures of the Autonomous Communities. This has resulted in comprehensive geographical coverage, with exhaustive metadata, which complies with European requirements.

  • You can see the datasets currently published by our country in this category on the INSPIRE Geoportal. You can read more about it in this post.

Earth observation and environmental data

For the category of Earth observation and environment data, both the environmental and climate datasets listed in the annexes of the INSPIRE Directive and those produced in the context of a number of legal acts known as priority data, detailed in the Implementing Regulation, are taken into account.

As with the previous category, the fact of having the LISIGE law, which develops INSPIRE and goes further in the obligations set out, has meant that many of these datasets were already available prior to the Implementing Regulation.

  • You can see the datasets currently published by Spain in this category on the INSPIRE Geoportal and read more about their publication in Spain here.

Meteorological data

The meteorological thematic category encompasses collections of data on observations measured by various elements, such as weather stations, radars, etc.

In Spain, the State Meteorological Agency (AEMET) has a portal, AEMET OpenData, which was a pioneer in Europe in terms of the availability of open meteorological data. In this portal we find that most of the high-value datasets are already available, grouped in the 14 categories of AEMET OpenData. Work is ongoing to expand the available datasets, their granularity and other technical aspects to further enhance their usability.

  • You can see a more detailed review of the current publication status of the datasets in this category in this post.

Statistical data

High-value statistical data are covered by a number of legal acts detailed in the Annex to the Implementing Regulation. This category is based on the European Statistical System, which ensures quality and interoperability between states.

In line with this system, Spain has the National Statistical Plan. This plan is developed and implemented through specific annual programmes detailing statistical operations, their objectives, bodies involved and budget appropriations, many of which are aligned with the statistical packages detailed in the Implementing Regulation.

  • You can see the detail of the equivalence between the high value data and the datasets published as the result of the National Statistical Plan in this article. You can also see the details of the data published by the National Statistics Institute (INE) here.

Company data and company ownership

Company and company ownership data refer to datasets containing basic company information, including company documents and accounts.

In Spain, information from the Official Gazette of the Mercantile Registry (BORME, by its Spanish acronym) is offered openly, with temporal coverage since 2009. However, work continues on opening up more datasets in this category.

Mobility data

The mobility category includes datasets falling under the domain "Transport Networks", included in Annex I of the INSPIRE Directive, together with those referred to in Directive 2005/44/EC of the European Parliament and of the Council of 7 September 2005 on harmonised River Information Services (RIS) on inland waterways in the Community.

As was the case for other categories where high-value datasets were already covered by the INSPIRE Directive, Spain has a large number of datasets available on the Geoportal of the Spatial Data Infrastructure of Spain (IDEE) and the infrastructures of the Autonomous Communities.

 

The large number of datasets published reflects our country's continued commitment to transparency and access to high-value datasets. This is an ongoing effort, the result of the collaboration and involvement of various organisations. Work continues to provide the public with as much quality data as possible.

Blog

The cross-cutting nature of open weather and climate data has favoured its use in areas as diverse as precision agriculture, fire prevention or precision forestry. But the relevance of these datasets lies not only in their direct applicability across multiple industries, but also in their contribution to tackling the challenges related to climate change and environmental sustainability, which the different action lines of the European Green Deal seek to address.

The European Commission considers meteorological data to be high-value data, in accordance with the annex to Regulation 2023/138. In this post we explain which specific datasets are considered to be of high value and the level of availability of this type of data in Spain.

The State Meteorological Agency

In Spain, the State Meteorological Agency (AEMET) is responsible for providing meteorological and climatological services at national level. As part of the Ministry for Ecological Transition and the Demographic Challenge, AEMET leads the related activities of observation, prediction and study of meteorological and climatic conditions, as well as research related to these fields. Its mission includes the provision and dissemination of essential information and forecasts of general interest. This information can also support relevant areas such as civil protection, air navigation, national defence and other sectors of activity.

In order to fulfil this mission, AEMET manages an open data portal that enables the reuse by natural or legal persons, for commercial or non-commercial purposes, of part of the data it generates, prepares and safeguards in the performance of its functions. This portal, known as AEMET OpenData, currently offers two modalities for accessing and downloading data in reusable formats:

  • General access, which consists of graphical access for the general public through human-friendly interfaces.
  • AEMET OpenData API, designed for periodic or scheduled interactions in any programming language, which allows developers to include AEMET data in their own information systems and applications.

In addition, in accordance with Regulation 2023/138, it is envisaged to enable a third access route that would allow re-users to obtain packaged datasets for mass downloading where possible.

In order to access any of the datasets, an access key (API Key) is required. It can be obtained through a simple request that asks only for an e-mail address, to which the key is sent, without any additional data from the applicant. This is a control measure to ensure that the service is provided with adequate quality and in a non-discriminatory manner for all users.
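As an orientation for developers, the following Python sketch shows how a request to the AEMET OpenData API might be prepared once an API Key has been obtained. Several details are assumptions that should be checked against the official AEMET OpenData documentation: the base URL, the example endpoint path, the use of an api_key query parameter, and a two-step pattern in which the first response is a small JSON envelope whose "datos" field points to the actual data.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL of the service; verify it in the official documentation.
BASE_URL = "https://opendata.aemet.es/opendata/api"


def build_request_url(endpoint: str, api_key: str) -> str:
    """Compose the URL for an endpoint, passing the key as a query parameter
    (assumed mechanism)."""
    query = urllib.parse.urlencode({"api_key": api_key})
    return f"{BASE_URL}/{endpoint.strip('/')}?{query}"


def extract_data_url(envelope: dict) -> str:
    """Read the data location from the first response. The field names
    'estado' and 'datos' are assumptions about the envelope format."""
    if envelope.get("estado") != 200:
        raise RuntimeError(f"Request not successful: {envelope}")
    return envelope["datos"]


def fetch_json(url: str, timeout: float = 30.0):
    """Download and parse a JSON document. The encoding is an assumption;
    adjust it if the service reports a different charset."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8", errors="replace"))


# Hypothetical usage (requires a real key and network access):
# url = build_request_url("observacion/convencional/todas", "YOUR_API_KEY")
# observations = fetch_json(extract_data_url(fetch_json(url)))
```

Keeping the URL construction and the envelope parsing in separate functions makes them easy to test without calling the service.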

AEMET OpenData also pioneered the availability of open meteorological data in Europe, reflecting AEMET's commitment to the continuous improvement of meteorological services, support to the scientific and technological community, and the promotion of a more informed and resilient society in the face of climate challenges.

High-value meteorological datasets

The Annex to Regulation (EU) 2023/138 details five high-value meteorological datasets: weather station observations, validated weather data observations, weather warnings, radar data and numerical weather prediction (NWP) model data. For each of the sets, the regulation specifies the granularity and the main attributes to be published.

If we compare the datasets currently available on the AEMET OpenData portal, grouped in 14 categories, with the five datasets that will become mandatory in the coming months, we obtain the conclusions summarised in the following table:

High-value meteorological datasets | Equivalence in the AEMET OpenData datasets
Observation data measured by meteorological stations | The "Conventional Observation" dataset, generated by the Observing Service, provides a large number of hourly variables on liquid and solid precipitation, wind speed and direction, humidity, pressure, air, soil and subsoil temperature, visibility, etc. It is updated twice an hour. In accordance with the Regulation, ten-minute data shall be included with continuous updating.
Climate data: validated observations | Within the category "Climatological Values", four datasets on climate data observations are provided: "Daily climatologies", "Monthly/annual climatologies", "Normal values" and "Recorded extremes". The validated dataset provided by the National Climatological Data Bank Service is normally updated once a day with a delay of four days due to validation processes. Attributes available include daily mean temperature, daily precipitation in its standard 07:00 to 07:00 measurement form, daily mean relative humidity, maximum gust direction, etc. In accordance with the Regulation, the inclusion of hourly climatology is planned.
Weather warnings | "Adverse weather warnings" are provided for the whole of Spain, or segmented by province or Autonomous Community, covering both the latest warnings issued and the historical ones since 2018. They provide data on observed and/or forecast severe weather events, from the present time until the next 72 hours. These warnings refer to each meteorological parameter by warning level, for each weather zone defined in the Meteoalert Plan. The dataset is generated by the Adverse Events Functional Groups and the information is available whenever an adverse weather warning is issued, in line with the Regulation, which requires the dataset to be published "as issued or hourly". In this case, AEMET announces preferential broadcasting hours: 09:00, 11:30, 23:00 and 23:50.
Radar data | There are two sets of data: "Regional radar graphic image" and "National radar composition image", which provide reflectivity images, but not the other products described in the Regulation (backscatter, polarisation, precipitation, wind and echotop). The dataset is generated by the Land Remote Sensing group and the information is available at a periodicity of 10 minutes instead of the 5 minutes recommended in the Regulation. However, according to the AEMET Strategic Plan 2022-2025, the updating of the 15 weather radars and the incorporation of new, higher-resolution radars is foreseen, so that, in addition to strengthening the early warning system, the obligations of the Regulation can be fulfilled.
NWP model data | There are several datasets with forecast information, some available for download and some available on the web: weather forecast, normalised text forecast, specific forecasts, maritime forecast and maps of weather variables from the HARMONIE-AROME numerical models for different geographical areas and time periods. However, according to its frequently asked questions document, AEMET does not currently consider numerical model outputs as open data. AEMET offers the possibility of requesting this or any other dataset through the general register or through the electronic site, but this is not an option provided for in the Regulation. To align with the Regulation, the inclusion of numerical atmospheric and wave model outputs is foreseen.

Figure 1: Table showing the equivalence between high value datasets and AEMET OpenData datasets.
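As a rough illustration of the gaps described in the table, the following Python sketch records the two rows where the text gives both a current and a target periodicity, and computes how much more frequent publication would need to be. The class and field names are hypothetical, "twice an hour" is read here as one update every 30 minutes, and treating the ten-minute granularity as an update interval is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateGap:
    """One row of the comparison: current vs. target periodicity (hypothetical model)."""

    dataset: str
    current_minutes: int  # periodicity currently offered, as read from the table
    target_minutes: int   # periodicity named in the Regulation, as read from the table

    @property
    def factor(self) -> float:
        """How many times more frequent the publication would need to be."""
        return self.current_minutes / self.target_minutes


# Values taken from the table; "twice an hour" is interpreted as every 30 minutes.
GAPS = [
    UpdateGap("Weather station observations", current_minutes=30, target_minutes=10),
    UpdateGap("Radar data", current_minutes=10, target_minutes=5),
]


def describe(gap: UpdateGap) -> str:
    """Return a one-line, human-readable summary of a gap."""
    return (
        f"{gap.dataset}: every {gap.current_minutes} min today, "
        f"every {gap.target_minutes} min in the Regulation "
        f"({gap.factor:g}x more frequent)"
    )


if __name__ == "__main__":
    for gap in GAPS:
        print(describe(gap))
```

The other rows (climate data, warnings and NWP outputs) are not included because the table describes them in terms of new content rather than a simple change of frequency.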

The regulation also sets out a number of requirements for publication in terms of format, licence granted, frequency of updating and timeliness, means of access and metadata provided.

In the case of metadata, AEMET publishes, in machine-readable format, the main characteristics of the downloaded file: who prepares it, how often it is prepared, what it contains and its format, as well as information on the data fields (meteorological variable, unit of measurement, etc.). The copyright and terms of use are also specified by means of the legal notice. In this regard, it is foreseen that the current licences will be reviewed to make the datasets available under a licensing scheme compliant with the Regulation, possibly by adopting the CC BY-SA 4.0 licence, following the recommendation.

All in all, it seems that the long track record of the State Meteorological Agency (AEMET) in providing quality open data has put it in a good position to comply with the requirements of the new regulation, with some adjustments to the datasets it already offers through AEMET OpenData to align them with the new obligations. AEMET plans to add to this service the datasets required by the Regulation that are not currently available, as it adapts its regulations on public prices, as well as the infrastructure and systems that make this possible. The additional datasets will be ten-minute observation data, hourly climatologies, and some parameters from regional radars and from numerical wave and forecast models.


Content prepared by Jose Luis Marín, Senior Consultant in Data, Strategy, Innovation & Digitalization. The contents and views reflected in this publication are the sole responsibility of the author.

News

The European open data portal (data.europa.eu) regularly organises virtual training sessions on topical issues in the open data sector, the regulations that affect it and related technologies. In this post, we review the key takeaways from the latest webinar on High Value Datasets (HVD).

Among other issues, this seminar focused on sharing best practices and on the experiences of two countries, Finland and the Czech Republic, which were covered in the report "High-value Datasets Best Practices in Europe", published by data.europa.eu, together with Denmark, Estonia, Italy, the Netherlands and Romania. The study was conducted immediately after the publication of the HVD implementing regulation in February 2023.

Best practices linked to the provision of high-value data

After an introduction explaining what high-value data are and what requirements they have to meet, the scope of the report was explained in detail during the webinar. In particular, challenges, good practices and recommendations from member states were identified, as detailed below.

Political and legal framework

  • There is a need to foster a government culture that is primarily practical and focused on achievable goals, building on cultural values embedded in government systems, such as transparency.
  • A strategic approach based on a broader regulatory perspective is recommended, building on previous efforts to implement far-reaching directives such as INSPIRE or DCAT as a standard for data publication. In this respect, it is appropriate to prioritise actions that overlap with these existing initiatives.
  • The use of Creative Commons (CC) licences is recommended.
  • On a cross-cutting level, another challenge is to combine compliance with the requirements of high-value datasets with the provisions of the General Data Protection Regulation (GDPR), when dealing with sensitive or personal data.

Governance and processes

  • Engaging in strategic partnerships and fostering collaboration at national level is encouraged. Among other issues, it is recommended to coordinate efforts between ministries, agencies responsible for different categories of HVD and other related actors, especially in Member States with decentralised governance structures. To this end, it is important to set up interdisciplinary working groups to facilitate a comprehensive data inventory and to clarify which agency is responsible for which dataset. These groups will enable knowledge sharing and foster a sense of community and shared responsibility, which contributes to the overall success of data governance efforts.
  • It is recommended to engage in regular exchanges with other Member States, to share ideas and solutions to common challenges.
  • There is a need to promote sustainability through the individual accountability of agencies for their respective datasets. Ensuring the sustainability of national data portals means making sure that metadata is maintained with the resources available.
  • It is advisable to develop a comprehensive data governance framework by first assessing available resources, including technical expertise, data management tools and key stakeholder input. This assessment process allows for a clear understanding of the rules, processes and responsibilities necessary for an effective implementation of data governance.

Technical aspects, metadata quality and new requirements

  • It is proposed to develop a comprehensive understanding of the specific requirements for HVD. This involves identifying existing datasets to determine their compliance with the standards described in the implementing regulation for HVD. There is a need to build a systematic basis for identifying datasets and improving their quality and availability, thereby enhancing the overall value of high-value datasets.
  • It is recommended to improve the quality of metadata directly at the data source before publishing them in portals, following the DCAT-AP guidelines for publishing high-value datasets and the controlled vocabularies for the six HVD categories. There is also a need to improve the implementation of APIs and bulk downloads from each data source. Implementing them presents significant challenges due to the scarcity of resources and expertise, making capacity building and resourcing essential.
  • It is suggested to strengthen the availability of high-value datasets through external funding or strategic planning. The regulation requires all HVD to be accessible free of charge, so some Member States diversify funding sources by seeking financial support through external channels, e.g. by tapping into European projects. In this respect, it is recommended to adapt business models progressively to offer free data.

Finally, the report highlights a suggested eight-step roadmap for compliance with the HVD implementation regulation:

  1. Develop a detailed compliance plan
  2. Establish cross-departmental working groups
  3. Conduct a comprehensive inventory
  4. Enhance metadata quality and standardisation
  5. Update data distribution practices
  6. Collaborate with the European Commission and peers
  7. Monitor and evaluate progress
  8. Provide ongoing training and support

Figure 1: Suggested roadmap for compliance with the HVD implementing regulation. Adapted from Figure 3 of the report "High-value Datasets Best Practices in Europe", by the European Data Portal.

The example of the Czech Republic

In the second part of the webinar, the Czech Republic presented its implementation case, which it is approaching through four main tasks: motivation, regulatory implementation, responsibility of public data provider agencies and technical requirements.

  • Motivation among the different actors is being articulated through the constitution of working groups.
  • Regulatory implementation focuses on dataset analysis and consistency or inconsistency with INSPIRE.
  • To boost the accountability of public agencies, knowledge-sharing seminars are being held on linking INSPIRE and HVD using the DCAT-AP standard as a publication pathway.
  • Regarding technical requirements, DCAT-AP and INSPIRE requirements are being integrated into metadata practices adapted to their national context. The Czech Republic has developed specifications for local open data catalogues to ensure compatibility with the National Open Data Catalogue. However, its biggest challenge is a strong dependency due to a lack of technical capacities. 

The example of Finland

Finland then took the floor. Having pre-existing legislation (INSPIRE and other specific rules on open data and information management in public bodies), Finland required only minor adjustments to align with the national transposition of the HVD directive. The challenge is to understand both frameworks and make INSPIRE and HVD coexist.

Its main strategy is based on the roadmap on information management in public bodies, which ensures harmonisation, interoperability, high quality management and security to implement the principles of open data. In addition, they have established two working groups to address the implementation of HVD:

  • The first group, which is a coordinating group of data promoters, focused on practical and technical issues. As legal experts, they also provided guidance on understanding HVD regulation from a legal perspective.
  • The second group is an inter-ministerial coordination group, a working group that ensures that there is no conflict or overlap between HVD regulation and national legislation. This group manages the inventory, in spreadsheet format, containing all the elements necessary for an HVD catalogue. By identifying areas where datasets do not meet these requirements, organisations can establish a roadmap to address the gaps and ensure full compliance over time.

The secretariat of the groups is provided by a geospatial data committee. Both have a wide network of stakeholders to articulate discussion and feedback on the measures taken.
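The kind of gap analysis that a spreadsheet inventory makes possible can be sketched in a few lines of Python. This is only an illustration: the column names, the requirement flags and the sample rows are hypothetical, since the webinar summary does not describe the actual layout of the Finnish inventory.

```python
import csv
import io

# Hypothetical requirement columns; the real inventory layout is not described in the text.
REQUIRED_FLAGS = ("has_api", "has_bulk_download", "open_licence", "has_metadata")

# Illustrative rows only, not real inventory data.
SAMPLE_INVENTORY = """dataset,agency,has_api,has_bulk_download,open_licence,has_metadata
Dataset A,Agency 1,yes,yes,yes,yes
Dataset B,Agency 2,no,yes,yes,no
"""


def find_gaps(inventory_csv: str) -> dict[str, list[str]]:
    """Return, for each dataset, the requirement columns that are not marked 'yes'."""
    gaps: dict[str, list[str]] = {}
    for row in csv.DictReader(io.StringIO(inventory_csv)):
        missing = [
            flag for flag in REQUIRED_FLAGS
            if row[flag].strip().lower() != "yes"
        ]
        if missing:
            gaps[row["dataset"]] = missing
    return gaps


if __name__ == "__main__":
    for dataset, missing in find_gaps(SAMPLE_INVENTORY).items():
        print(f"{dataset}: missing {', '.join(missing)}")
```

The resulting list of missing items per dataset is the raw material for the roadmap mentioned above: each gap can be assigned to the responsible agency and tracked until it is closed.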

Looking to the future, they highlight as a challenge the need to gain more technical and executive level experience.

End of the session

The webinar continued with the participation of Compass Gruppe (Germany), which markets, among other things, data from the Austrian commercial register. They have a portal that offers this data via APIs through a freemium business model.  

In addition, it was recalled that Member States are obliged to report to Europe every two years on progress in HVD, an activity that is expected to boost the availability of harmonised federated metadata on the European data portal. The idea is that users will be able to find all HVD in the European Union, using the filtering available on the portal or through SPARQL queries.
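As an illustration of what such a SPARQL query might look like, the Python sketch below only builds the query text and does not send it anywhere. It assumes that HVD are flagged in the portal's metadata with the DCAT-AP properties `dcatap:applicableLegislation` and `dcatap:hvdCategory`, and that the HVD implementing regulation is identified by the ELI shown in the code; these details, like the function name, are assumptions that should be checked against the portal's own documentation before use.

```python
# Assumed identifier of Implementing Regulation (EU) 2023/138 in the metadata.
HVD_LEGISLATION = "http://data.europa.eu/eli/reg_impl/2023/138/oj"


def build_hvd_query(limit: int = 10) -> str:
    """Build a SPARQL query listing datasets flagged as HVD (hypothetical helper)."""
    limit = int(limit)
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    return f"""
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX dcatap: <http://data.europa.eu/r5r/>

SELECT ?dataset ?title ?category WHERE {{
  ?dataset a dcat:Dataset ;
           dcatap:applicableLegislation <{HVD_LEGISLATION}> ;
           dct:title ?title .
  OPTIONAL {{ ?dataset dcatap:hvdCategory ?category . }}
}}
LIMIT {limit}
""".strip()


if __name__ == "__main__":
    # The printed text can be pasted into the portal's SPARQL query interface.
    print(build_hvd_query(limit=5))
```

Adding a filter on `?category` would narrow the results to a single thematic area, such as meteorology or mobility, provided the corresponding category identifier is taken from the official controlled vocabulary.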

The combination of the report's conclusions and the experiences of the rapporteur countries gives us good clues to guide the implementation of HVD, in compliance with European regulations. In summary, the implementation of HVD poses the following challenges:

  • Securing the necessary funding to address the opening-up process.
  • Overcoming technical challenges to develop efficient access APIs.
  • Achieving a proper coexistence between INSPIRE and the HVD regulation.
  • Consolidating working groups that function as a robust mechanism for progress and convergence.
  • Monitoring progress and continuously following up the process.
  • Investing in technical training of staff.
  • Creating and maintaining strong coordination in the face of the complex diversity of data holders.
  • Ensuring the quality of high-value datasets.
  • Agreeing on the standardisation that is necessary from a business point of view.

By addressing these challenges, we will successfully open up high-value data, driving its re-use for the benefit of society as a whole.

You can re-watch the recording of the session here

Blog

Spain, as part of the European Union, is committed to the implementation of the European directives on open data and re-use of public sector information. This includes the adoption of initiatives such as the Implementing Regulation (EU) 2023/138 issued by the European Commission, which defines specific guidelines for government entities with regard to the availability of high-value datasets (HVD). These data are grouped into six thematic categories, covered in earlier articles: Geospatial, Earth Observation and Environment, Meteorology, Statistics, Companies and Company Ownership, and Mobility. In this article we will focus on the last of these.

The Mobility category encompasses data collections falling under the domain of "Transport Networks", as demarcated in Annex I of the Directive 2007/2/EC of the European Parliament and of the Council of 14 March 2007 establishing an Infrastructure for Spatial Information in the European Community (INSPIRE). In particular, this Directive refers to the requirement to make available to users datasets relating to road, rail, air and inland waterway networks, with their associated infrastructure, connections between different networks and the trans-European transport network, as defined by Decision No 1692/96/EC of the European Parliament and of the Council of 23 July 1996 on Community guidelines for the development of the trans-European transport network.

In addition, it includes the datasets as described in the Directive 2005/44/EC of the European Parliament and of the Council of 7 September 2005 on harmonised River Information Services (RIS) on inland waterways in the Community. The main objective of the Directive is to improve inland waterway traffic and transport, and it applies to canals, rivers, lakes and ports capable of accommodating vessels of between 1,000 and 1,500 tonnes. These datasets include:

Data type Inland waterways datasets
Static data
  • Fairway characteristics
  • Long-time obstructions in the fairway and reliability
  • Rates of waterway infrastructure charges
  • Other physical limitations on waterways
  • Regular lock and bridge operating times
  • Location and characteristics of ports and transhipment sites
  • List of navigation aids and traffic signs
  • Navigation rules and recommendations
Dynamic data
  • Water depths contours in the navigation channel
  • Temporary obstructions in the fairway
  • Present and future water levels at gauges
  • State of the rivers, canals, locks and bridges
  • Restrictions caused by flood and ice
  • Short term changes of lock and bridge operating times
  • Short term changes of aids to navigation
Inland electronic and navigational charts (Inland ENC according to the Inland ECDIS Standard)
  • Waterway axis with kilometres indication
  • Links to the external xml-files with operation times of restricting structures
  • Location of ports and transhipment sites
  • Reference data for water level gauges relevant to navigation
  • Bank of waterway at mean water level
  • Shoreline construction
  • Contours of locks and dams
  • Boundaries of the fairway/navigation channel
  • Isolated dangers in the fairway/navigation channel under and above water
  • Official aids-to-navigation (e.g. buoys, beacons, lights, notice marks)

Figure 1: Table with the high value datasets related to Directive 2005/44/EC for the creation of a trans-European river information network.

In order for all of us to make the most of the information available, the Regulation defines some basic rules on how this data is shared:

  • Free and easy to use. The data must be ready to be used and shared with everyone for any purpose by acknowledging and citing the source of the data, as prescribed by the Creative Commons BY 4.0 licence.
  • Easy to read and use. Data will be presented in a way that both people and computers can easily understand, and the documentation explaining it will be publicly available.
  • Direct and easy access. There will be programming interfaces (APIs) that allow programs to access data automatically. Alternatively, the user can download large amounts of information at once (bulk download).
  • Always up to date. It is important that data is up to date, so there will be access to the most recent version. But if the user needs to access previous data, it will also be possible to view previous versions.
  • Detailed and precise. Data will be shared in as much detail as possible, to a very fine level of accuracy, so that the whole territory is covered when combined.
  • Information on information. There will be "information about the information" (metadata) that will tell everything about the data. The metadata shall contain at least the elements listed in the Annex to Commission Regulation (EC) No 1205/2008 of 3 December 2008.
  • Understandable and orderly. There will be a clear explanation of how the data are organised and what everything means, in a way that is easy for everyone to understand (structure and semantics).
  • Common language. Data shall use vocabularies, code lists and categories that are recognised and accepted at European or global level.

In Spain, who is responsible for the creation and maintenance of mobility data?

In Spain, the responsibility for the creation and maintenance of mobility data generally lies with different governmental entities, depending on the type of mobility and the territorial scope:

  • National level. The Ministry of Transport and Sustainable Mobility is the main body in charge of mobility in terms of infrastructure and transport at national level. This would include data on roads, railways, air and maritime transport.
  • Regional and local level. Autonomous communities and municipalities also play an important role in urban and regional mobility. They are responsible for urban mobility, public transport and public roads, within their respective jurisdictions.
  • Public business entities. There are entities such as ADIF (Administrador de Infraestructuras Ferroviarias, the Railway Infrastructure Administrator), AENA (Aeropuertos Españoles y Navegación Aérea, Spanish Airports and Air Navigation), Puertos del Estado (State Ports) and other entities that manage specific data related to their field of action in rail, air and maritime transport, respectively.

In Spain, the Ministry of Transport and Sustainable Mobility, in collaboration with the autonomous communities, plays a key role in providing access to a wide range of mobility data. In compliance with INSPIRE and LISIGE (Law 14/2010 of 5 July 2010 on geographic information infrastructures and services in Spain, which transposes the INSPIRE Directive), it offers resources such as the Geoportal of the Spatial Data Infrastructure of Spain (IDEE, by its Spanish acronym), where citizens and professionals can access geographic data and services, especially with regard to mobility.

Does Spain comply with the HVD Mobility Regulation?

To answer this question, we have to go to the INSPIRE Geoportal, where official information classified as high-value datasets in Europe is available, and specifically to the mobility category.

Figure 2: Screenshot of the INSPIRE Geoportal showing high-value mobility data.

As of April 2024 Spain has published the following information in the INSPIRE Geoportal:

  • Port service areas in Spain. The port service areas include the cartographic and alphanumeric information of the land service area and water areas I and II. The Spanish State-owned Port System is made up of 46 ports of general interest, managed by 28 Port Authorities.
  • Spanish Transport Networks. The Transport Network of the Geographic Reference Information of the National Cartographic System of Spain is a three-dimensional network of national coverage, defined and published in accordance with the INSPIRE Directive, which covers five modes of transport: road, rail, inland waterways, air and cable, together with their respective intermodal connections and the infrastructures associated with each mode. This information includes the linear geometry of the roads and the point geometry of building entrances and kilometre points.
  • ADIF's Spanish Rail Transport Network. Public geographic dataset resulting from the adaptation of ADIF's Common Segmentation (Tramificación Común) of the Spanish rail network to the INSPIRE regulations (Transport Networks, Annex I).

The publication of these high-value datasets responds positively to the question of Spain's compliance with the HVD regulation, and is an achievement that reflects Spain's continued commitment to transparency and access to mobility data.

The joint effort between the Ministry of Transport and Sustainable Mobility, the National Cartographic System, the Autonomous Communities and the Public Business Entities underlines the importance of a collaborative approach to mobility information management.

The availability of this data highlights Spain's commitment to publishing high-value datasets and underlines the importance of continuously improving access to information on mobility, including inland navigation.


Content prepared by Mayte Toscano, Senior Consultant in Data Economy Technologies. The contents and points of view reflected in this publication are the sole responsibility of its author.
