News

On 19 November, the European Commission presented the Data Union Strategy, a roadmap that seeks to consolidate a robust, secure and competitive European data ecosystem. The strategy is built around three key pillars: expanding access to quality data for artificial intelligence and innovation, simplifying the existing regulatory framework, and protecting European digital sovereignty. In this post, we explain each of these pillars in detail, as well as the plan's implementation timeline over the next two years.

Pillar 1: Expanding access to quality data for AI and innovation

The first pillar of the strategy focuses on ensuring that companies, researchers and public administrations have access to high-quality data that enables the development of innovative applications, especially in the field of artificial intelligence. To this end, the Commission proposes a number of interconnected initiatives, ranging from the creation of infrastructure to the development of standards and technical enablers. This pillar comprises a series of actions: the expansion of common European data spaces, the development of Data Labs, the promotion of the Cloud and AI Development Act, the expansion of strategic data assets, and the development of enablers to implement these measures.

1.1 Extension of the Common European Data Spaces

Common European Data Spaces are one of the central elements of this strategy:

  • Planned investment: 100 million euros for their deployment.

  • Priority sectors: health, mobility, energy, public administration (including legal data) and environment.

  • Interoperability: the SIMPL middleware will ensure interoperability between data spaces, with the support of the Data Spaces Support Centre (DSSC).

  • Key Applications:

    • European Health Data Space (EHDS): Special mention for its role as a bridge between health data systems and the development of AI.

    • New Defence Data Space: for the development of state-of-the-art systems, coordinated by the European Defence Agency.

1.2 Data Labs: the new ecosystem for connecting data and AI development

The strategy proposes to use Data Labs as points of connection between the development of artificial intelligence and European data.

These labs rely on data pooling: a process of combining and sharing public and restricted data from multiple sources in a centralized repository or shared environment, which facilitates access to and use of information (a minimal sketch of this idea follows the list below). Specifically, the services offered by Data Labs are:

  • Easy access to data.

  • Technical infrastructure and tools.

  • Data pooling.

  • Data filtering and labeling.

  • Regulatory guidance and training.

  • Bridging the gap between data spaces and AI ecosystems.
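
By way of illustration, here is a toy sketch of the data pooling idea mentioned above. The data holders, field names and figures are all invented for the example; real pooling involves harmonised schemas, access controls and legal agreements.

```python
import pandas as pd

# Hypothetical contributions from three data holders in an energy data space.
utility_a = pd.DataFrame({"meter_id": [1, 2], "kwh": [320, 415], "source": "utility_a"})
utility_b = pd.DataFrame({"meter_id": [3], "kwh": [280], "source": "utility_b"})
public_portal = pd.DataFrame({"meter_id": [4], "kwh": [502], "source": "public_portal"})

# Pooling: once schemas are harmonised, the contributions are combined into
# a single shared repository that analyses (or AI training) can draw on.
pool = pd.concat([utility_a, utility_b, public_portal], ignore_index=True)
print(pool)
print("Average consumption across the pool:", pool["kwh"].mean(), "kWh")
```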

Implementation plan:

  • First phase: the first Data Labs will be established within the framework of AI Factories (AI gigafactories), offering data services to connect AI development with European data spaces.

  • Sectoral Data Labs: will be established independently in other areas to cover specific needs, for example, in the energy sector.

  • Self-sustaining model: it is envisaged that the Data Labs model can be deployed commercially, becoming a self-sustaining ecosystem that connects data and AI.

1.3 Cloud and AI Development Act: boosting the sovereign cloud

To promote cloud technology, the Commission will propose this new regulation in the first quarter of 2026. There is currently an open public consultation in which you can participate here.

1.4 Strategic data assets: public sector, scientific, cultural and linguistic resources

On the one hand, in 2026 it will be proposed to expand the list of high-value datasets (HVDs) to include legal, judicial and administrative data, among others. On the other hand, the Commission will map existing databases and finance new digital infrastructure.

1.5 Horizontal enablers: synthetic data, data pooling, and standards

The European Commission will develop guidelines and standards on synthetic data, and advanced R&D in techniques for its generation will be funded through Horizon Europe.
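
As a purely illustrative sketch of what synthetic data generation involves (column names and distributions are invented; this is not an EU-endorsed method), one can fit simple statistics to real records and then sample new, artificial ones:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical "real" dataset of 1,000 individuals.
real = pd.DataFrame({
    "age": rng.normal(40, 12, 1000).clip(18, 90).round(),
    "income_eur": rng.lognormal(10, 0.5, 1000).round(2),
})

# Fit simple per-column statistics to the real data...
age_mu, age_sd = real["age"].mean(), real["age"].std()
log_income = np.log(real["income_eur"])
inc_mu, inc_sd = log_income.mean(), log_income.std()

# ...and sample synthetic records that mimic those statistics
# without reproducing any real individual's row.
synthetic = pd.DataFrame({
    "age": rng.normal(age_mu, age_sd, 1000).clip(18, 90).round(),
    "income_eur": rng.lognormal(inc_mu, inc_sd, 1000).round(2),
})
print(synthetic.head())
```

Note that sampling columns independently, as above, destroys correlations between variables; preserving such joint structure while protecting privacy is precisely the kind of quality question the planned guidelines and Horizon Europe R&D are meant to address.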

Another issue that the EU wants to promote is data pooling, as we explained above. Sharing data from early stages of the production cycle can generate collective benefits, but barriers persist due to legal uncertainty and fear of violating competition rules. Its purpose? Make data pooling a reliable and legally secure option to accelerate progress in critical sectors.

Finally, in terms of standardisation, the European standardisation organisations (CEN/CENELEC) will be asked to develop new technical standards in two key areas: data quality and labelling. These standards will make it possible to establish common criteria on how data should be to ensure its reliability and how it should be labelled to facilitate its identification and use in different contexts.

Pillar 2: Regulatory simplification

The second pillar addresses one of the challenges most highlighted by companies and organisations: the complexity of the European regulatory framework on data. The strategy proposes a series of measures aimed at simplifying and consolidating existing legislation.

2.1 Repeals and regulatory consolidation: towards a more coherent framework

The aim is to eliminate regulations whose functions are already covered by more recent legislation, thus avoiding duplication and contradictions. Firstly, the Free Flow of Non-Personal Data Regulation (FFoNPD) will be repealed, as its functions are now covered by the Data Act. However, the prohibition of unjustified data localisation, a fundamental principle for the Digital Single Market, will be explicitly preserved.

Similarly, the Data Governance Act (DGA) will be repealed as a stand-alone rule, with its essential provisions migrating to the Data Act. This move simplifies the regulatory framework and also eases the administrative burden: obligations for data intermediaries will become lighter and, in part, voluntary.

As for the public sector, the strategy proposes an important consolidation. The rules on public data sharing, currently dispersed between the DGA and the Open Data Directive, will be merged into a single chapter within the Data Act. This unification will facilitate both the application and the understanding of the legal framework by public administrations.

2.2 Cookie reform: balancing protection and usability

Another relevant change concerns the regulation of cookies, which will undergo a significant modernization and be integrated into the framework of the General Data Protection Regulation (GDPR). The reform seeks a balance: on the one hand, low-risk uses that currently generate legal uncertainty will be legalized; on the other, consent banners will be simplified through "one-click" systems. The goal is clear: to reduce the so-called "user fatigue" caused by the repetitive consent requests we all know from browsing the Internet.

2.3 Adjustments to the GDPR to facilitate AI development

The General Data Protection Regulation will also be subject to a targeted reform, specifically designed to release data responsibly for the benefit of the development of artificial intelligence. This surgical intervention addresses three specific aspects:

  1. It clarifies when legitimate interest for AI model training may apply.

  2. It defines more precisely the distinction between anonymised and pseudonymised data, especially in relation to the risk of re-identification (see the sketch after this list).

  3. It harmonises data protection impact assessments, facilitating their consistent application across the Union.
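
To illustrate the second point, the sketch below shows why pseudonymised data is not anonymous: a keyed hash replaces the direct identifier, but anyone holding the key (or a lookup table) can re-identify the individual. The identifier, diagnosis and key are all hypothetical.

```python
import hmac
import hashlib

SECRET_KEY = b"hypothetical-key-held-by-the-controller"

def pseudonymise(identifier: str) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()

record = {"national_id": "12345678Z", "diagnosis": "asthma"}
pseudonymised = {"pseudonym": pseudonymise(record["national_id"]),
                 "diagnosis": record["diagnosis"]}

# The data is pseudonymised, not anonymised: whoever holds SECRET_KEY
# can recompute the hash for a known ID and re-identify the person.
assert pseudonymised["pseudonym"] == pseudonymise("12345678Z")
```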

2.4 Implementation and Support for the Data Act

The recently approved Data Act will be subject to adjustments to improve its application. On the one hand, the scope of business-to-government (B2G) data sharing is refined, strictly limiting it to emergency situations. On the other hand, the umbrella of protection is extended: the favourable conditions currently enjoyed by small and medium-sized enterprises (SMEs) will also be extended to small mid-caps, companies with between 250 and 749 employees.

To facilitate the practical implementation of the standard, a model contractual clause for data exchange has already been published, providing a template that organizations can use directly. In addition, two further guides will be published during the first quarter of 2026: one on the concept of "reasonable compensation" in data exchanges, and another aimed at clarifying key definitions of the Data Act that may raise interpretative doubts.

Aware that SMEs may struggle to navigate this new legal framework, a Legal Helpdesk will be set up in the fourth quarter of 2025. This helpdesk will provide direct advice on the implementation of the Data Act, giving priority precisely to small and medium-sized enterprises that lack specialised legal departments.

2.5 Evolving governance: towards a more coordinated ecosystem

The governance architecture of the European data ecosystem is also undergoing significant changes. The European Data Innovation Board (EDIB) will evolve from a primarily advisory body into a forum for more technical and strategic discussions, bringing together both Member States and industry representatives. To this end, its founding provisions will be amended with two objectives: to allow the inclusion of the competent authorities in debates on the Data Act, and to give the European Commission greater flexibility in the composition and operation of the body.

In addition, two further feedback and anticipation mechanisms are put in place. The Apply AI Alliance will channel sectoral feedback, collecting the specific experiences and needs of each industry. For its part, the AI Observatory will act as a trend radar, identifying emerging developments in the field of artificial intelligence and translating them into public policy recommendations. In this way, a virtuous circle is closed in which policy is continuously informed by realities on the ground.

Pillar 3: Protecting European data sovereignty

The third pillar focuses on ensuring that European data is treated fairly and securely, both inside and outside the Union's borders. The intention is that data will only be shared with countries with the same regulatory vision.

3.1 Specific measures to protect European data

  • Publication of guides to assess the fair treatment of EU data abroad (Q2 2026).

  • Publication of the Unfair Practices Toolbox (Q2 2026), covering:

    • Unjustified data localisation.

    • Exclusion.

    • Weak safeguards.

    • Data leakage.

  • Taking measures to protect sensitive non-personal data.

All these measures are planned to be implemented from the last quarter of 2025 and throughout 2026 in a progressive deployment that will allow a gradual and coordinated adoption of the different measures, as established in the Data Union Strategy.

In short, the Data Union Strategy represents a comprehensive effort to consolidate European leadership in the data economy. To this end, data pooling and data spaces across the Member States will be promoted, Data Labs and AI gigafactories will be backed, and regulatory simplification will be pursued.

Event

The 17th International Conference on the Reuse of Public Sector Information will be held on December 3 in Madrid. The Multisectoral Information Association (ASEDIE) organizes this event every year, and this edition will take place at the Ministry for Digital Transformation and Public Function. Under the slogan "When the standard is not enough: inequality in the application of data regulations", it will address the current challenges around the reuse of public information and the need for agile and effective regulatory frameworks.

Regulatory complexity, a challenge to be addressed

This event brings together national and European experts to address the reuse of data as a driver of innovation. Specifically, this year's edition focuses on the need to advance towards regulation that promotes a culture of openness in all administrations, avoiding regulatory fragmentation and ensuring that access to public information translates into true economic and social value.

Through various presentations and round tables, some of the great current challenges in this area will be addressed: from regulatory simplification to facilitate the reuse of information to open government as a real practice.

The program of the Conference

The event will offer a comprehensive vision of how to move towards a fairer, more open and competitive information ecosystem.

The reception for attendees will take place between 09:00 and 09:30. The event will then begin with the welcome and inauguration by Ruth del Campo, general director of data (Secretary of State for Digitalization and Artificial Intelligence). It will be followed by two presentations from the Permanent Representation of Spain to the European Union, by Miguel Valle del Olmo, Counsellor for Digital Transformation, and Almudena Darias de las Heras, Counsellor for Justice.

Throughout the day there will be three round tables:

  • 10:15 – 10:45. Roundtable I: Regulatory simplification and legal certainty: pillars for an agile and efficient framework. Moderated by Ignacio Jiménez, president of ASEDIE, it will feature the participation of Ruth del Campo and Meritxell Borràs i Solé, director of the Catalan Data Protection Authority.
  • 10:45 – 11:45. Roundtable II: Transparency and open government: from theory to practice. Four participants will share their vision and experience in the field: Carmen Cabanillas, Director General of Public Governance (Secretary of State for Public Administration), José Luis Rodríguez Álvarez, President of the Council of Transparency and Good Governance, José Máximo López Vilaboa, Director General of Transparency and Good Governance (Junta de Castilla y León), and Ángela Pérez Brunete, Director General of Transparency and Quality (Madrid City Council). The conversation will be moderated by Manuel Hurtado, member of the Board of Directors of ASEDIE.
  • 12:35 – 13:35. Roundtable III: Open and transparent registries: preventing money laundering without slowing down competitiveness. Under the moderation of Valentín Arce, vice-president of ASEDIE, the conversation will feature Antonio Fuentes Paniagua, deputy director general of Notaries and Registries (Ministry of the Presidency, Justice and Relations with the Cortes), Andrés Martínez Calvo, consultant of the Centralised Prevention Body (General Council of Notaries), Carlos Balmisa, technical general secretary of the Association of Property and Commercial Registrars, and José Luis Perea, general secretary of ATA Autónomos.

During the morning, the following will also be presented:

  • The UNE 0080 certification (Guide to the evaluation of Data Governance, Data Management and Data Quality Management). This specification develops a homogeneous framework for assessing an organization's maturity with respect to data management. Find out more about the UNE specifications related to data in this article.
  • The ASEDIE 2025 Award. This international award recognizes each year individuals, companies or institutions that stand out for their contribution to the innovation and development of the infomediary sector. Its aim is to make visible projects that promote the reuse of public sector information (RISP), highlighting their role in the development of both the Spanish and global economy. You can see the winners of previous editions here.

The event will end at 13:45, with a few words from Ignacio Jiménez.

You can see the detailed program on the ASEDIE website.

How to Attend

The 17th ASEDIE Conference is an essential event for those working in the field of information reuse, transparency and data-based innovation. 

This year's event can only be attended in person at the Ministry for Digital Transformation and Public Function (c/ Mármol, 2, Parque Empresarial Rio 55, 28005, Madrid). It is necessary to register through their website.

Blog

The convergence between open data, artificial intelligence and environmental sustainability poses one of the main challenges for the digital transformation model that is being promoted at European level. This interaction is mainly materialized in three outstanding manifestations:

  • The opening of high-value data directly related to sustainability, which can help the development of artificial intelligence solutions aimed at climate change mitigation and resource efficiency.

  • The promotion of so-called green algorithms to reduce the environmental impact of AI, which must materialise both in the efficient use of digital infrastructure and in sustainable decision-making.

  • The commitment to environmental data spaces, generating digital ecosystems where data from different sources is shared to facilitate the development of interoperable projects and solutions with a relevant impact from an environmental perspective.

Below, we will delve into each of these points.

High-value data for sustainability

Directive (EU) 2019/1024 on open data and the re-use of public sector information introduced for the first time the concept of high-value datasets, defined as those with exceptional potential to generate social, economic and environmental benefits. These datasets must be published free of charge, in machine-readable formats, via application programming interfaces (APIs) and, where appropriate, be available for bulk download. A number of priority categories have been identified for this purpose, including environmental and Earth observation data.

This is a particularly relevant category, as it covers both data on climate, ecosystems and environmental quality, and data linked to the INSPIRE Directive, which spans areas as diverse as hydrography, protected sites, energy resources, land use, mineral resources and natural hazard zones, including orthoimagery.

These data are particularly relevant for monitoring variables related to climate change, such as land use, biodiversity management (taking into account the distribution of species, habitats and protected sites), the monitoring of invasive species and the assessment of natural risks. Data on air quality and pollution are crucial for public and environmental health, so access to them allows exhaustive analyses to be carried out, which are undoubtedly relevant for the adoption of public policies aimed at improving them. The management of water resources can also be optimized through hydrography data and environmental monitoring, so their large-scale, automated processing is an essential premise for facing the challenge of digitalizing water cycle management.

Combining these datasets with other quality environmental data facilitates the development of AI solutions geared towards specific climate challenges. Specifically, they allow predictive models to be trained to anticipate extreme phenomena (heat waves, droughts, floods), optimize the management of natural resources or monitor critical environmental indicators in real time. They also make it possible to promote high-impact economic projects, such as the use of AI algorithms in precision agriculture, enabling the intelligent adjustment of irrigation systems, the early detection of pests and the optimization of fertilizer use.
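
As noted above, high-value datasets must be accessible free of charge through APIs and bulk downloads. As a practical illustration, the snippet below sketches how such a dataset might be retrieved programmatically; the endpoint URL and response fields are hypothetical placeholders, not a real portal's API.

```python
import requests

# Hypothetical REST endpoint of an open data portal serving an
# environmental high-value dataset (air quality measurements).
API_URL = "https://open-data-portal.example.eu/api/datasets/air-quality"

response = requests.get(API_URL, params={"format": "json", "limit": 100}, timeout=30)
response.raise_for_status()
records = response.json()

# Each record might contain, e.g., station id, pollutant, value, timestamp.
for row in records[:5]:
    print(row)
```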

Green algorithms and digital responsibility: towards sustainable AI

Training and deploying AI systems, particularly general-purpose models and large language models, involves significant energy consumption. According to estimates by the International Energy Agency, data centers accounted for around 1.5% of global electricity consumption in 2024. This represents a growth of around 12% per year since 2017, more than four times faster than the rate of total electricity consumption. Data center power consumption is expected to double to around 945 TWh by 2030.
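
As a quick back-of-the-envelope check of our own (not part of the IEA figures), these numbers are mutually consistent: doubling to roughly 945 TWh by 2030 implies a 2024 level of about 470 TWh, and sustained ~12% annual growth over six years does indeed roughly double consumption.

```python
# Implied 2024 level if consumption doubles to ~945 TWh by 2030.
consumption_2030_twh = 945
consumption_2024_twh = consumption_2030_twh / 2  # ~472 TWh

# Sustained 12% annual growth over 2024-2030 (six years) roughly doubles it.
growth_factor = 1.12 ** 6
print(f"Implied 2024 consumption: ~{consumption_2024_twh:.0f} TWh")
print(f"1.12^6 = {growth_factor:.2f} (close to a doubling)")
```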

Against this backdrop, green algorithms are an alternative that must necessarily be taken into account when it comes to minimising the environmental impact posed by the implementation of digital technology and, specifically, AI. In fact, both the European Data Strategy and the European Green Deal explicitly integrate digital sustainability as a strategic pillar. For its part, Spain has launched a National Green Algorithm Programme, framed in the 2026 Digital Agenda and with a specific measure in the National Artificial Intelligence Strategy.

One of the main objectives of the Programme is to promote the development of algorithms that minimise their environmental impact from conception (green by design), so the requirement of exhaustive documentation of the datasets used to train AI models – including origin, processing, conditions of use and environmental footprint – is essential to fulfil this aspiration. In this regard, the Commission has published a template to help general-purpose AI providers summarise the data used to train their models, enabling greater transparency, which would also facilitate traceability and responsible governance from an environmental perspective, as well as the performance of eco-audits.

The European Green Deal Data Space

The European Green Deal Data Space is one of the common European data spaces contemplated in the European Data Strategy that is at a more advanced stage, as demonstrated by the numerous initiatives and dissemination events promoted around it. Traditionally, access to environmental information has been one of the areas with the most favourable regulation, so the promotion of high-value data and the firm commitment to creating a European data space in this area represent a remarkable qualitative advance that reinforces an already consolidated trend.

Specifically, the data spaces model facilitates interoperability between public and private open data, reducing barriers to entry for startups and SMEs in sectors such as smart forest management, precision agriculture or, among many other examples, energy optimization. At the same time, it reinforces the quality of the data available to Public Administrations for carrying out their public policies, since their own sources can be contrasted and compared with other datasets. Finally, shared access to data and AI tools can foster collaborative innovation initiatives and projects, accelerating the development of interoperable and scalable solutions.

However, the legal ecosystem of data spaces entails a complexity inherent in its own institutional configuration, since it brings together several subjects and, therefore, various interests and applicable legal regimes: 

  • On the one hand, public entities, which have a particularly reinforced leadership role in this area.

  • On the other hand, private entities and citizens, who can not only contribute their own datasets, but also offer digital developments and tools that add value to data through innovative services.

  • And, finally, the providers of the infrastructure necessary for interaction within the space.

Consequently, advanced governance models are essential to deal with this complexity, reinforced by technological innovation and especially AI, since the traditional approaches of legislation regulating access to environmental information are certainly limited for this purpose.

Towards strategic convergence

The convergence of high-value open data, responsible green algorithms and environmental data spaces is shaping a new digital paradigm, essential for addressing Europe's climate and ecological challenges, that requires a robust yet flexible legal approach. This unique ecosystem not only allows innovation and efficiency to be promoted in key sectors such as precision agriculture or energy management, but also reinforces the transparency and quality of the environmental information available for the formulation of more effective public policies.

Beyond the current regulatory framework, it is essential to design governance models that help to interpret and apply diverse legal regimes in a coherent manner, that protect data sovereignty and, ultimately, guarantee transparency and responsibility in the access and reuse of environmental information. From the perspective of sustainable public procurement, it is essential to promote procurement processes by public entities that prioritise technological solutions and interoperable services based on open data and green algorithms, encouraging the choice of suppliers committed to environmental responsibility and transparency in the carbon footprints of their digital products and services.

Only on the basis of this approach can we aspire to make digital innovation technologically advanced and environmentally sustainable, thus aligning the objectives of the Green Deal, the European Data Strategy and the European approach to AI.

Content prepared by Julián Valero, professor at the University of Murcia and coordinator of the Innovation, Law and Technology Research Group (iDerTec). The content and views expressed in this publication are the sole responsibility of the author.

 

Blog

Artificial Intelligence (AI) is transforming society, the economy and public services at an unprecedented speed. This revolution brings enormous opportunities, but also challenges related to ethics, security and the protection of fundamental rights. Aware of this, the European Union approved the Artificial Intelligence Act (AI Act), in force since August 1, 2024, which establishes a harmonized and pioneering framework for the development, commercialization and use of AI systems in the single market, fostering innovation while protecting citizens.

A particularly relevant area of this regulation is general-purpose AI models (GPAI), such as large language models (LLMs) or multimodal models, which are trained on huge volumes of data from a wide variety of sources (text, images and video, audio and even user-generated data). This reality poses critical challenges in intellectual property, data protection and transparency on the origin and processing of information.

To address them, the European Commission, through the European AI Office, has published the Template for the Public Summary of Training Content for general-purpose AI models: a standardized format that providers will be required to complete and publish to summarize key information about the data used in training. From 2 August 2025, any general-purpose model placed on the market or distributed in the EU must be accompanied by this summary; models already on the market have until 2 August 2027 to adapt. This measure materializes the AI Act's principle of transparency and aims to shed light on the "black boxes" of AI.

In this article, we explain the keys to this template: from its objectives and structure to information on deadlines, penalties and next steps.

Objectives and relevance of the template

General-purpose AI models are trained on data from a wide variety of sources and modalities, such as:

  • Text: books, scientific articles, press, social networks.

  • Images and videos: digital content from the Internet and visual collections.

  • Audio: recordings, podcasts, radio programs, or conversations.

  • User data: information generated in interaction with the model itself or with other services of the provider.

This process of mass data collection is often opaque, raising concerns among rights holders, users, regulators, and society as a whole. Without transparency, it is difficult to assess whether data has been obtained lawfully, whether it includes unauthorised personal information or whether it adequately represents the cultural and linguistic diversity of the European Union.

Recital 107 of the AI Act states that the main objective of this template is to increase transparency and facilitate the exercise and protection of rights. Among the benefits it provides, the following stand out:

  1. Intellectual property protection: allows authors, publishers and other rights holders to identify if their works have been used during training, facilitating the defense of their rights and a fair use of their content.

  2. Privacy safeguard: helps detect whether personal data has been used, providing useful information so that affected individuals can exercise their rights under the General Data Protection Regulation (GDPR) and other regulations in the same field.

  3. Prevention of bias and discrimination: provides information on the linguistic and cultural diversity of the sources used, key to assessing and mitigating biases that may lead to discrimination.

  4. Fostering competition and research: reduces "black box" effects and facilitates academic scrutiny, while helping other companies better understand where data comes from, favoring more open and competitive markets.

In short, this template is not only a legal requirement, but a tool to build trust in artificial intelligence, creating an ecosystem in which technological innovation and the protection of rights are mutually reinforcing.

Template structure

The template, officially published on 24 July 2025 after a public consultation with more than 430 participating organisations, has been designed so that the information is presented in a clear, homogeneous and understandable way, both for specialists and for the public.

It consists of three main sections, ranging from basic model identification to legal aspects related to data processing.

1. General information

It provides a global view of the provider, the model, and the general characteristics of the training data:

  • Identification of the supplier, such as name and contact details.

  • Identification of the model and its versions, including dependencies if it is a modification (fine-tuning) of another model.

  • Date of placing the model on the market in the EU.

  • Data modalities used (text, image, audio, video, or others).

  • Approximate size of data by modality, expressed in wide ranges (e.g., less than 1 billion tokens, between 1 billion and 10 trillion, more than 10 trillion).

  • Language coverage, with special attention to the official languages of the European Union.

This section provides a level of detail sufficient to understand the extent and nature of the training, without revealing trade secrets.

2. List of data sources

It is the core of the template, where the origin of the training data is detailed. It is organized into six main categories, plus a residual category (other).

  1. Public datasets:

    • Data that is freely available and downloadable as a whole or in blocks (e.g., open data portals, Common Crawl, scholarly repositories).

    • "Large" sets must be identified, defined as those that represent more than 3% of the total public data used in a specific modality.

  2. Licensed private datasets:

    • Data obtained through commercial agreements with rights holders or their representatives, such as licenses with publishers for the use of digital books.

    • Only a general description needs to be provided.

  3. Other unlicensed private data:

    • Databases acquired from third parties that do not directly manage copyright.

    • If they are publicly known, they must be listed; otherwise, a general description (data type, nature, languages) is sufficient.

  4. Data obtained through web crawling/scraping:

    • Information collected by or on behalf of the supplier using automated tools.

    • It must be specified:

      • Name/identifier of the crawlers.

      • Purpose and behavior (respect for robots.txt, captchas, paywalls, etc.).

      • Collection period.

      • Types of websites (media, social networks, blogs, public portals, etc.).

      • List of most relevant domains, covering at least the top 10% by volume. For SMEs, this requirement is adjusted to 5% or a maximum of 1,000 domains, whichever is less (see the sketch after this list).

  5. User data:

    • Information generated through interaction with the model or with other provider services.

    • It must indicate which services contribute and the modality of the data (text, image, audio, etc.).

  6. Synthetic data:

    • Data created by or for the supplier using other AI models (e.g., model distillation or reinforcement learning from human feedback, RLHF).

    • Where appropriate, the generator model should be identified if it is available in the market.

Additional category – Other: includes data that does not fit into the above categories, such as offline sources, self-digitization, manual tagging, or human generation.
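
To make the quantitative rules above concrete (the 3% threshold for "large" public datasets and the top-10% domain listing rule, reduced to 5% or at most 1,000 domains for SMEs), here is a minimal sketch with invented dataset names, volumes and domains:

```python
# Invented public datasets and their volumes in a text modality (billions of tokens).
public_datasets = {"CorpusA": 40.0, "CorpusB": 2.5, "CorpusC": 1.0}
total = sum(public_datasets.values())

# "Large" public datasets (>3% of all public data in the modality)
# must be individually identified in the summary.
large = [name for name, size in public_datasets.items() if size / total > 0.03]
print("Datasets to identify individually:", large)

# Crawled domains ranked by volume; the most relevant ones must be listed.
domains = sorted({"wiki.example.org": 300, "news.example.eu": 120,
                  "blog.example.org": 30, "forum.example.net": 10}.items(),
                 key=lambda kv: kv[1], reverse=True)

def domains_to_list(ranked, sme=False):
    share = 0.05 if sme else 0.10          # top 10%, or 5% for SMEs
    n = max(1, round(len(ranked) * share))
    if sme:
        n = min(n, 1000)                   # SME cap: at most 1,000 domains
    return [d for d, _ in ranked[:n]]

print("Standard provider lists:", domains_to_list(domains))
print("SME lists:", domains_to_list(domains, sme=True))
```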

3. Aspects of data processing

It focuses on how data has been handled before and during training, with a particular focus on legal compliance:

  • Respect for Text and Data Mining (TDM): measures taken to honour the right of exclusion provided for in Article 4(3) of Directive 2019/790 on copyright, which allows rightholders to prevent text and data mining. This right is exercised through opt-out protocols, such as tags in files or configurations in robots.txt, which indicate that certain content cannot be used to train models (a minimal sketch follows this list). Vendors should explain how they have identified and respected these opt-outs in their own datasets and in those purchased from third parties.

  • Removal of illegal content: procedures used to prevent or remove content that is illegal under EU law, such as child sexual abuse material, terrorist content or serious intellectual property infringements. These mechanisms may include blacklisting, automatic classifiers, or human review, but without revealing trade secrets.
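
To make the opt-out mechanism tangible, the sketch below uses Python's standard urllib.robotparser to check whether a (hypothetical) robots.txt blocks an AI crawler. GPTBot is one real example of a crawler user agent; note that Article 4(3) reservations can also be expressed by other means, such as metadata tags.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content expressing a TDM opt-out for one AI crawler.
robots_txt = """
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
""".strip().splitlines()

parser = RobotFileParser()
parser.parse(robots_txt)

url = "https://example.org/articles/some-page.html"
print("GPTBot may fetch:", parser.can_fetch("GPTBot", url))            # False
print("Other crawlers may fetch:", parser.can_fetch("SomeOtherBot", url))  # True
```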

The following diagram summarizes these three sections:

[Diagram] Template for the Public Summary of Training Content: essential information to be disclosed about the data used to train general-purpose AI models marketed in the European Union. (1) General information: identification of the supplier; identification of the model and its versions; date of placing on the EU market; data modalities used (text, image, audio, video, or others); approximate size of data per modality; language coverage. (2) List of data sources: public datasets; licensed private datasets; other unlicensed private data; data obtained through web crawling/scraping; user data; synthetic data; other (e.g., offline sources). (3) Aspects of data processing: respect for reserved rights (Text and Data Mining, TDM); removal of illegal content. Source: Template for the Public Summary of Training Content, European Commission (July 2025).

Balancing transparency and trade secrets

The European Commission has designed the template seeking a delicate balance: offering sufficient information to protect rights and promote transparency, without forcing the disclosure of information that could compromise the competitiveness of suppliers.

  • Public sources: the highest level of detail is required, including names and links to "large" datasets.

  • Private sources: a more limited level of detail is allowed, through general descriptions when the information is not public.

  • Web scraping: a summary list of domains is required, without the need to detail exact combinations.

  • User and synthetic data: the information is limited to confirming its use and describing the modality.

Thanks to this approach, the summary is "generally complete" in scope, but not "technically detailed", protecting both transparency and the intellectual and commercial property of companies.

Compliance, deadlines and penalties

Article 53 of the AI Act details the obligations of general-purpose model providers, most notably the publication of this summary of training data.

This obligation is complemented by other measures, such as:

  • Have a public copyright policy.

  • Implement risk assessment and mitigation processes, especially for models that may generate systemic risks.

  • Establish mechanisms for traceability and supervision of data and training processes.

Non-compliance can lead to significant fines of up to €15 million or 3% of the company's annual global turnover, whichever is higher.

Next Steps for Suppliers

To adapt to this new obligation, providers should:

  1. Review internal data collection and management processes to ensure that necessary information is available and verifiable.

  2. Establish clear transparency and copyright policies, including protocols to respect the right of exclusion in text and data mining (TDM).

  3. Publish the summary on official channels before the corresponding deadline.

  4. Update the summary periodically, at least every six months or when there are material changes in training.

The European Commission, through the European AI Office, will monitor compliance and may request corrections or impose sanctions.

A key tool for governing data

In our previous article, "Governing Data to Govern Artificial Intelligence", we highlighted that reliable AI is only possible if there is a solid governance of data.

This new template reinforces that principle, offering a standardized mechanism for describing the lifecycle of data, from source to processing, and encouraging interoperability and responsible reuse.

This is a decisive step towards a more transparent and fair AI, aligned with European values, where the protection of rights and technological innovation can advance together.

Conclusions

The publication of the Public Summary Template marks a historic milestone in the regulation of AI in Europe. By requiring providers to document and make public the data used in training, the European Union is taking a decisive step towards a more transparent and trustworthy artificial intelligence, based on responsibility and respect for fundamental rights. In a world where data is the engine of innovation, this tool becomes the key to governing data before governing AI, ensuring that technological development is built on trust and ethics.

Content created by Dr. Fernando Gualo, Professor at UCLM and Data Governance and Quality Consultant. The content and views expressed in this publication are the sole responsibility of the author.

Blog

When dealing with the liability arising from the use of autonomous systems based on artificial intelligence, it is common to refer to the ethical dilemmas that a traffic accident can pose. This example is useful to illustrate the problem of liability for damages caused by an accident, or even to determine other types of liability in the field of road safety (for example, fines for violations of traffic rules).

Let's imagine that the autonomous vehicle has been driving above the permitted speed, or that it has simply run a signal and caused an accident involving other vehicles. Considering the legal risks, the liability that would arise and, specifically, the role of data in this scenario, we can ask some questions that help us understand the practical scope of the problem:

  • Have all the necessary datasets of sufficient quality to deal with traffic risks in different environments (rural, urban, dense cities, etc.) been considered in the design and training?

  • What is the responsibility if the accident is due to poor integration of the artificial intelligence tool with the vehicle or a failure of the manufacturer that prevents the correct reading of the signs?

  • Who is responsible if the problem stems from incorrect or outdated information on traffic signs?

In this post we are going to explain what aspects must be considered when assessing the liability that can be generated in this type of case.

The impact of data from the perspective of the subjects involved

In the design, training, deployment and use of artificial intelligence systems, the effective control of the data used plays an essential role in the management of legal risks. The conditions of its processing can have important consequences from the perspective of liability in the event of damage or non-compliance with the applicable regulations.

A rigorous approach to this problem requires distinguishing according to each of the subjects involved in the process, from its initial development to its effective use in specific circumstances, since the conditions and consequences can be very different. In this sense, it is necessary to identify the origin of the damage or non-compliance in order to impute the legal consequences to the person who should effectively be considered responsible:

  • Thus, damage or non-compliance may be determined by a design problem in the application used or in its training, so that certain data is misused for this purpose. Continuing with the example of an autonomous vehicle, this would be the case of accessing the data of the people traveling in it without consent.

  • However, it is also possible that the problem originates from the person who deploys the tool in each environment for real use, a position that would be occupied by the vehicle manufacturer. This could happen if, for its operation, data is accessed without the appropriate permissions or if there are restrictions that prevent access to the information necessary to guarantee its proper functioning.

  • The problem could also be generated by the person or entity using the tool itself. Returning to the example of the vehicle, this would be the case if the vehicle belongs to a company or an individual that has not carried out the necessary periodic inspections or updated the system when necessary.

  • Finally, there is the possibility that the legal problem of liability is determined by the conditions under which the data are provided at their original source. For example, if the data is inaccurate: the information about the road on which the vehicle is traveling is not up to date or the data emitted by traffic signs is not sufficiently accurate.

Challenges related to the technological environment: complexity and opacity

In addition, the very uniqueness of the technology used may significantly condition the attribution of liability. Specifically, technological opacity – that is, the difficulty in understanding why a system makes a specific decision – is one of the main challenges when it comes to addressing the legal challenges posed by artificial intelligence, as it makes it difficult to determine the responsible subject. This is a problem that acquires special importance with regard to the lawful origin of the data and, likewise, the conditions under which its processing takes place. In fact, this was precisely the main stumbling block that generative artificial intelligence encountered in the initial moments of its landing in Europe: the lack of adequate conditions of transparency regarding the processing of personal data justified the temporary halt of its commercialization until the necessary adjustments were made.

In this sense, the publication of the data used for the training phase becomes an additional guarantee from the perspective of legal certainty and, specifically, to verify the regulatory compliance conditions of the tool.

On the other hand, the complexity inherent in this technology poses an additional difficulty in terms of the imputation of the damage that may be caused and, consequently, in the determination of who should pay for it. Continuing with the example of the autonomous vehicle, it could be the case that various causes overlap, such as the inaccuracy of the data provided by traffic signs and, at the same time, a malfunction of the computer application by not detecting potential inconsistencies between the data used and its actual needs.

What does the European Regulation on artificial intelligence say about it?

Regulation (EU) 2024/1689 establishes a harmonised regulatory framework across the European Union in relation to artificial intelligence. With regard to data, it includes some specific obligations for systems classified as "high risk", which are those contemplated in Article 6 and in the list in Annex III (biometric identification, education, labour management, access to essential services, etc.). In this sense, it incorporates a strict regime of technical requirements, transparency, supervision and auditing, combined with conformity assessment procedures prior to its commercialization and post-market control mechanisms, also establishing precise responsibilities for suppliers, operators and other actors in the value chain.

As regards data governance, a risk management system should be put in place covering the entire lifecycle of the tool, assessing, mitigating, monitoring and documenting risks to health, safety and fundamental rights. Specifically, training, validation and testing datasets are required to be:

  • Relevant, representative, complete and as error-free as possible for the intended purpose.

  • Managed in accordance with strict governance practices that mitigate bias and discrimination, especially when they may affect the fundamental rights of vulnerable or minority groups.

  • In addition, the Regulation lays down strict conditions for the exceptional use of special categories of personal data for the detection and, where appropriate, correction of bias.

With regard to technical documentation and record keeping, the following are required:

  • The preparation and maintenance of exhaustive technical documentation. In particular, with regard to transparency, complete and clear instructions for use should be provided, including information on data and output results, among other things.

  • Systems should allow for the automatic recording of relevant events (logs) throughout their life cycle to ensure traceability and facilitate post-market surveillance, which can be very useful when checking the incidence of the data used.

As regards liability, that regulation is based on an approach that is admittedly limited from two points of view:

  • Firstly, it merely empowers Member States to establish a sanctioning regime providing for the imposition of fines and other means of enforcement, such as warnings and non-pecuniary measures, which must be effective, proportionate and dissuasive. These are, therefore, instruments of an administrative and punitive nature: punishment for non-compliance with the obligations established in the regulation, among which are those relating to data governance and the documentation and record-keeping referred to above.

  • However, secondly, the European regulator has not considered it appropriate to establish specific provisions regarding civil liability with the aim of compensating for the damage caused. This is an issue of great relevance that even led the European Commission to formulate a proposal for a specific Directive in 2022. Although its processing has not been completed, it has given rise to an interesting debate whose main arguments have been systematised in a comprehensive report by the European Parliament analysing the impact that this regulation could have.

No clear answers: open debate and regulatory developments

Thus, despite the progress made by the approval of the 2024 Regulation, the truth is that the regulation of liability arising from the use of artificial intelligence tools remains an open question on which there is no complete and developed regulatory framework. However, once the approach regarding the legal personification of robots that arose a few years ago has been overcome, it is unquestionable that artificial intelligence in itself cannot be considered a legally responsible subject.

As emphasized above, this is a complex debate in which it is not possible to offer simple and general answers, since it is essential to specify them in each specific case, taking into account the subjects that have intervened in each of the phases of design, implementation and use of the corresponding tool. It will therefore be these subjects who will have to assume the corresponding responsibility, either for the compensation of the damage caused or, where appropriate, to face the sanctions and other administrative measures in the event of non-compliance with the regulation.

In short, although the European regulation on artificial intelligence of 2024 may be useful for establishing standards that help determine when damage caused is contrary to law and must therefore be compensated, the truth is that this is an open debate that will have to be settled by applying the general rules on consumer protection and defective products, taking into account the singularities of this technology. And, as far as administrative liability is concerned, it will be necessary to wait for the initiative announced a few months ago, which is pending formal approval by the Council of Ministers for its subsequent parliamentary processing in the Spanish Parliament.

Content prepared by Julián Valero, Professor at the University of Murcia and Coordinator of the Research Group "Innovation, Law and Technology" (iDerTec). The contents and points of view reflected in this publication are the sole responsibility of its author.

Blog

The idea of conceiving artificial intelligence (AI) as a service for immediate consumption or utility, under the premise that it is enough to "buy an application and start using it", is gaining more and more ground. However, adopting AI isn't like buying conventional software and getting it up and running instantly. Unlike other information technologies, AI can hardly be used with a plug-and-play philosophy. There is a set of essential tasks that users of these systems should undertake, not only for security and legal compliance reasons, but above all to obtain efficient and reliable results.

The Artificial Intelligence Regulation (RIA)[1]

The RIA defines frameworks that should be taken into account by providers[2] and those responsible for deploying[3] AI. This is a very complex rule with a twofold orientation. Firstly, in an approach that we could define as high-level, the regulation establishes a set of red lines that can never be crossed. The European Union approaches AI from a human-centred, human-serving perspective. Therefore, any development must first and foremost ensure that fundamental rights are not violated and that no harm is caused to the safety and integrity of people. In addition, no AI that could generate systemic risks to democracy and the rule of law will be admitted. For these objectives to materialize, the RIA deploys a set of processes through a product-oriented approach. This makes it possible to classify AI systems according to their level of risk (low, medium, high), as well as general-purpose AI models[4], and to establish, based on this categorization, the obligations that each participating subject must comply with to guarantee the objectives of the standard.

Given the extraordinary complexity of the European regulation, we would like to share in this article some common principles that can be deduced from reading it and could inspire good practices on the part of public and private organisations. Our focus is not so much on defining a roadmap for a given information system as on highlighting some elements that we believe can be useful in ensuring that the deployment and use of this technology are safe and efficient, regardless of the risk level of each AI-based information system.

Define a clear purpose

The deployment of an AI system depends heavily on the purpose pursued by the organization. It is not about jumping on a bandwagon. It is true that the available public information seems to show that the integration of this type of technology is an important part of the digital transformation processes of companies and the Administration, providing greater efficiency and capabilities. However, it cannot become a fad to install just any of the Large Language Models (LLMs). Prior reflection is needed, taking into account the organization's needs and defining what type of AI will improve its capabilities. Not adopting this strategy could put our organization at risk, not only from the point of view of its operation and results, but also from a legal perspective. For example, introducing an LLM or chatbot into a high-risk decision-making environment could result in reputational impacts or liability. Inserting this LLM in a medical environment, or using a chatbot in a sensitive context with an unprepared population or in critical care processes, could end up generating risk situations with unforeseeable consequences for people.

Do no evil

The principle of non-maleficence is a key element and should decisively inspire our practice in the world of AI. For this reason, the RIA establishes a series of expressly prohibited practices to protect the fundamental rights and security of people. These prohibitions focus on preventing manipulation, discrimination, and misuse of AI systems that can cause significant harm.

Categories of Prohibited Practices

1. Manipulation and control of behavior. Through the use of subliminal or manipulative techniques that alter the behavior of individuals or groups, preventing informed decision-making and causing considerable damage.

2. Exploiting vulnerabilities. Derived from age, disability or social/economic situation to substantially modify behavior and cause harm.

3. Social Scoring. AI that evaluates people based on their social behavior or personal characteristics, generating ratings with effects for citizens that result in unjustified or disproportionate treatment.

4. Criminal risk assessment based on profiles. AI used to predict the likelihood of committing crimes solely through profiling or personal characteristics. Its use for criminal investigation is permitted when a crime has actually been committed and there are facts to be analyzed.

5. Facial recognition and biometric databases. Systems for the expansion of facial recognition databases through the non-selective extraction of facial images from the Internet or closed circuit television.

6. Inference of emotions in sensitive environments. Designing or using AI to infer emotions at work or in schools, except for medical or safety reasons.

7. Sensitive biometric categorization. Develop or use AI that classifies individuals based on biometric data to infer race, political opinions, religion, sexual orientation, etc.

8. Remote biometric identification in public spaces. Use of "real-time" remote biometric identification systems in public spaces for police purposes, with very limited exceptions (search for victims, prevention of serious threats, location of suspects of serious crimes).

Apart from the expressly prohibited conduct, it is important to bear in mind that the principle of non-maleficence implies that we cannot use an AI system with the clear intention of causing harm, with the awareness that this could happen or, in any case, when the purpose we pursue is contrary to law.

Ensure proper data governance

The concept of data governance is found in Article 10 of the RIA and applies to high-risk systems. However, it contains a set of principles that are well worth applying when deploying a system of any risk level. High-risk AI systems that use data must be developed with training, validation and testing datasets that meet quality criteria (a minimal sketch of such checks follows the list below). To this end, certain governance practices are defined to ensure:

  • Proper design.
  • That the collection and origin of the data, and in the case of personal data the purpose pursued, are adequate and legitimate.
  • That preparation processes such as annotation, labeling, cleaning, updating, enrichment and aggregation are adopted.
  • That the system is designed with use cases whose information is consistent with what the data is supposed to measure and represent.
  • Ensure data quality by ensuring the availability, quantity, and adequacy of the necessary datasets.
  • Detect and review biases that may affect the health and safety of people, rights or generate discrimination, especially when data outputs influence the input information of future operations. Measures should be taken to prevent and correct these biases.
  • Identify and resolve gaps or deficiencies in data that impede compliance with the RIA and, we would add, with legislation more broadly.
  • The datasets used should be relevant, representative, complete and with statistical properties appropriate for their intended use, and should consider the geographical, contextual or functional characteristics necessary for the system, as well as ensure its diversity. In addition, they should be as error-free and complete as possible in view of their intended purpose.
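
As a practical companion to these principles, the sketch below runs a few elementary checks on a training dataset: completeness, duplicates and outcome balance across a sensitive attribute. It is our own illustration, not an RIA-mandated procedure, and the column names and values are invented.

```python
import pandas as pd

# Hypothetical training dataset with an outcome and a sensitive attribute.
df = pd.DataFrame({
    "age": [34, 51, 29, 51, None, 42],
    "gender": ["F", "M", "F", "M", "F", "M"],
    "label": [1, 0, 1, 0, 1, 1],
})

# 1. Completeness: share of missing values per column.
print("Missing values:\n", df.isna().mean())

# 2. Duplicate rows that could distort training.
print("Duplicate rows:", df.duplicated().sum())

# 3. Outcome balance across a sensitive attribute: a first, crude signal
#    of potential bias that would warrant deeper investigation.
print("Positive rate by group:\n", df.groupby("gender")["label"].mean())
```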

AI is a technology that is highly dependent on the data that powers it. From this point of view, not having data governance can not only affect the operation of these tools, but could also generate liability for the user.

In the not too distant future, the obligation for high-risk systems to obtain a CE marking issued by a notified body (i.e., one designated by a Member State of the European Union) will provide conditions of reliability to the market. For the remaining, lower-risk systems, only transparency obligations apply; this by no means implies that their design should not take these principles into account as far as possible. Therefore, before contracting a system, it would be reasonable to verify the available pre-contractual information, both as regards the characteristics and reliability of the system and the conditions and recommendations for its deployment and use.

Another issue concerns our own organization. If we do not have the appropriate regulatory, organizational, technical and quality compliance measures to ensure the reliability of our own data, we will hardly be able to use AI tools that feed on it. In the context of the RIA, the user of a system may also incur liability. It is perfectly possible that a product of this nature has been properly developed by the supplier and that, in terms of reproducibility, the supplier can guarantee that, under the right conditions, the system works properly. What developers and vendors cannot solve are inconsistencies in the datasets that the user-client integrates into the platform. It is not their responsibility if the client failed to properly deploy a General Data Protection Regulation compliance framework or uses the system for an unlawful purpose. Nor will it be their responsibility if the client maintains outdated or unreliable datasets that, when fed into the tool, generate risks or contribute to inappropriate or discriminatory decision-making.

Consequently, the recommendation is clear: before implementing an AI-based system, we must ensure that data governance and compliance with current legislation are adequately guaranteed.

Ensuring Safety

AI is a particularly sensitive technology that presents specific security risks, such as the corruption of datasets. There is no need to look for far-fetched examples. Like any information system, AI requires organizations to deploy and use it securely. Consequently, the deployment of AI in any environment requires a prior risk analysis that identifies the organizational and technical measures that guarantee safe use of the tool.

Train your staff

Unlike the GDPR, in which this issue is implicit, the RIA expressly establishes the duty to train as an obligation. Article 4 of the RIA is so precise that it is worthwhile to reproduce it in its entirety:

Providers and those responsible for deploying AI systems shall take measures to ensure that, to the greatest extent possible, their staff and others responsible on their behalf for the operation and use of AI systems have a sufficient level of AI literacy, taking into account their technical knowledge, experience, education and training, as well as the intended context of use of the AI systems and the individuals or groups of people on whom those systems are to be used.

This is certainly a critical factor. People who use artificial intelligence must have received adequate training that allows them to understand the nature of the system and to make informed decisions. One of the core principles of the European legislation and approach is human supervision. Therefore, regardless of the guarantees offered by a given market product, the organization that uses it will always be responsible for the consequences. This holds both where the final decision is attributed to a person and where, in highly automated processes, those responsible for their management fail to identify an incident and take the appropriate decisions under human supervision.

Culpa in vigilando

The massive introduction of LLMs poses the risk of incurring so-called culpa in vigilando: a legal principle that refers to the responsibility a person assumes for not having exercised due vigilance over another, when that lack of control results in damage or harm. If your organization has introduced any of these marketplace products, which integrate functions such as drafting reports, evaluating alphanumeric information or even assisting with email management, it will be critical to ensure compliance with the recommendations outlined above. It is particularly advisable to define very precisely the purposes for which the tool is implemented and the roles and responsibilities of each user, to document decisions, and to train staff appropriately.

Unfortunately, the model of introduction of LLMs into the market has itself generated a systemic and serious risk for organizations. Most tools have opted for a marketing strategy no different from the one used by social networks in their early days: they allow open and free access to anyone. It is obvious that this achieves two results: reusing the information users provide, thereby monetizing the product, and generating a culture of use that facilitates the adoption and commercialization of the tool.

Let's imagine a scenario that is, of course, far-fetched. A resident intern (MIR) has discovered that several of these tools have been developed and are in fact used in another country for differential diagnosis. Our MIR is very reluctant to wake the hospital's duty head physician every 15 minutes. So, diligently, he contracts a tool that has not been approved for that use in Spain and makes decisions based on the differential diagnosis proposed by an LLM, without yet having the competences that would qualify him to exercise human supervision. Obviously, there is a significant risk of ending up harming a patient.

Situations such as the one described force us to consider how organizations that do not use AI, but are aware of the risk that their employees may use it without their knowledge or consent, should act. In this regard, a preventive strategy should be adopted, based on issuing very precise circulars and instructions prohibiting such use. There is also a hybrid risk situation: the LLM has been contracted by the organization but is used by an employee for purposes other than those intended. In this case, the security-training duo acquires strategic value.

Training and the acquisition of a culture of artificial intelligence are probably an essential requirement for society as a whole. Otherwise, the systemic problems and risks that affected the deployment of the Internet in the past will recur, perhaps with an intensity that is difficult to govern.

Content prepared by Ricard Martínez, Director of the Chair of Privacy and Digital Transformation. Professor, Department of Constitutional Law, Universitat de València. The contents and points of view reflected in this publication are the sole responsibility of its author.

NOTES:

[1] Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 laying down harmonised rules on artificial intelligence and amending Regulations (EC) No 300/2008, (EU) No 167/2013, (EU) No 168/2013, (EU) 2018/858, (EU) 2018/1139 and (EU) 2019/2144 and Directives 2014/90/EU, (EU) 2016/797 and (EU) 2020/1828, available at https://eur-lex.europa.eu/legal-content/ES/TXT/?uri=OJ%3AL_202401689

[2] The RIA defines 'provider' as a natural or legal person, public authority, body or agency that develops an AI system or a general-purpose AI model, or has one developed, and places it on the market or puts the AI system into service under its own name or trademark, whether for a fee or free of charge.

[3] The RIA defines 'deployer' as a natural or legal person, or public authority, body, office or agency that uses an AI system under its own authority, except where the use takes place in the course of a personal activity of a non-professional nature.

[4] The RIA defines a 'general-purpose AI model' as an AI model, including one trained on a large volume of data using large-scale self-supervision, which displays a considerable degree of generality and is capable of competently performing a wide variety of distinct tasks, regardless of how the model is placed on the market, and which can be integrated into a variety of downstream systems or applications, except for AI models used for research, development or prototyping activities prior to their placement on the market.

Blog

Just a few days ago, the Directorate General of Traffic published the new Framework Programme for the Testing of Automated Vehicles which, among other measures, contemplates "the mandatory delivery of reports, both periodic and final and in the event of incidents, which will allow the DGT to assess the safety of the tests and publish basic information [...] guaranteeing transparency and public trust."

Advances in digital technology are enabling the transport sector to face an unprecedented revolution in autonomous driving, offering significant improvements in road safety, energy efficiency and the accessibility of mobility.

The final deployment of these vehicles depends to a large extent on the availability, quality and accessibility of large volumes of data, as well as on an appropriate legal framework that ensures the protection of the various legal assets involved (personal data, trade secrets, confidentiality, etc.), traffic security and transparency. In this context, open data and the reuse of public sector information are essential elements for the responsible development of autonomous mobility, in particular when it comes to ensuring adequate levels of traffic safety.

Data Dependency on Autonomous Vehicles

The technology that supports autonomous vehicles is based on the integration of a complex network of advanced sensors, artificial intelligence systems and real-time processing algorithms, which allows them to identify obstacles, interpret traffic signs, predict the behavior of other road users and, in a collaborative way, plan routes completely autonomously.

In the autonomous vehicle ecosystem, the availability of quality open data is strategic for:

  • Improve road safety, since real-time traffic data makes it possible to anticipate dangers, avoid accidents and optimise safe routes based on massive data analysis.
  • Optimise operational efficiency, as access to up-to-date information on the state of roads, works, incidents and traffic conditions allows for more efficient journey planning.
  • Promote sectoral innovation, facilitating the creation of new digital tools that improve mobility.

Specifically, ensuring the safe and efficient operation of this mobility model requires continuous access to two key categories of data (combined in the sketch after this list):

  • Variable or dynamic data, which offers constantly changing information such as the position, speed and behaviour of other vehicles, pedestrians, cyclists or weather conditions in real time.
  • Static data, which includes relatively permanent information such as the exact location of traffic signs, traffic lights, lanes, speed limits or the main characteristics of the road infrastructure.
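As a purely illustrative sketch of how these two refresh cadences differ in practice, the snippet below caches a static dataset once and repeatedly polls a dynamic feed. Both endpoints and all field names are hypothetical assumptions, not real DGT services.

```python
import json
import time
import urllib.request

# Hypothetical endpoints for illustration only; these are not real services.
STATIC_URL = "https://example.org/roads/signs.json"         # changes rarely
DYNAMIC_URL = "https://example.org/traffic/incidents.json"  # changes constantly

def fetch(url: str) -> list[dict]:
    """Download and parse a JSON array of records."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)

signs = fetch(STATIC_URL)  # static data: fetch once and cache

for _ in range(3):  # a real consumer would run continuously or use push feeds
    incidents = fetch(DYNAMIC_URL)  # dynamic data: re-fetch on every cycle
    active = [i for i in incidents if i.get("status") == "active"]
    print(f"{len(active)} active incidents against {len(signs)} mapped signs")
    time.sleep(1)
```

In a production system the static layer would be refreshed on a daily or weekly schedule, while the dynamic layer would arrive through sub-second push feeds rather than polling.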

The prominence of the data provided by public entities

The sources from which such data come are certainly diverse. This is of great relevance as regards the conditions under which such data will be available. Specifically, some of the data are provided by public entities, while in other cases the origin comes from private companies (vehicle manufacturers, telecommunications service providers, developers of digital tools...) with their own interests or even from people who use public spaces, devices and digital applications.

This diversity requires a different approach to facilitating the availability of data under appropriate conditions, in particular because of the difficulties that may arise from a legal point of view. In relation to Public Administrations, Directive (EU) 2019/1024 on open data and the reuse of public sector information establishes clear obligations that would apply, for example, to the Directorate General of Traffic, the Administrations owning public roads or municipalities in the case of urban environments. Likewise, Regulation (EU) 2022/868 on European data governance reinforces this regulatory framework, in particular with regard to the guarantee of the rights of third parties and, in particular, the protection of personal data.

Moreover, some datasets should be provided under the conditions established for dynamic data, i.e. those "subject to frequent or real-time updates, due in particular to their volatility or rapid obsolescence", which should be available "for re-use immediately after collection, through appropriate APIs and, where appropriate, in the form of a bulk download".

One might even think that the high-value dataset category is of particular interest in the context of autonomous vehicles, given its potential to:

  • To promote technological innovation, as they would make it easier for manufacturers, developers and operators to access reliable and up-to-date information, essential for the development, validation and continuous improvement of autonomous driving systems.
  • Facilitate monitoring and evaluation from a safety perspective, as the transparency and accessibility of such data are essential prerequisites for both.
  • To boost the development of advanced services, since data on road infrastructure, signage, traffic and even the results of tests carried out in the context of the aforementioned Framework Programme constitute the basis for new mobility applications and services that benefit society as a whole.

However, this condition is not expressly included for traffic-related data in the definition adopted at European level, so that, at least for the time being, public entities cannot be required to disseminate the data relevant to autonomous vehicles under the specific conditions established for high-value datasets. Nevertheless, at this time of transition in the deployment of autonomous vehicles, it is essential that public administrations publish, and keep updated under conditions appropriate for automated processing, certain datasets, such as those relating to (a sketch of one such record follows the list):

  • Road signs and vertical signage elements.
  • Traffic light states and traffic control systems.
  • Lane configuration and characteristics.
  • Information on works and temporary traffic alterations.
  • Road infrastructure elements critical for autonomous navigation.
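To make "conditions appropriate for automated processing" concrete, a published record for a vertical sign might look like the following GeoJSON-style structure. The field names are illustrative assumptions and do not reflect any official DGT or municipal schema.

```python
# Illustrative GeoJSON-style record for one vertical traffic sign.
# Field names are assumptions, not an official DGT or municipal schema.
sign_record = {
    "type": "Feature",
    "geometry": {
        "type": "Point",
        "coordinates": [-3.7038, 40.4168],  # longitude, latitude
    },
    "properties": {
        "sign_code": "R-301",        # speed-limit sign in the Spanish catalogue
        "value_kmh": 50,
        "heading_deg": 180,          # direction the sign faces
        "valid_from": "2025-07-01",
        "last_verified": "2025-09-15",
    },
}
```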

The recent update of the official catalogue of traffic signs, which comes into force on 1 July 2025, incorporates signs adapted to new realities, such as personal mobility. However, greater specificity is still needed regarding the availability of sign data under these conditions, which will require the intervention of the authorities responsible for road signage.

The availability of data in the context of the European Mobility Area

Based on these conditions and the need to have mobility data generated by private companies and individuals, data spaces appear as the optimal legal and governance environment to facilitate their accessibility under appropriate conditions.

In this regard, the initiatives for the deployment of the European Mobility Data Space, created in 2023, constitute an opportunity to integrate into its design and configuration measures that support the need for access to data required by autonomous vehicles. Thus, within the framework of this initiative, it would be possible to unlock the potential of mobility data, and in particular:

  • Facilitate the availability of data under conditions specific to the needs of autonomous vehicles.
  • Promote the interconnection of various data sources linked to existing means of transport, but also emerging ones.
  • Accelerate the digital transformation of autonomous vehicles.
  • Strengthen the digital sovereignty of the European automotive industry, reducing dependence on large foreign technology corporations.

In short, autonomous vehicles may represent a fundamental transformation of mobility as it has been conceived until now, but their development depends, among other factors, on the availability, quality and accessibility of sufficient and adequate data. The Sustainable Mobility Bill currently before Parliament is a great opportunity to strengthen the role of data in facilitating innovation in this area, which would undoubtedly favour the development of autonomous vehicles. To this end, it will be essential, on the one hand, to have a data-sharing environment that makes access to data compatible with appropriate guarantees for fundamental rights and information security; and, on the other, to design a governance model that, as emphasised in the Programme promoted by the Directorate-General for Traffic, facilitates the collaborative participation of "manufacturers, developers, importers and fleet operators established in Spain or the European Union", which poses significant challenges in terms of data availability.


Content prepared by Julián Valero, Professor at the University of Murcia and Coordinator of the Research Group "Innovation, Law and Technology" (iDerTec). The contents and points of view reflected in this publication are the sole responsibility of its author.

Blog

The 2024 Work Trend Index on the state of artificial intelligence in the workplace, together with reports from T-Systems and InfoJobs, indicates that 78% of workers in Spain use their own AI tools in the workplace. This figure rises to 80% in medium-sized companies. In addition, 1 in 3 workers (32%) use AI tools in their day-to-day work, 75% of knowledge workers use generative AI tools, and almost half have started doing so in the last six months. Interestingly, the generation gap is narrowing in this area: while 85% of Generation Z employees (18-28 years old) use personalised AI, more than 70% of baby boomers (58+) also use these tools. This trend is confirmed by several studies (see Figure 1).

  • 2024 Work Trend Index: AI at work is here. Now comes the hard part (Microsoft, LinkedIn)
  • 2024 AI Adoption and Risk Report (Cyberhaven Labs)
  • Generative AI's fast and furious entry into Switzerland (Deloitte Switzerland)
  • Bring Your Own AI: Balance Rewards and Risks, webinar (MIT Sloan)
  • Lin, L. and Parker, K. (2025) U.S. workers are more worried than hopeful about future AI use in the workplace (Pew Research Center)

Figure 1. References on BYOAI

This phenomenon has been called BYOAI (Bring Your Own AI). It is characterised by the fact that the employee typically uses some kind of freely accessible solution such as ChatGPT: the organisation has not contracted the service, the registration has been made privately by the user, and the provider obviously assumes no legal responsibility. If, for example, the possibilities offered by Notebook, Perplexity or DeepSeek are used, it is perfectly possible to upload confidential or protected documents.
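One pragmatic mitigation, sketched below under purely illustrative assumptions (the confidentiality markers and the send_to_llm function are hypothetical, not a real product's API), is a minimal egress check that blocks obviously sensitive content before it reaches an external tool:

```python
import re

# Illustrative markers only; real data-loss-prevention tools use richer rules.
CONFIDENTIAL_PATTERNS = [
    re.compile(r"\bconfidential\b", re.IGNORECASE),
    re.compile(r"\b\d{8}[A-Z]\b"),            # Spanish DNI-like identifier
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),  # email addresses
]

def safe_to_send(text: str) -> bool:
    """Return False if the text matches any confidentiality marker."""
    return not any(p.search(text) for p in CONFIDENTIAL_PATTERNS)

def send_to_llm(prompt: str) -> str:
    """Hypothetical gateway to an external LLM; blocked content never leaves."""
    if not safe_to_send(prompt):
        raise PermissionError("prompt blocked by egress policy")
    return "...response from the external tool..."
```

A gateway of this kind does not replace contractual and organisational controls, but it makes the unauthorised leakage of documents considerably harder.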

On the other hand, this coincides, according to Eurostat data, with the adoption of AI in the corporate sector. By 2024, 13.5% of European companies (with 10 or more employees) were using some form of AI technology, a figure that rises to 41% among large companies and is particularly high in sectors such as information and communication (48.7%) and professional, scientific and technical services (30.5%). The trend towards AI adoption in the public sector is also growing, due not only to global trends but probably also to the adoption of AI strategies and the positive impact of Next Generation funds.

The legal duty of AI literacy

In this context, questions immediately arise. The first concern the phenomenon of unauthorised use by employees: Has the data protection officer or the security officer issued a report to the management of the organisation? Has this type of use been authorised? Was the matter discussed at a meeting of the Security Committee? Has an information circular been issued defining precisely the applicable rules? Alongside these, others of a more general nature emerge: What level of training do people have? Are they able to issue reports or make decisions using such tools?

The EU Regulation on Artificial Intelligence (RIA) has rightly established a duty of AI literacy imposed on the providers and deployers of such systems. They are responsible for taking measures to ensure that, to the greatest extent possible, their staff and others who are responsible for the operation and use of AI systems on their behalf have a sufficient level of AI literacy. This requires taking into account their expertise, experience, education and training. Training should be integrated into the intended context of use of the AI systems and be tailored to the profile of the individuals or groups in which the systems will be used.

Unlike in the General Data Protection Regulation, here the obligation is formulated in an express and imperative manner. There is no direct reference to this matter in the GDPR, beyond listing, among the functions of the data protection officer, the training of staff involved in processing operations. The need can also be deduced from the processor's obligation to ensure that persons authorised to process personal data are aware of their duty of confidentiality. It is obvious that the duty of proactive accountability, data protection by design and by default, and risk management all lead to training the users of information systems. However, the way in which this training is deployed is not always appropriate: in many organisations it is either non-existent, voluntary, or reduced to signing a set of security obligations when taking up a job.

In the field of artificial intelligence-based information systems, the obligation to train is non-negotiable and imperative. The RIA provides for very high fines, which in Spain are specified in the Bill for the good use and governance of Artificial Intelligence. When that future law is passed, failure to comply with Article 26.2 of the RIA, which requires entrusting human supervision of the system to persons with adequate competence, training and authority, will constitute a serious breach.

Benefits of AI training

Beyond legal coercion, training people is a wise and undoubtedly beneficial decision that should be read positively and conceived as an investment. On the one hand, it helps to adopt measures aimed at managing risk, which in the case of BYOAI includes data leakage, loss of intellectual property, compliance issues and cybersecurity. On the other hand, it is necessary to manage the risks associated with the regular use of AI. In this regard, it is essential that end users understand in detail how the technology works and their human-oversight role in the decision-making process, and that they acquire the ability to identify and report any operational issues.

However, training must pursue high-level objectives. It should be continuous, combining theory, practice and permanent updating, and include technical, ethical, legal and social-impact aspects in order to promote a culture of knowledge and responsible use of AI in the organisation. Its benefits for the dynamics of public or private activity are wide-ranging.

With regard to its benefits, artificial intelligence (AI) literacy has become a strategic factor in transforming decision-making and promoting innovation in organisations:

  • By equipping teams with a solid understanding of how AI works and its applications, it facilitates the interpretation of complex data and the use of advanced tools, enabling the identification of patterns and the anticipation of business-relevant trends.
  • This specialised knowledge contributes to minimising errors and biases, as it promotes decisions based on rigorous analysis rather than intuition, and enables the detection of possible deviations in automated systems. In addition, the automation of routine tasks reduces the likelihood of human failure and frees up resources that can be focused on strategic and creative activities.
  • The integration of AI into the organisational culture drives a mentality oriented towards critical analysis and the questioning of technological recommendations, thus promoting an evidence-based culture. This approach not only strengthens the ability to adapt to technological advances, but also facilitates the detection of opportunities to optimise processes, develop new products and improve operational efficiency.
  • In the legal and ethical sphere, AI literacy helps to manage compliance and reputational risks by fostering transparent and auditable practices that build trust with both society and regulators.
  • Finally, understanding the impact and possibilities of AI diminishes resistance to change and favours the adoption of new technologies, accelerating digital transformation and positioning the organisation as a leader in innovation and adaptation to the challenges of today's environment.

Good practices for successful AI training

Organisations need to reflect on their training strategy in order to achieve these objectives. In this regard, it seems reasonable to share some lessons learned in the field of data protection. Firstly, all training must start by engaging the organisation's management team. There should be no reverential fear of the Governing Board, the Local Corporation or the Government of the day: the political level of any organisation should lead by example if training is really to permeate all human resources. And this training must be very specific, approached not only from a risk-management point of view but also from an opportunity perspective based on a culture of responsible innovation.

Similarly, although it may involve additional costs, it is necessary to consider not only the users of AI-based information systems but all staff. This will not only allow us to avoid the risks associated with BYOAI but also to establish a corporate culture that facilitates AI implementation processes.

Finally, it will be essential to adapt training to specific profiles: users of AI-based systems; technical (IT) staff; ethical and legal mediators and enablers; and compliance officers or those responsible for the procurement or tendering of products and services.

Without prejudice to the contents that this type of training should logically include, there are certain values that should inspire training plans. First of all, it is important to remember that this training is compulsory and must be functionally adapted to the job. Secondly, it must be able to empower people and engage them in the use of AI. The EU's legal approach is based on the principle of human responsibility and oversight: the human always decides. People must therefore be able to take decisions appropriate to the output provided by the AI and to disagree with the machine's judgement, within an ecosystem that protects them and allows them to report incidents and have them reviewed.

Finally, there is one element that cannot be ignored under any circumstances: regardless of whether personal data are processed or not, and regardless of whether AI is intended for humans, its results will always have a direct or indirect impact on individuals or on society. Therefore, the training approach must integrate the ethical, legal and social implications of AI and engage users in guaranteeing fundamental rights and democracy.

Figure 2. Benefits of artificial intelligence literacy: it improves the quality and speed of decisions and overall efficiency, reduces bias and human error, encourages a data-driven culture, mitigates legal and ethical risks, and is a key driver of innovation. Source: own elaboration (datos.gob.es).



Ricard Martínez Martínez, Director of the Microsoft-Universitat de Valencia Chair in Privacy and Digital Transformation

Blog

We live in an increasingly digitalised world where we work, study, inform ourselves and socialise through technologies. In this world, where technology and connectivity have become fundamental pillars of society, digital rights emerge as an essential component to guarantee freedom, privacy and equality in this new online facet of our lives.

Therefore, digital rights are nothing more than the extension of the fundamental rights and freedoms we already benefit from to the virtual environment. In this article we will explore what these rights are, why they are important and what are some of the benchmark initiatives in this area.

What are digital rights and why are they important?

As stated by Antonio Guterres, Secretary-General of the United Nations, during the Internet Governance Forum in 2018:

"Humanity must be at the centre of technological evolution. Technology should not use people; we should use technology for the benefit of all".

Technology should be used to improve our lives, not to dominate them. For this to be possible, as has been the case with other transformative technologies in the past, we need to establish policies that prevent as far as possible the emergence of unintended effects or malicious uses. Therefore, digital rights seek to facilitate a humanist digital transformation, where technological innovation is accompanied by protection for people, through a set of guarantees and freedoms that allow citizens to exercise their fundamental rights also in the digital environment. These include, for example:

  • Freedom of expression: for uncensored communication and exchange of ideas.
  • Right to privacy and data protection: guaranteeing privacy and control over personal information.
  • Access to information and transparency: ensuring that everyone has equal access to digital data and services.
  • Online security: seeks to protect users from fraud, cyber-attacks and other risks in the digital world.

In a digital environment, where information circulates rapidly and technologies are constantly evolving, guaranteeing these rights is crucial to maintaining the integrity of our interactions, the way we access and consume information, and our participation in public life.

An international framework for digital rights

As technology advances, the concept of digital rights has become increasingly important globally in recent decades. While there is no single global charter of digital rights, there are many global and regional initiatives that point in the same direction, all rooted in the United Nations Universal Declaration of Human Rights. Proclaimed in 1948, that declaration made no mention of the Internet, which did not yet exist, but today its principles are considered fully applicable to the digital world. Indeed, the international community agrees that the same rights we proclaim for the offline world must also be respected online: "what is illegal offline must also be illegal online".

Furthermore, the United Nations has stressed that internet access is becoming a basic enabler of other rights, so connectivity should also be considered a new human right of the 21st century.

European and international benchmarking initiatives

In recent years, several initiatives have emerged with the aim of adapting and protecting fundamental rights also in the digital environment. For example, Europe has been a pioneer in establishing an explicit framework of digital principles. In January 2023, the European Union proclaimed the European Declaration on Digital Rights and Principles for the Digital Decade, a document that reflects the European vision of a people-centred technological transformation and sets out a common framework for safeguarding citizens' freedom, security and privacy in the digital age. This declaration, together with other international initiatives, underlines the need to harmonise traditional rights with the challenges and opportunities of the digital environment.

The Declaration, jointly agreed by the European Parliament, the Council and the Commission, defines a set of fundamental principles that should guide Europe's digital age (you can see a summary in this infographic):

  • Focused on people and their rights: Technology must serve people and respect their rights and dignity, not the other way around.
  • Solidarity and inclusion: promoting digital inclusion of all social groups, bridging the digital divide.
  • Freedom of choice: ensure fair and safe online environments, where users have real choice and where net neutrality is respected.
  • Participation in the digital public space: to encourage citizens to participate actively in democratic life at all levels, and to have control over their data.
  • Safety and security: increase trust in digital interactions through greater security, privacy and user control, especially protecting minors.
  • Sustainability: orienting the digital future towards sustainability, considering the environmental impact of technology.

The European Declaration on Digital Rights and Principles therefore sets out a clear roadmap for the European Union's digital laws and policies, guiding its digital transformation process. While this European Declaration does not itself create laws, it does establish a joint political commitment and a roadmap of values. Furthermore, it makes clear that Europe aims to promote these principles as a global standard.

In addition, the European Commission monitors implementation in all Member States and publishes an annual monitoring report, in conjunction with the State of the Digital Decade Report, to assess progress and stay on track. Furthermore, the Declaration serves as a reference in the EU's international relations, promoting a global digital transformation centred on people and human rights.

Outside Europe, several nations have also developed their own digital rights charters, such as the Ibero-American Charter of Principles and Rights in Digital Environments, and there are also international forums such as the Internet Governance Forum which regularly discusses how to protect human rights in cyberspace. The global trend is therefore to recognise that the digital age requires adapting and strengthening existing legal protections, not by creating "new" fundamental rights out of thin air, but by translating existing ones to the new environment.

Spain's Digital Bill of Rights

In line with all these international initiatives, Spain has also taken a decisive step by proposing its own Charter of Digital Rights. This ambitious project aims to define a set of specific principles and guarantees to ensure that all citizens enjoy adequate protection in the digital environment. Its goals include:

  • Define privacy and security standards that respond to the needs of citizens in the digital age.
  • Encourage transparency and accountability in both the public and private sectors.
  • To promote digital inclusion, ensuring equitable access to technologies and information.

In short, this national initiative represents an effort to adapt regulations and public policies to the challenges of the digital world, strengthening citizens' confidence in the use of new technologies. Moreover, since it was published as early as July 2021, it has also contributed to subsequent reflection processes at European level, including the European Declaration mentioned above.

The Spanish Digital Bill of Rights is structured in six broad categories covering the areas of greatest risk and uncertainty in the digital world:

  1. Freedom rights: includes classic freedoms in their digital dimension, such as freedom of expression and information on the Internet, ideological freedom in networks, the right to secrecy of digital communications, as well as the right to pseudonymity.
  2. Equality rights: aimed at avoiding any form of discrimination in the digital environment, including equal access to technology (digital inclusion of the elderly, people with disabilities or in rural areas), and preventing bias or unequal treatment in algorithmic systems.
  3. Participation rights and shaping of public space: this refers to ensuring citizen and democratic participation through digital media. It includes electoral rights in online environments, protection from disinformation and the promotion of diverse and respectful online public debate.
  4. Rights in the work and business environment: encompasses the digital rights of workers and entrepreneurs. A concrete example here is the right to digital disconnection of the worker. It also includes the protection of employee privacy from digital surveillance systems at work and guarantees in teleworking, among others.
  5. Digital rights in specific environments: this addresses particular areas that pose their own challenges, for example the rights of children and adolescents in the digital environment (protection from harmful content, parental control, right to digital education); digital inheritance (what happens to our data and accounts on the Internet after our death); digital identity (being able to manage and protect our online identity); or rights in the emerging world of artificial intelligence, the metaverse and neurotechnologies.
  6. Effectiveness and safeguards: this last category focuses on how to ensure that all these rights are actually fulfilled. The Charter seeks to ensure that people have clear channels to complain in case of violations of their digital rights and that the authorities have the tools to enforce these rights on the internet.

As the government pointed out in its presentation, the aim is to "reinforce and extend citizens' rights, generate certainty in this new digital reality and increase people's confidence in the face of technological disruption". In other words, no new fundamental rights are created, but emerging areas (such as artificial intelligence or digital identity) are recognised where it is necessary to clarify how existing rights are applied and guaranteed.

The Digital Rights Observatory

The creation of a Digital Rights Observatory in Spain has recently been announced, a strategic tool aimed at continuously monitoring, promoting and evaluating the state and evolution of these rights in the country with the objective of contributing to making them effective. The Observatory is conceived as an open, inclusive and participatory space to bring digital rights closer to citizens, and its main functions include:

  • To push for the implementation of the Digital Bill of Rights, so that the ideas initially set out in 2021 do not remain theoretical, but are translated into concrete actions, laws and effective policies.
  • To monitor compliance with the regulations and recommendations set out in the Digital Bill of Rights.
  • Fighting inequality and discrimination online, helping to reduce digital divides so that technological transformation does not leave vulnerable groups behind.
  • Identify areas for improvement and propose measures for the protection of rights in the digital environment.
  • Detect whether the current legal framework is lagging behind in the face of new challenges from disruptive technologies such as advanced artificial intelligence that pose risks not covered by current laws.
  • Encourage transparency and dialogue between government, institutions and civil society to adapt policies to technological change.

Announced in February 2025, the Observatory is part of the Digital Rights Programme, a public-private initiative led by the Government, with the participation of four ministries, and financed by the European NextGenerationEU funds within the Recovery Plan. This programme involves the collaboration of experts in the field, public institutions, technology companies, universities and civil society organisations. In total more than 150 entities and 360 professionals have been involved in its development.

This Observatory is therefore emerging as an essential resource to ensure that the protection of digital rights is kept up to date and responds effectively to the emerging challenges of the digital age.

Conclusion

Digital rights are a fundamental pillar of 21st century society, and their consolidation is a complex task that requires the coordination of initiatives at international, European and national levels. Initiatives such as the European Digital Rights Declaration and other global efforts have laid the groundwork, but it is the implementation of specific measures such as the Spanish Digital Rights Charter and the new Digital Rights Observatory that will make the difference in ensuring a free, safe and equitable digital environment for all.

In short, the protection of digital rights is not only a legislative necessity, but an indispensable condition for the full exercise of citizenship in an increasingly interconnected world. Active participation and engagement of both citizens and institutions will be key to building a fair and sustainable digital future. If we can realise these rights, the Internet and new technologies will continue to be synonymous with opportunity and freedom, not threat. After all, digital rights are simply our old rights adapted to modern times, and protecting them is the same as protecting ourselves in this new digital age.


Content prepared by Carlos Iglesias, Open data Researcher and consultant, World Wide Web Foundation. The contents and views expressed in this publication are the sole responsibility of the author.

Blog

The Data Governance Act (DGA) is part of a complex web of EU public policy and regulation, the ultimate goal of which is to create a dataset ecosystem that feeds the digital transformation of the Member States and the objectives of the European Digital Decade:

  • A digitally empowered population and highly skilled digital professionals.
  • Secure and sustainable digital infrastructures.
  • Digital transformation of companies.
  • Digitisation of public services.

Public opinion is focusing on artificial intelligence from the point of view of both its opportunities and, above all, its risks and uncertainties. However, the challenge is much more profound, as it involves, at each of its different layers, very diverse technologies, products and services whose common element is the need to favour the availability of a high volume of reliable, quality-checked data to support their development.

Promoting the use of data with legislation as leverage

At their inception, Directive 2019/1024 on open data and the re-use of public sector information (Open Data Directive), Directive 95/46/EC on the processing of personal data and on the free movement of such data and, subsequently, Regulation 2016/679, known as the General Data Protection Regulation (GDPR), opted for the re-use of data with full guarantee of rights. However, their interpretation and application generated in practice an effect contrary to their original objectives, swinging clearly towards a restrictive model that may have affected the processes of generating data for exploitation. The large US platforms, through a strategy of free services (search engines, mobile applications and social networks) in exchange for personal data legitimised by mere consent, obtained the largest volume of personal data in human history, including images, voice and personality profiles.

With the GDPR, the EU wanted to eliminate 28 different ways of applying prohibitions and limitations to the use of data. Regulatory quality certainly improved, although perhaps the results achieved have not been as satisfactory as expected, as indicated by documents such as the Digital Economy and Society Index (DESI) 2022 or the Draghi Report (The future of European competitiveness. Part A: A competitiveness strategy for Europe).

This has forced a process of legislative re-engineering that expressly and homogeneously defines the rules that make the objectives possible. The reform of the Open Data Directive, the DGA, the Artificial Intelligence Regulation and the future European Health Data Space (EHDS) should be read from at least two perspectives:

  • The first of these is at a high level and its function is aimed at preserving our constitutional values. The regulation adopts an approach focused on risk and on guaranteeing the dignity and rights of individuals, seeking to avoid systemic risks to democracy and fundamental rights.
  • The second is operational, focusing on safe and responsible product development. This strategy is based on the definition of process engineering rules for the design of products and services that make European products a global benchmark for robustness, safety and reliability.

A Practical Guide to the Data Governance Law

Data protection by design and by default, the analysis of risks to fundamental rights, the development process of high-risk artificial intelligence information systems validated by the corresponding bodies, and the processes for accessing and re-using health data are examples of the legal and technological engineering processes that will govern our digital development. These are not easy procedures to implement. The European Union is therefore making a significant effort to fund projects such as TEHDAS, EUHubs4Data or Quantum, which operate as testing grounds. In parallel, studies are carried out and guides are published, such as the Practical Guide to the Data Governance Law.

This Guide recalls the essential objectives of the DGA:

  • Regulate the re-use of certain publicly owned data subject to the rights of third parties ("protected data", such as personal data or commercially confidential or proprietary data).
  • Boost data sharing by regulating data brokering service providers.
  • Encourage the exchange of data for altruistic purposes.
  • Establish the European Data Innovation Board to facilitate the exchange of best practices.

The DGA promotes the secure re-use of data through various measures and safeguards. These focus on the re-use of data from public sector bodies, data brokering services and data sharing for altruistic purposes.

To which data does it apply? Legitimation for the processing of protected data held by public sector bodies

In the public sector, the following categories of data are protected:

  • Confidential business data, such as trade secrets or know-how.
  • Statistically confidential data.
  • Data protected by the intellectual property rights of third parties.
  • Personal data, insofar as such data do not fall within the scope of the Open Data Directive when irreversible anonymisation is ensured and no special categories of data are concerned.

An essential starting point should be underlined: as far as personal data are concerned, the General Data Protection Regulation (GDPR) and the rules on privacy and electronic communications (Directive 2002/58/EC) also apply. This implies that, in the event of a collision between them and the DGA, the former will prevail.

Moreover, the DGA does not create a right of re-use or a new legal basis within the meaning of the GDPR for the re-use of personal data. This means that Member State or Union law determines whether a specific database or register containing protected data is open for re-use in general. Where such re-use is permitted, it must be carried out in accordance with the conditions laid down in Chapter I of the DGA.

Finally, they are excluded from the scope of the DGA:

  • Data held by public companies, museums, schools and universities.
  • Data protected for reasons of public security, defence or national security.
  • Data held by public sector bodies for purposes other than the performance of their defined public functions.
  • Exchange of data between researchers for non-commercial scientific research purposes.

Conditions for re-use of data

It can be noted that in the area of re-use of public sector data:

▪ The DGA establishes rules for the re-use of protected data, such as personal data, confidential commercial data or statistically sensitive data.

▪ It does not create a general right of re-use, but establishes conditions where national or EU law allows such re-use.

▪ The conditions for access must be transparent, proportionate and objective, and must not be used to restrict competition. The rule mandates the promotion of data access for SMEs and start-ups, and scientific research. Exclusivity agreements for re-use are prohibited, except in specific cases of public interest and for a limited period of time.

▪ It assigns public sector bodies the duty to ensure that the protected nature of the data is preserved, which will require the deployment of intermediation methodologies and technologies. Anonymisation and access through secure processing environments (SPEs) can play a key role here: anonymisation eliminates risk, while SPEs can define a processing ecosystem that offers re-users a comprehensive service, from the cataloguing and preparation of datasets to their analysis (a minimal anonymisation sketch follows this list of conditions). The Spanish Data Protection Agency has published an Approach to data spaces from a GDPR perspective that includes recommendations and methodologies in this area.

▪ Re-users are subject to obligations of confidentiality and non-identification of data subjects. In case of re-identification of personal data, the re-user must inform the public sector body and there may be security breach notification obligations.

▪ Insofar as the relationship is established directly between the re-user and the public sector body, there may be cases in which the latter must provide support to the former for the fulfilment of certain duties:

  • To obtain, if necessary, the consent of the persons concerned for the processing of personal data.
  • In case of unauthorised use of non-personal data, the re-user shall inform the legal entities concerned. The public sector body that initially granted the permission for re-use may provide support if necessary.

▪ International transfers of personal data are governed by the GDPR. For international transfers of non-personal data, the re-user is required to inform the public sector body and to make a contractual commitment to ensure data protection. This remains an open question since, as with the GDPR, the European Commission has the power to:

1. Propose standard contractual clauses that public sector bodies can use in their transfer contracts with re-users.

2. Where a large number of requests for re-use from specific countries justify it, adopt "equivalence decisions" designating these third countries as providing a level of protection for trade secrets or intellectual property that can be considered equivalent to that provided for in the EU.

3. Adopt the conditions to be applied to transfers of highly sensitive non-personal data, such as health data. In cases where the transfer of such data to third countries poses a risk to EU public policy objectives (in this example, public health) and in order to assist public sector bodies granting permissions for re-use, the Commission will set additional conditions to be met before such data can be transferred to a third country.

▪ Public sector bodies may charge fees for allowing re-use. The DGA's strategy aims at sustainability of the system, as fees should only cover the costs of making data available for re-use, such as the costs of anonymisation or providing a secure processing environment. This would include the costs of processing requests for re-use. Member States must publish a description of the main cost categories and the rules used for their allocation.

▪ Natural or legal persons directly affected by a decision on re-use taken by a public sector body shall have the right to lodge a complaint or to seek a judicial remedy in the Member State of that public sector body.
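By way of illustration of the intermediation techniques mentioned above, the following minimal sketch checks a basic k-anonymity property before an extract is released; the column names, the choice of quasi-identifiers and the threshold k are all purely illustrative assumptions, and a real assessment would rest on a documented re-identification risk analysis.

```python
import pandas as pd

# Illustrative choice of quasi-identifiers; a real assessment would be
# driven by a documented re-identification risk analysis.
QUASI_IDENTIFIERS = ["birth_year", "postcode", "sex"]

def is_k_anonymous(df: pd.DataFrame, k: int = 5) -> bool:
    """Every combination of quasi-identifier values must occur at least k times."""
    group_sizes = df.groupby(QUASI_IDENTIFIERS).size()
    return bool((group_sizes >= k).all())

# A body could release an extract only when is_k_anonymous(extract) is True,
# generalise the data further when it is not, and reserve richer data for
# access inside a secure processing environment.
```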

Organisational support

It is entirely possible that public sector bodies offering intermediation services will multiply. This is a complex environment that will require technical and legal support, backstopping and coordination.

To this end, Member States should designate one or more competent bodies whose role is to support public sector bodies granting re-use. The competent bodies shall have adequate legal, financial, technical and human resources to carry out the tasks assigned to them, including the necessary expertise. They are not supervisory bodies, they do not exercise public powers and, as such, the DGA does not set specific requirements as to their status or legal form. In addition, the competent body may be given a mandate to allow re-use itself.

Finally, States must create a Single Point of Information or one-stop shop. This Point will be responsible for transmitting queries and requests to relevant public sector bodies and for maintaining an asset list with an overview of available data resources (metadata). The single information point may be linked to local, regional or sectoral information points where they exist. At EU level, the Commission created the European Register of Protected Data held by the Public Sector (ERPD), a searchable register of information collected by national single points of information to further facilitate the re-use of data in the internal market and beyond.
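For illustration only, an entry in such an asset list might resemble the following DCAT-inspired record; every field name and value below is a hypothetical assumption rather than the official schema of the single information point or the ERPD.

```python
# Illustrative, DCAT-inspired metadata entry such as a national single point
# of information might list; field names are assumptions, not an official schema.
asset_entry = {
    "title": "Hospital discharge records 2015-2020 (protected)",
    "publisher": "Regional Health Authority",        # hypothetical body
    "description": "Pseudonymised discharge records; re-use subject to the DGA",
    "conditions_for_reuse": "secure processing environment only",
    "legal_basis": "Member State law permitting re-use",
    "contact_point": "reuse-requests@example.org",   # hypothetical address
    "keywords": ["health", "protected data", "DGA"],
}
```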

EU regulations are complex to implement, so special proactivity is required to contribute to their correct understanding and application. The EU Guide to the Deployment of the Data Governance Act is a first tool for this purpose and allows a better understanding of the objectives and possibilities offered by the DGA.


Content prepared by Ricard Martínez Martínez, Director of the Chair in Privacy and Digital Transformation, Department of Constitutional Law of the Universitat de València. The contents and points of view reflected in this publication are the sole responsibility of its author.
