Data reuse and data governance in the new AI 2024 strategy
Date of the news: 17-07-2024

The Artificial Intelligence Strategy 2024 is the comprehensive plan that establishes a framework to accelerate the development and expansion of artificial intelligence (AI) in Spain. The strategy was approved by the Council of Ministers on 14 May 2024, at the proposal of the Ministry for Digital Transformation and the Civil Service, and is intended to reinforce and accelerate the National Artificial Intelligence Strategy (ENIA), which began to be deployed in 2020.
The rapid evolution of AI technologies in recent years justifies this reinforcement. For example, according to Stanford University's AI Index Report 2024, AI investment has increased nine-fold since 2022. The cost of training models has risen dramatically, but in return AI is driving progress in science, medicine and overall labour productivity. For reasons such as these, the aim is to maximise the impact of AI on the economy and to build on the positive elements of ongoing work.
The new strategy is built around three main axes, which will be developed through eight lines of action. These axes are:
- Strengthen the key levers for AI development. This axis focuses on boosting investment in supercomputing, building sustainable storage capacity, developing models and data to form a public AI infrastructure, and fostering AI talent.
- Facilitate the expansion of AI in the public and private sector, fostering innovation and cybersecurity. This axis aims to incorporate AI into government and business processes, with a special emphasis on SMEs, and to develop a robust cybersecurity framework.
- Promote transparent, ethical and humanistic AI. This axis focuses on ensuring that the development and use of AI in Spain is responsible and respectful of human rights, equality, privacy and non-discrimination.
[Infographic: summary of the main points of the strategy]
Spain's Artificial Intelligence Strategy 2024 is a very ambitious document that seeks to position our country as a leader in Artificial Intelligence, expanding the use of robust and responsible AI throughout the economy and in public administration. This will help to ensure that areas as diverse as culture and urban design can benefit from these developments.
Openness and access to quality data are also critical to the success of this strategy, as they are part of the raw material needed to train and evaluate AI models that are inclusive and socially just, so that they benefit society as a whole. Closely related to open data, the strategy dedicates specific levers to the promotion of AI in the public sector and to the development of foundational and specialised corpora and language models. This also includes the development of common services based on AI models and the implementation of a data governance model to ensure the security, quality, interoperability and reuse of the data managed by the General State Administration (AGE, by its Spanish acronym).
The foundational models (Large Language Models or LLMs) are large-scale models that will be trained on large corpora of data in Spanish and co-official languages, thus ensuring their applicability in a wide variety of linguistic and cultural contexts. Smaller, specialised models (Small Language Models or SLMs) will be developed with the aim of addressing specific needs within particular sectors with a lower demand for computational resources.
Common data governance of the AGE
Open data governance will play a crucial role in the realisation of the stated objectives, e.g. to achieve an efficient development of specialised language models. With the aim of encouraging the creation of these models and facilitating the development of applications for the public sphere, the strategy foresees a uniform governance model for data, including the documentary corpus of the General State Administration, ensuring the standards of security, quality, interoperability and reusability of all data.
This initiative includes the creation of a unified data space to exploit sector-specific datasets to solve specific use cases for each agency. Data governance will ensure anonymisation and privacy of information and compliance with applicable regulations throughout the data lifecycle.
A data-driven organisational structure will be developed, with the Directorate-General for Data as the backbone. In addition, the strategy will promote the AGE Data Platform, the generation of departmental metadata catalogues, a map of data exchanges and greater interoperability. The aim is to facilitate the deployment of higher quality and more useful AI initiatives.
Developing foundational and specialised corpora and language models
Within lever number three, the document recognises that the fundamental basis for training language models is the quantity and quality of the available data, as well as the licences that permit their use.
The strategy places special emphasis on the creation of representative and diversified language corpora, including Spanish and co-official languages such as Catalan, Basque, Galician and Valencian. These corpora should not only be extensive, but also reflect the variety and cultural richness of the languages, which will allow for the development of more accurate models adapted to local needs.
To achieve this, collaboration with academic and research institutions as well as industry is envisaged to collect, clean and tag large volumes of textual data. In addition, policies will be implemented to facilitate access to this data through open licences that promote re-use and sharing.
The creation of foundational models focuses on developing artificial intelligence algorithms trained on these linguistic corpora, which reflect the culture and traditions of our languages. These models will be created in the framework of the ALIA project, extending the work started with the pioneering MarIA, and will be designed to be adaptable to a variety of natural language processing tasks. Priority will also be given, wherever possible, to making these models publicly accessible, allowing their use in both the public and private sectors to generate the maximum possible economic value.
In short, Spain's National Artificial Intelligence Strategy 2024 is an ambitious plan that seeks to position the country as a European leader in the development and use of responsible AI technologies, as well as to ensure that these technological advances are made in a sustainable manner, benefiting society as a whole. The use of open data and public sector data governance also contributes to this strategy, providing fundamental foundations for the development of advanced, ethical and efficient AI models that will improve public services and drive economic growth. Ultimately, they will also strengthen Spain's competitiveness in a global scenario in which all countries are making a major effort to boost AI and reap its benefits.
Content prepared by Jose Luis Marín, Senior Consultant in Data, Strategy, Innovation & Digitalization. The contents and points of view reflected in this publication are the sole responsibility of its author.