Blog

With 24 official languages and more than 60 regional and minority languages, the European Union is proud of its cultural and linguistic diversity. However, this richness also represents a significant challenge in the digital and technological sphere. Advances in artificial intelligence (AI) and natural language processing have been dominated by English, creating a noticeable imbalance in the availability of language resources for most European languages.

This imbalance has direct consequences, for example:

  • Asymmetric technology development: Companies and researchers have difficulty creating AI solutions adapted to specific languages because resources are limited.
  • Technological dependence: Europe risks becoming dependent on language solutions developed outside its cultural and normative context.

Addressing this gap is not only a matter of inclusion, but also represents a large-scale economic opportunity, capable of generating huge gains in both trade and technological innovation. To address these challenges, the European Commission has launched the European Language Data Space (LDS), a decentralised infrastructure that promotes the secure and controlled exchange of language data among multiple actors in the European ecosystem.

Unlike a simple centralised repository, the LDS functions as a language data marketplace that allows participants to share, sell or license their data under clearly defined conditions and with full control over the use of the data.

The European Language Data Space (LDS), with a beta version operational, represents a decisive step towards democratising language technologies across all languages of the European Union. Below we outline the keys to this project and its next steps.

How does this platform work?

LDS is based on a decentralised peer-to-peer (P2P) architecture that allows users to interact directly with each other, without the need for a central server or single authority, where each participant maintains control of its own data. The key elements of LDS operation are:

1. Decentralised and sovereign architecture

Each participant (whether data provider or data consumer) can locally install the LDS Connector, software that allows direct interaction with other participants without the need for a central server. This approach ensures:

  •  Data sovereignty: owners retain full control over who can access their data and under what conditions of use.

  • Trust and security: Only eligible and authorised participants, legal entities registered in the EU, can be part of the ecosystem.

  • Interoperability: the LDS is compatible with other European data spaces, following common standards.

2. Data exchange flow

The exchange process follows a structured flow between two main actors:

  • The providers describe their linguistic datasets, establish access policies (licences, prices) and publish these offers in the catalogue.
  • The consumers explore the catalogue, identify resources of interest and, through their connectors, initiate negotiations on the terms of use.

If both parties reach an agreement, a contract is established and the data transfer takes place securely between the connectors.
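The flow described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the LDS Connector's actual API: the class names (`Offer`, `Contract`, `Catalogue`, `Connector`), their fields and the price-based negotiation rule are all hypothetical, chosen to show that the catalogue holds only offers while the data itself moves directly between connectors once a contract exists.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Offer:
    """A dataset offer as a provider might publish it (hypothetical fields)."""
    dataset_id: str
    description: str
    licence: str
    price_eur: float


@dataclass(frozen=True)
class Contract:
    """Agreement reached between a provider and a consumer for one offer."""
    dataset_id: str
    provider: str
    consumer: str
    licence: str
    price_eur: float


@dataclass
class Catalogue:
    """Discovery service: stores offers, never the data itself."""
    offers: dict = field(default_factory=dict)

    def publish(self, provider: str, offer: Offer) -> None:
        self.offers[offer.dataset_id] = (provider, offer)

    def search(self, keyword: str) -> list:
        return [o for _, o in self.offers.values()
                if keyword.lower() in o.description.lower()]


@dataclass
class Connector:
    """Stand-in for a participant's locally installed connector (hypothetical)."""
    owner: str
    datasets: dict = field(default_factory=dict)   # data stays with its owner
    contracts: list = field(default_factory=list)

    def negotiate(self, catalogue: Catalogue, dataset_id: str,
                  max_price_eur: float) -> Optional[Contract]:
        provider, offer = catalogue.offers[dataset_id]
        if offer.price_eur > max_price_eur:
            return None  # no agreement, so no contract and no transfer
        return Contract(dataset_id, provider, self.owner,
                        offer.licence, offer.price_eur)

    def transfer(self, contract: Contract, consumer: "Connector") -> None:
        # The provider's connector releases data only under a matching contract.
        if contract.provider != self.owner or contract.consumer != consumer.owner:
            raise PermissionError("contract does not cover this transfer")
        consumer.datasets[contract.dataset_id] = self.datasets[contract.dataset_id]
        consumer.contracts.append(contract)


# Usage: publish an offer, discover it, negotiate, then transfer peer to peer.
catalogue = Catalogue()
provider = Connector("publisher-ES", datasets={"news-es-2024": ["sample sentence"]})
consumer = Connector("ai-lab-FI")

catalogue.publish(provider.owner,
                  Offer("news-es-2024", "Spanish news corpus", "CC-BY-4.0", 500.0))
found = catalogue.search("spanish")
contract = consumer.negotiate(catalogue, found[0].dataset_id, max_price_eur=1000.0)
if contract is not None:
    provider.transfer(contract, consumer)
```

In a real deployment the negotiation would cover licence terms and usage policies rather than a single price threshold; the sketch only preserves the order of steps: publication, discovery, agreement, transfer.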

3. Supporting infrastructure

Although the exchange is decentralised, the LDS includes supporting elements such as:

  • Participant registration: ensures that only verified entities participate in the ecosystem.

  • Optional catalogue: facilitates the publication and discovery of available resources.

  • Hub of vocabularies: a service that centralises controlled vocabularies and makes it possible to maintain lists of values, definitions, relationships between terms, mappings between lists, etc.

  • Monitoring service: allows you to monitor the overall operation of the system.

Added value for the European data ecosystem

The LDS brings significant benefits to the European digital landscape:

  • Boosting multilingual AI

By facilitating access to quality linguistic data in all European languages, the LDS contributes directly to the development of more inclusive AI models adapted to Europe's multilingual reality.  This is especially relevant at a time when large language models (LLMs) are transforming human-machine interaction.

  • Strengthening the data economy

It is estimated that true digital language integration could generate enormous economic benefits in both trade and technological innovation. The LDS creates a marketplace where language data becomes valuable by incentivising its collection, processing and availability under fair and transparent conditions.

  • Preservation of linguistic diversity

By promoting technological development in all European languages, the LDS contributes to preserving and revitalising the continent's linguistic heritage, ensuring that no language is left behind in the digital revolution.

The crucial role of industry and public administrations

The success of the LDS depends crucially on the active participation of various actors:

  • Fresh, quality data

The platform especially seeks to attract "fresh" data from industry (media, publishing, customer services) and the public sector, which is needed to train and improve current language models. The following are particularly valued:

  • Multimodal data (text, audio, video).

  • Specific content from various professional domains.

  • Up-to-date and relevant language resources.

  • Participation open to all ecosystem actors

The LDS is designed to be inclusive, allowing both private organisations and public entities to participate, as long as they are legal entities registered in the EU. Both types of organisations can act as providers and/or consumers of data.

Participation is formalised through a validation process by the governance board, ensuring that all eligible organisations can benefit from this common language data marketplace.

How can you take part?

The beta version of the LDS is now operational and open to new participants. Organisations interested in participating in this initiative can:

  1. Join the test and focus groups: contribute to the development and improvement of the platform.
  2. Test the LDS connector: experiment with the technology in controlled environments.
  3. Provide technical feedback: help define key aspects such as metadata, licensing or exchange mechanisms.
  4. Identify relevant data: assess which language resources could be shared through the platform.

The future of the LDS

While the LDS currently focuses on data exchange, its medium-term vision envisages the possibility of integrating language services and AI model hosting within the same ecosystem, thus reinforcing Europe's role in the development of language technologies. A pre-final version of the LDS is expected to be available in July 2025 and the finalised version is expected in January 2026.

All these aspects were discussed at a free online seminar held by the European open data portal, "Data spaces: experience from the European Language Data Space". A recording of the webinar is available.

In a global context where technological sovereignty has become a strategic priority, the European Language Data Space represents a decisive step towards ensuring that the AI revolution does not leave Europe's linguistic richness behind.

Blog

The European Union is at the forefront of the development of safe, ethical and people-centred artificial intelligence (AI). Through a robust regulatory framework, based on human rights and fundamental values, the EU is building an AI ecosystem that simultaneously benefits citizens, businesses and public administrations.  As part of its commitment to the proper development of this technology, the European Commission has proposed a set of actions to promote its excellence.

In this regard, a pioneering piece of legislation that establishes a comprehensive legal framework stands out: the AI Act. It classifies artificial intelligence systems according to their level of risk and establishes specific obligations for providers regarding data and data governance. In parallel, the Coordinated Plan on AI, updated in 2021, sets out a roadmap to boost investment, harmonise policies and encourage the uptake of AI across the EU.

Spain is aligned with Europe in this area and therefore has its own strategy to accelerate the development and expansion of AI. In addition, the preliminary draft law for an ethical, inclusive and beneficial use of artificial intelligence, which adapts national legislation to the AI Act, has recently been approved.

European projects transforming key sectors

In this context, the EU is funding numerous projects that use artificial intelligence technologies to solve challenges in various fields. Below, we highlight some of the most innovative ones, some of which have already been completed and some of which are underway:

Agriculture and food sustainability

Projects currently underway:

  • ANTARES: develops smart sensor technologies and big data to help farmers produce more food in a sustainable way, benefiting society, farm incomes and the environment.

Examples of other completed projects:

  • Pantheon: developed a control and data acquisition system, equivalent to industrial SCADA, for precision farming in large hazelnut orchards, increasing production, reducing chemical inputs and simplifying management.

  • Trimbot2020: researched robotics and vision technologies to create the first outdoor gardening robot, capable of navigating varied terrain and trimming rose bushes, hedges and topiary.

Industry and manufacturing

Projects currently underway:

  • SERENA: applies AI techniques to predict maintenance needs of industrial equipment, reducing costs and time, and improving the productivity of production processes.

  • SecondHands: has developed a robot capable of proactively assisting maintenance technicians by recognising human activity and anticipating their needs, increasing efficiency and productivity in industrial environments.

Examples of other completed projects:

  • QU4LITY: combined data and AI to increase manufacturing sustainability, providing a data-shared, SME-friendly, standardised and transformative zero-defect manufacturing model.

  • KYKLOS 4.0: explored how cyber-physical systems, product lifecycle management, augmented reality and AI can transform circular manufacturing through seven large-scale pilot projects.

Transport and mobility

Projects currently underway:

  • VI-DAS: A project by a Spanish company working on advanced driver assistance systems and navigation aids, combining traffic understanding with consideration of the driver's physical, mental and behavioural state to improve road safety.

  • PILOTING: adapts, integrates and demonstrates robotic solutions in an integrated platform for the inspection and maintenance of refineries, bridges and tunnels. One of its focuses is on boosting production and access to inspection data.

Examples of other completed projects:

  • FABULOS: has developed and tested a local public transport system using autonomous minibuses, demonstrating its viability and promoting the introduction of robotic technologies in public infrastructure.

Social impact research

Projects currently underway:

  • HUMAINT: provides a multidisciplinary understanding of the current state and future evolution of machine intelligence and its potential impact on human behaviour, focusing on cognitive and socio-emotional capabilities.

  • AI Watch: monitors industrial, technological and research capacity, policy initiatives in Member States, AI adoption and technical developments, and their impact on the economy, society and public services.

Examples of other completed projects:

  • TECHNEQUALITY: examined the potential social consequences of the digital age, looking at how AI and robots affect work and how automation may impact various social groups differently.

Health and well-being

Projects currently underway:

  • DeepHealth: develops advanced tools for medical image processing and predictive modelling, facilitating the daily work of healthcare personnel without the need to combine multiple tools.

  • BigO: collects and analyses anonymised data on child behaviour patterns and their environment to extract evidence on local factors involved in childhood obesity.

Examples of other completed projects:

  • PRIMAGE: has created a cloud-based platform to support decision making for malignant solid tumours, offering predictive tools for diagnosis, prognosis and monitoring, using imaging biomarkers and simulation of tumour growth.

  • SelfBACK: provided personalised support to patients with low back pain through a mobile app, using sensor-collected data to tailor recommendations to each user.

  • EYE-RISK: developed tools that predict the likelihood of developing age-related eye diseases and measures to reduce this risk, including a diagnostic panel to assess genetic predisposition.

  • Solve-RD: improved diagnosis of rare diseases by pooling patient data and advanced genetic methods.

The future of AI in Europe

These examples, both past and present, are valuable use cases of the development of artificial intelligence in Europe. However, the EU's commitment to AI is also forward-looking, and it is reflected in an ambitious investment plan: the Commission plans to invest EUR 1 billion per year in AI, from the Digital Europe and Horizon Europe programmes, with the aim of attracting more than EUR 20 billion of total AI investment per year during this decade.

The development of an ethical, transparent and people-centred AI is already an EU objective that goes beyond the legal framework. With a hands-on approach, the European Union funds projects that not only drive technological innovation, but also address key societal challenges, from health to climate change, building a more sustainable, inclusive and prosperous future for all European citizens.

Blog

The European Green Deal is the European Union's (EU) sustainable growth strategy, designed to drive a green transition that transforms Europe into a just and prosperous society with a modern and competitive economy. Within this strategy, two initiatives stand out: Fit for 55, which aims to reduce EU emissions by at least 55% by 2030, and the Nature Restoration Regulation, which sets binding targets to restore ecosystems, habitats and species.

The European Data Strategy positions the EU as a leader in data-driven economies, promoting fundamental values such as privacy and sustainability. This strategy envisages the creation of sectoral data spaces to encourage the availability and sharing of data, promoting its re-use for the benefit of society and various sectors, including the environment.

This article looks at how environmental data spaces, driven by the European Data Strategy, play a key role in achieving the goals of the European Green Deal by fostering the innovative and collaborative use of data.

The Green Deal data space in the European Data Strategy

In this context, the EU is promoting the Green Deal Data Space, designed to support the objectives of the Green Deal through the use of data. This data space will allow sharing data and using its full potential to address key environmental challenges in several areas: preservation of biodiversity, sustainable water management, the fight against climate change and the efficient use of natural resources, among others.

In this regard, the European Data Strategy highlights two initiatives:

  • On the one hand, the GreenData4All initiative, which updates the INSPIRE Directive to enable greater exchange of environmental geospatial data between the public and private sectors, and its effective re-use, including open access for the general public.
  • On the other hand, the Destination Earth project, which proposes the creation of a digital twin of the Earth using, among other sources, satellite data. This will allow the simulation of scenarios related to climate change, the management of natural resources and the prevention of natural disasters.

Preparatory actions for the development of the Green Deal data space

As part of its strategy for funding preparatory actions for the development of data spaces, the EU is funding the GREAT project (The Green Deal Data Space Foundation and its Community of Practice). This project focuses on laying the foundations for the development of the Green Deal data space through three strategic use cases: climate change mitigation and adaptation, zero pollution and biodiversity. A key aspect of GREAT is the identification and definition of a prioritised set of high-value environmental data (a minimum but scalable set). This approach directly connects the project to the concept of high-value data defined in the European Open Data Directive (i.e. data whose re-use generates not only a positive economic impact, but also social and environmental benefits). The high-value data defined in the Implementing Regulation include data related to Earth observation and the environment, including data obtained from satellites, ground sensors and in situ data. These datasets cover issues such as air quality, climate, emissions, biodiversity, noise, waste and water, all of which are related to the European Green Deal.

Differentiating aspects of the Green Deal data space

At this point, three differentiating aspects of the Green Deal data space can be highlighted.

  • Firstly, its clearly multi-sectoral nature requires consideration of data from a wide variety of domains, each with their own specific regulatory frameworks and models.
  • Secondly, its development is deeply linked to the territory, which implies the need to adopt a bottom-up approach starting from concrete and local scenarios.
  • Finally, it includes high-value data, which highlights the importance of active involvement of public administrations, as well as the collaboration of the private and third sectors to ensure its success and sustainability.

Therefore, the potential of environmental data will be significantly increased through European data spaces that are multi-sectoral, territorialised and with strong public sector involvement.

Development of environmental data spaces in the Horizon Europe programme

To develop environmental data spaces that take into account the above considerations from both the European Data Strategy and the preparatory actions, the EU is funding four projects under the Horizon Europe (HORIZON) programme:

  • Urban Data Spaces for Green dEal (USAGE). This project develops solutions to ensure that environmental data at the local level is useful for mitigating the effects of climate change. This includes the development of mechanisms to enable cities to generate data that meets the FAIR principles (Findable, Accessible, Interoperable, Reusable), enabling its use for environmentally informed decision-making.
  • All Data for Green Deal (AD4GD). This project aims to propose a set of mechanisms to ensure that biodiversity, water quality and air quality data comply with the FAIR principles. It considers data from a variety of sources (satellite remote sensing, in situ observation networks, IoT-connected sensors, citizen science or socio-economic data).
  • F.A.I.R. information cube (FAIRiCUBE). The purpose of this project is to create a platform that enables the reuse of biodiversity and climate data through the use of machine learning techniques. The aim is to enable public institutions that currently do not have easy access to these resources to improve their environmental policies and evidence-based decision-making (e.g. for the adaptation of cities to climate change).
  • Biodiversity Building Blocks for Policy (B-Cubed). This project aims to transform biodiversity monitoring into an agile process that generates more interoperable data. It considers biodiversity data from different sources, such as citizen science, museums, herbaria or research, as well as their consumption through business intelligence models, such as OLAP cubes, to inform the design of public policies that counteract the global biodiversity crisis.
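
To make the idea of a data cube more concrete, the following minimal Python sketch aggregates occurrence records from several sources into a small cube and then "rolls it up" along chosen dimensions, as an OLAP tool would. The records, species, grid-cell codes and the `roll_up` helper are all invented for illustration and do not reflect the actual data model of B-Cubed or any other project mentioned here.

```python
from collections import Counter

# Hypothetical occurrence records: (species, source, grid_cell, year)
records = [
    ("Lynx pardinus", "citizen science", "30SXH", 2022),
    ("Lynx pardinus", "research", "30SXH", 2022),
    ("Lynx pardinus", "museum", "30SXG", 2021),
    ("Aquila adalberti", "citizen science", "30SXH", 2022),
]

# Build the cube: one cell per (species, grid_cell, year) holding a count.
# Records from different sources end up in the same cell.
cube = Counter((species, cell, year) for species, _source, cell, year in records)


def roll_up(cube, keep):
    """Aggregate the cube onto the dimensions whose indices are in `keep`."""
    out = Counter()
    for key, count in cube.items():
        out[tuple(key[i] for i in keep)] += count
    return out


by_species = roll_up(cube, keep=(0,))        # totals per species
by_cell_year = roll_up(cube, keep=(1, 2))    # totals per grid cell and year
```

Once heterogeneous records share the same dimensions, questions such as "how many observations per grid cell and year?" reduce to simple aggregations over the cube.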

Environmental data spaces and research data

Finally, one source of data that can play a crucial role in achieving the objectives of the European Green Deal is scientific data emanating from research results. In this context, the European Union's European Open Science Cloud (EOSC) initiative is an essential tool. EOSC is an open, federated digital infrastructure designed to provide the European scientific community with access to high quality scientific data and services, i.e. a true research data space. This initiative aims to facilitate interoperability and data exchange in all fields of research by promoting the adoption of FAIR principles, and its federation with the Green Deal data space is therefore essential.

Conclusions

Environmental data is key to meeting the objectives of the European Green Deal. To encourage the availability and sharing of this data, promoting its re-use, the EU is developing a series of environmental data space projects. Once in place, these data spaces will facilitate more efficient and sustainable management of natural resources, through active collaboration between all stakeholders (both public and private), driving Europe's ecological transition.


Jose Norberto Mazón, Professor of Computer Languages and Systems at the University of Alicante. The contents and views reflected in this publication are the sole responsibility of the author.

Blog

The adoption of the Regulation (EU) of the European Parliament and of the Council of 13 December 2023 on harmonised rules for fair access to and use of data (Data Act) is an important step forward in European Union regulation to facilitate data accessibility. This is an initiative already included in the European Data Strategy, the main aims of which are to:

  • Regulate the provision of data to public entities in exceptional situations.
  • Promote the development of interoperability criteria for data spaces, data processing services and smart contracts.
  • Promote, and this is the aspect that interests us here, the provision of the data generated by connected products and services, either to those who use them or to the third parties they indicate.

In this respect, in view of users' difficulties in accessing data, the Regulation seeks to facilitate their free choice of providers of repair and other services, as it has been found that in many areas manufacturers try to reserve the use of the data on an exclusive basis. Among other issues, it is intended to promote the user's right to decide for what purposes and by whom the data may be used, subject to a series of limitations and conditions provided for in the Regulation itself.

A major shift in regulatory focus

While the Open Data and Re-use of Public Sector Information Directive and the Data Governance Regulation focus on establishing rules and safeguards to promote access to data held by public bodies, the new regulation pays special attention to relations between private parties. It also allows public bodies to demand data from certain private subjects under exceptional conditions and for reasons of public interest.

One of the main objectives of the Data Regulation is to encourage not only "the development of new and innovative connected products or related services and to stimulate innovation in the aftermarkets, but also to stimulate the development of entirely new services using the data in question, including those based on data from a variety of connected products or related services".

To this end, it has been considered essential to establish clear and precise obligations for manufacturers of connected products, suppliers of connected products and related service providers to share the data generated with users.

What obligations are in place?

Prior to contracting the products and services, the data holder (i.e. the supplier of the product or service, which may also be the manufacturer) shall provide the user with information on:

  • The amount and nature of the data that can be generated
  • How this data can be accessed
  • How it can be deleted

In this respect, products and services must be designed so that, by default, data are accessible to the user directly and free of charge, in a structured, machine-readable format.

However, this right is subject to certain conditions and limitations in order to ensure that other rights and legitimate interests are not affected:

  • The data holder may not make it difficult for users to access their data, but may require them to identify themselves, although the holder is prohibited from keeping the information generated indefinitely.
  • The data holder may establish restrictions in the contract when, as a result of the user's access to the data, there is a risk to the functioning of the product that may affect the health or safety of persons.
  • Under no circumstances may the data holder make the data obtained during the use of the product or the provision of the service available to a third party, unless this is strictly essential for the fulfilment of the contract.
  • The data holder is also expressly forbidden to use the data to find out about the user's circumstances and activity, such as the user's financial situation.

For their part, users are also subject to a number of obligations specifically aimed at ensuring good faith in their legal relationship with the data holder:

  • The user may not use the data to compete with the data holder, either directly or through a third party to whom the data is provided.
  • The user may not use access to the data to find out about the activity of the manufacturer of the product or, where applicable, of the data holder.
  • Alongside these obligations, the user has the right to share the data with a third party, who may only use it for the purposes authorised by the user. In particular, the third party may not create profiles unless this is necessary to provide the service, make the data available to another party or develop a product that competes with the one from which the data originated.

In any case, the regulation establishes an important limitation to be taken into account by users: micro and small enterprises are excluded from this regime, with one exception, namely where they have been commissioned to develop the product or provide the service by an entity that falls within the scope of the Regulation.

What safeguards are in place to ensure the effectiveness of this regulation?

As is generally the case in any area, the user may bring the matter before a judicial body to enforce his or her rights. In addition, the new regulation establishes the possibility of approaching the authority designated at Member State level to ensure the application and enforcement of the provisions of the Regulation. If the problem concerns the processing of personal data, the user may also exercise his or her rights before the competent data protection authority.

In this respect, the European Commission will have to make public a list of the relevant authorities on the basis of the information provided by the Member States. Each Member State may designate more than one authority, indicating which one has the coordinating role. These authorities shall have sufficient means: their members shall have the expertise required for the performance of their duties and their impartiality shall be guaranteed, so that they may not receive instructions from other entities.

Apart from this channel, the data holder and the user (or, where appropriate, the third party to whom the user permits the use of the data) may voluntarily agree to submit to a certified dispute resolution body, whose decision must be taken within a maximum of 90 days. Such a body must be certified by the Member State where it is established. To this end, it must demonstrate its impartiality, capacity and independence. It must also demonstrate that it has adequate procedural rules and that it is easily accessible by electronic means.

In short, the new Data Act has not only established a regulatory framework that reinforces users' access to the data generated by the connected products they acquire and the related services they enjoy, but it has also enshrined a series of guarantees specifically aimed at ensuring effective compliance.

[Infographic: Data Act. Also available as a PDF download and in a two-page version.]


Content prepared by Julián Valero, Professor at the University of Murcia and Coordinator of the Research Group "Innovation, Law and Technology" (iDerTec). The contents and points of view reflected in this publication are the sole responsibility of its author.
