News

The promotion of the Data Economy is one of the priorities of the European Union and of our country. Among the EU's goals is to become a leader in a data-driven society, relying on a digital single market where data is shared freely between member countries. To this end, it launched the European Data Strategy, whose pillars include:

  • The development of a governance framework for multisectoral data access and use.
  • The promotion of public-private collaboration.
  • The empowerment of citizens and companies through training and investment.

As a result of this strategy, work is under way on, among other things, a Data Act, which follows on from the Proposal for a Regulation on data governance.

The EU estimates that, with these measures, the Data Economy will reach a value of 829 billion euros in 2025 across the 27 member countries and will employ almost 11 million workers.

The strategic context of Spain to promote the Data Economy 

Aligned with the European framework, the Data Economy is included as one of the main axes of Digital Spain 2025, the plan designed to promote digital transformation in our country. Among other issues, the document addresses the need to make Spain a benchmark in the transformation towards a Data Economy, also taking advantage of the opportunities offered by new technologies, such as Artificial Intelligence or cloud services.

One of the measures to achieve these objectives is the start-up of a Data Office, responsible for designing and proposing strategies that promote the sharing, management and use of data throughout all productive sectors of the economy and society, guaranteeing good governance and security.

What are the functions of the Data Office?

Reporting to the Secretary of State for Digitization and Artificial Intelligence, the Data Office seeks to address the main challenges that exist today in the Data Economy, defining the legal and political frameworks for data sharing and governance. To do so, it covers aspects of technology, standards, good practices, governance, encryption, security and privacy across several fields of action. These domains, although conceptually separate, are closely interrelated, and they come together in the concept of a data space:

Areas of the Data Office's activity:

  1. G2G: Development and monitoring of public policies based on data
  2. G2C: Open Government and Transparency
  3. G2B: Public Administrations as the first producer of data (datos.gob.es) and business innovation based on it (Data Act)
  4. B2G: Private collaboration in favor of the public interest
  5. C2G: Dynamization of the concept of Data Donors
  6. B2B: Creation of sectoral industrial spaces to promote the Data Economy

Among other things, the Data Office is responsible for designing, coordinating and monitoring the "architectural reference model to promote the collection, management and exchange of data". To do this, it builds on the extensive experience that already exists in the Administration, optimizing the use of existing resources.

At the head of the Data Office is Alberto Palomo, Chief Data Officer of Spain.

Where to follow the news of the Data Office

Since January, you can follow the news related to the Data Office from the Twitter account data.gob.es - Data Office. It offers news and trends related to innovation based on data and open data: news, information about events, use cases, guides and reports, etc.

In addition, from datos.gob.es we will continue to report news and expand content related to the Data Economy.

Blog

Last December, the Congress of Deputies approved Royal Decree-Law 24/2021, which included the transposition of Directive (EU) 2019/1024 on open data and the reuse of public sector information. This Royal Decree-Law amends Law 37/2007 on the reuse of public sector information, introducing new requirements for public bodies, among them facilitating access to high-value data.

High-value data are data whose reuse is associated with considerable benefits to society, the environment and the economy. Initially, the European Commission highlighted as high-value data those belonging to the categories of geospatial, environmental, meteorological, statistical, societal and mobility data, although these classes can be extended both by the Commission and by the Ministry of Economic Affairs and Digital Transformation through the Data Office. According to the Directive, this type of data "shall be made available for reuse in a machine-readable format, through appropriate application programming interfaces and, where appropriate, in the form of bulk download". In other words, among other things, an API is required.

What is an API?

An application programming interface or API is a set of definitions and protocols that enable the exchange of information between systems. It should be noted that there are different types of APIs based on their architecture, communication protocols and operating systems.

APIs offer a number of advantages for developers, since they automate data and metadata consumption, facilitate mass downloading and optimize information retrieval by supporting filtering, sorting and paging functionalities. All of this results in both economic and time savings. 
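
As an illustration of filtering, sorting and paging, here is a minimal Python sketch. It does not call any real portal: the in-memory `DATASETS` list, the `fetch_page` function and its parameters are hypothetical stand-ins for a remote API, so the names and behaviour are assumptions rather than the interface of datos.gob.es or any other portal.

```python
from typing import Iterator

# Hypothetical in-memory catalogue standing in for a remote open data API.
DATASETS = [
    {"id": 1, "title": "Air quality", "theme": "environment"},
    {"id": 2, "title": "Bus stops", "theme": "mobility"},
    {"id": 3, "title": "Rainfall", "theme": "environment"},
    {"id": 4, "title": "Census", "theme": "statistics"},
    {"id": 5, "title": "Noise map", "theme": "environment"},
]


def fetch_page(theme: str, sort_by: str, page: int, page_size: int) -> list[dict]:
    """Simulate one API call: filter by theme, sort, then return one page."""
    matches = sorted(
        (d for d in DATASETS if d["theme"] == theme),
        key=lambda d: d[sort_by],
    )
    start = page * page_size
    return matches[start:start + page_size]


def iter_datasets(theme: str, sort_by: str = "title", page_size: int = 2) -> Iterator[dict]:
    """Request consecutive pages until an empty page signals the end."""
    page = 0
    while True:
        items = fetch_page(theme, sort_by, page, page_size)
        if not items:
            break
        yield from items
        page += 1


# Collect every "environment" dataset, two records per simulated request.
titles = [d["title"] for d in iter_datasets("environment")]
print(titles)
```

With a real API, `fetch_page` would typically be replaced by an HTTP request whose query parameters carry the filter, the sort field and the page number; the loop in `iter_datasets` would stay the same.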

In this sense, many open data portals in our country already have their own APIs to facilitate access to data and metadata. In the following infographic you can see some examples at national, regional and local level, including information about the API of datos.gob.es. The infographic also includes brief information on what an API is and what is needed to use it.

APIs for accessing open data and/or its metadata

These examples show the effort that public agencies in our country are making to facilitate access to the information they keep in a more efficient and automated way, in order to promote the reuse of their open data. 

At datos.gob.es we offer a Practical Guide for the publication of open data using APIs, which details a series of guidelines and good practices for defining and implementing this mechanism in an open data portal.


Content prepared by the datos.gob.es team.

Blog

Data has become the present and future, not only of private organizations, but also of the different public administrations. This is why the benefit derived from data sharing between the different administrations is evident. However, it is not uncommon to find that each of them maintains its own lists and, therefore, inconsistencies inevitably arise. This is where Master Data Management comes in.

What are master data?

Master data are those data that provide context to transactional data, giving them a functional description and thus converting them into knowledge. For example, if we talk about 37 million, we have a piece of transactional data that tells us nothing by itself. However, if we say that it is the number of people vaccinated and, in addition, we place it in the context of Spain, we can know that almost 80% of the Spanish population is vaccinated, and in this way we convert raw data into knowledge.

The main objective of master data management is to support data sharing by reducing the risks associated with data redundancy, thus ensuring data quality.
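
The arithmetic behind this example can be sketched in a few lines of Python. The population figure is not given in the text: the value of about 47.4 million for Spain is an assumption used only for illustration, and the `POPULATION` table and `coverage` function are hypothetical names.

```python
# Reference (master) data: population by country.
# Assumption: roughly 47.4 million inhabitants for Spain, for illustration only.
POPULATION = {"ES": 47_400_000}

# Transactional data: a bare number that means little on its own.
vaccinated = 37_000_000


def coverage(country_code: str, count: int) -> float:
    """Express a raw count as a percentage of the country's population."""
    return 100 * count / POPULATION[country_code]


share = coverage("ES", vaccinated)
print(f"{share:.1f}% of the population vaccinated")
```

Under that assumed population, the result lands just under 80%, which is consistent with the "almost 80%" quoted above.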

How does it affect the public sector?

When we talk about data associated with public bodies, especially those made available to citizens and other administrations through open data portals, the need for standardization, i.e. the creation of a single, reliable view of data on citizens, programs, departments, suppliers, employees... becomes more evident.

In this context, one of the missions of the Recovery, Transformation and Resilience Plan, announced by the Government of Spain for the economic recovery after the pandemic, is to boost data sharing throughout the productive sectors of the economy and society, making data a strategic pillar of the economy. With this objective in mind, the Data Office has been created within the State Secretariat for Digitalization and AI, under the Ministry of Economic Affairs and Digital Transformation.

To achieve this goal, open portals for the reuse of public information, such as datos.gob.es, will play a key role. But first, we must ensure that the master data made available has the necessary quality.

Why is it necessary to manage master data?

With the advance of the digital era, data has come to occupy a fundamental place in society, and its volume grows exponentially over time. Nowadays, there are countless tools for managing it correctly. However, this transformation has been gradual. Years ago, public administrations began to compile the data they used in separate systems, each built around specific needs. Over time, this has meant that the data currently handled comes from different sources that, on occasion, hold apparently similar information. In other words, we can find different sources with the same information on citizens, services, etc., which, when compared with each other, reveal ambiguities and consistency problems. Moreover, this problem does not arise only when comparing different administrations; it also occurs within the same administration for various reasons, such as uncontrolled data migrations between systems or differing data sources (surveys, manual registration, registration in a portal, etc.).

In short, we lose confidence in what the data tell us and the direct consequence of this is the loss of value of our information.

In addition, this type of situation can generate risks of unmanageable magnitude. It creates opportunities for fraud and can cause unnecessary waste and, therefore, higher costs, as well as data leaks, reputational losses, difficulties in complying with regulations such as the General Data Protection Regulation, etc.

How to implement a master data management tool?

To achieve the ultimate goal of data sharing, we must first have a single, reliable view of the data, especially the most critical or priority data, to ensure not only the integrity and consistency of the data, but also the quality and accuracy, by being able to create business rules at a single point of truth.

This process can often be carried out through the following stages:

Stages of master data management: Data model management; Data acquisition; Data Validation, standardization and enrichment; Entity resolution; Data sharing & Stewardship.

  • Data model management, by means of documentation that allows locating the different origins for the same information domain.
  • Data acquisition from different sources for centralization of all possible values.
  • Validation, standardization and enrichment of data for the cleaning of the master based on the defined quality rules.
  • Entity resolution in order to determine whether two object references refer to the same object or to two different objects. This is a decision-making stage, generating the process of matching and merging records that allows the construction of the master.
  • Custody of the master and maintenance, as well as sharing with third parties.

In short, starting from the various sources of information, a single record or Golden Record is established on which to apply the rules that enable secure sharing.
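
The standardization, entity resolution and merging stages described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a real master data management tool: the record fields (`tax_id`, `name`, `city`), the matching rule (same tax ID means same entity) and the merge rule (keep the first non-empty value per field) are all hypothetical choices.

```python
import unicodedata
from collections import defaultdict


def standardize(record: dict) -> dict:
    """Standardization step: trim, upper-case and strip accents from text fields."""
    def clean(value: str) -> str:
        text = unicodedata.normalize("NFKD", value.strip().upper())
        return "".join(ch for ch in text if not unicodedata.combining(ch))
    return {k: clean(v) if isinstance(v, str) else v for k, v in record.items()}


def match_key(record: dict) -> str:
    """Entity resolution rule (assumed): records with the same tax ID are the same entity."""
    return record["tax_id"].replace("-", "").replace(" ", "")


def build_golden_records(sources: list[list[dict]]) -> dict[str, dict]:
    """Group matching records and merge them, keeping the first non-empty value per field."""
    groups = defaultdict(list)
    for source in sources:
        for record in source:
            normalized = standardize(record)
            groups[match_key(normalized)].append(normalized)

    golden = {}
    for key, records in groups.items():
        merged = {}
        for record in records:
            for field, value in record.items():
                if value and not merged.get(field):
                    merged[field] = value
        merged["tax_id"] = key  # store the normalized identifier
        golden[key] = merged
    return golden


# Two hypothetical registries describing the same business in different ways.
registry_a = [{"tax_id": "b-1234567", "name": " Panadería López ", "city": ""}]
registry_b = [{"tax_id": "B1234567", "name": "Panaderia Lopez", "city": "Sevilla"}]
golden = build_golden_records([registry_a, registry_b])
print(golden)
```

In practice the matching rule is where most of the effort goes: exact keys are rarely available across all sources, so real implementations tend to combine several attributes and some form of fuzzy comparison before merging.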

In this way, public administrations can make information available to third parties, guaranteeing its quality, as well as take advantage of that of other public bodies, reducing efforts and having agile access to knowledge.

Example: energy efficiency certification

This management acquires greater relevance in public registries, especially in those linked to the General State Administration managed by various administrations, such as, for example, the energy efficiency certification registries.

This is a document that provides objective information on the energy characteristics of buildings, based on an evaluation of various parameters, such as energy consumed or CO2 emissions generated.

However, although the requirements for processing were issued in 2002 by the European Parliament, those responsible for granting energy efficiency labels are the Autonomous Communities, and for this reason, there are disparities depending on the territory for various reasons.

In this case, generating a master that unifies all these data at national level would be very useful, both for the citizens as a whole, as well as for each autonomous region in particular. For this reason, a procedure for the certification of the energy efficiency of buildings has already been launched for 2021.

Likewise, other public data of society could be standardized, establishing a single point of reference that interconnects the data of the same entity and promotes data sharing through this single point.

Other areas of potential application can be found in the different ministries, such as tourism or health, adopting all the benefits of centralized management and then adapting it to the view that best suits the requirements or needs of each agency.

In fact, the Government of Spain has already promoted the creation of a national GAIA-X hub to deploy the data economy and bet on the leadership of data spaces, especially in strategic sectors such as tourism and health.

Conclusions

Data sharing in the public sector is a growing trend that is expected to play a key role in the coming years. Proof of this is the effort being made by the Administration through, among other things, the creation of the Data Office and the implementation of the Recovery, Transformation and Resilience Plan.

Therefore, each agency must manage its master data in such a way as to eliminate ambiguities in them. In this way, it will be possible to consolidate data sharing between the different public administrations that will allow actions focused on improving services to citizens, better understanding the needs of society.


Content prepared by Juan Mañes, expert in Data Governance.

The contents and views expressed in this publication are the sole responsibility of the author.

Blog

In the current environment, organisations are trying to improve the exploitation of their data through the use of new technologies, providing the business with additional value and turning data into their main strategic asset.

However, we can only extract the real value of data if it is reliable and for this, the function of Data Governance arises, focused on the efficient management of information assets. Open data cannot be alien to these practices due to its characteristics, mainly of availability and access.

To answer the question of how we should govern data, there are several international methodologies, such as DCAM, MAMD, DGPO or DAMA. In this post, we will base ourselves on the guidelines offered by the latter.

What is DAMA?

DAMA, short for Data Management Association, is an international association for data management professionals. It has had a chapter in Spain, DAMA Spain, since March 2019.

Its main mission is to promote and facilitate the development of the data management culture, becoming the reference for organisations and professionals in information management, providing resources, training and knowledge on the subject.

The association is made up of data management professionals from different sectors.

Data governance according to DAMA's reference framework

“A piece of data placed in a context gives rise to information. Add intelligence and you get knowledge that, combined with a good strategy, generates power”

Although it is just a phrase, it perfectly sums up the strategy, the search for power from data. To achieve this, it is necessary to exercise authority, control and shared decision-making (planning, monitoring and implementation) over the management of data assets or, in other words, to apply Data Governance.

DAMA presents what it considers to be the best practices for guaranteeing control over information, regardless of the application business, and to this end, it positions Data Governance as the main activity around which all other activities are managed, such as architecture, interoperability, quality or metadata, as shown in the following figure:

Chart showing Data Governance at the centre and around the other activities: Data Modelling and Design, Data storage and Operations, Data Security, Data Integration and Interoperability, Document and Content Management, Reference & Master Data, Data Warehousing & Business Intelligence, Metadata, Data Quality and Data Architecture.

Implementing Data Governance for open data

Based on the wheel outlined in the previous section, data governance, control, quality, management and knowledge are the key to success and, to this end, the following principles must be complied with:

  • Available: Information is available to users when they need it.
  • Secure: It complies with internal and external security and privacy policies, allowing access only to authorized users.
  • Consistent: The data has integrity, that is, the information is the same for two different users accessing the same data.
  • Auditable: The data lineage is known from its origin, as well as how each piece of data is used.
  • Accurate: Technical (formats, duplicates, integrity…) and functional (business rules) quality rules are complied with under pre-established standards.

To achieve data compliance with these principles, it will be necessary to establish a data governance strategy, through the implementation of a Data Office capable of defining the policies and procedures that dictate the guidelines for data management. These should include the definition of roles and responsibilities, the relationship model for all of them and how they will be enforced, as well as other data-related initiatives.

In addition to data governance, some of the recommended features of open data management include the following:

  • An architecture capable of ensuring the availability of information on the portal. In this sense, CKAN has become one of the reference architectures for open data. CKAN is a free and open source platform, developed by the Open Knowledge Foundation, which serves to publish and catalogue data collections. This link provides a guide to learn more about how to publish open data with CKAN.
  • The interoperability of data catalogues. Any user can make use of the information through direct download of the data they consider. This highlights the need for easy integration of information, regardless of which open data portal it was obtained from.
  • Recognised standards should be used to promote the interoperability of data and metadata catalogues across Europe, such as the Data Catalogue Vocabulary (DCAT) defined by the W3C and its application profile DCAT-AP.  In Spain, we have the Technical Interoperability Standard (NTI), based on this vocabulary. You can read more about it in this report.
  • The metadata, understood as the data of the data, is one of the fundamental pillars when categorising and labelling the information, which will later be reflected in an agile and simple navigation in the portal for any user. Some of the metadata to be included are the title, the format or the frequency of updating, as shown in the aforementioned NTI.
  • As this information is offered by public administrations for reuse, it is not necessary to comply with strict privacy measures for its exploitation, as it has been previously anonymised. However, there must be activities to ensure the security of the data. For example, improper or fraudulent use can be prevented by monitoring access and tracking user activity.
  • Furthermore, the information available on the portal will meet the technical and functional quality criteria required by users, guaranteed by the application of quality indicators.
  • Finally, although it is not one of the characteristics of the reference framework as such, DAMA speaks transversally to all of them about data ethics, understood as social responsibility with respect to data processing. There is certain sensitive information whose improper use could have an impact on individuals.
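
To make the metadata point above more concrete, the following Python sketch builds a small dataset description using terms from the DCAT and Dublin Core vocabularies (`dcat:Dataset`, `dct:title`, `dct:accrualPeriodicity`, `dcat:distribution`, `dct:format`). It is only an illustration: the `dataset_metadata` helper, the example URL and the frequency URI are assumptions, and the result is not a validated DCAT-AP or NTI record.

```python
import json


def dataset_metadata(title, description, frequency_uri, distributions):
    """Build a minimal JSON-LD style description of a dataset (illustrative only)."""
    return {
        "@context": {
            "dcat": "http://www.w3.org/ns/dcat#",
            "dct": "http://purl.org/dc/terms/",
        },
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dct:description": description,
        # Update frequency, expressed as a URI from a frequency vocabulary.
        "dct:accrualPeriodicity": {"@id": frequency_uri},
        # One entry per downloadable form of the dataset.
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                "dcat:accessURL": {"@id": url},
                "dct:format": fmt,
            }
            for url, fmt in distributions
        ],
    }


# Hypothetical example values; the URL and frequency URI are placeholders.
record = dataset_metadata(
    title="Air quality measurements",
    description="Hourly readings from monitoring stations (example data).",
    frequency_uri="http://purl.org/cld/freq/daily",
    distributions=[("https://example.org/air-quality.csv", "CSV")],
)
print(json.dumps(record, indent=2))
```

A portal that follows a specific application profile would need further mandatory properties and controlled vocabularies, so the profile's own documentation should be the reference when building real records.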

The evolution of Data Governance

Due to the financial crisis of 2008, the focus was placed on information management in financial institutions: what information is held, how it is exploited... For this reason, it is currently one of the most regulated sectors, which also makes it one of the most advanced with regard to the applicability of these practices.

However, the rise of new technologies associated with data processing began to change how these management activities were conceived. They were no longer seen as mere control of information; treating data as a strategic asset brought great advances for the business.

Thanks to this new concept, private organisations of all kinds have taken an interest in this area and, even in some public bodies, it is not unusual to see how data governance is beginning to be professionalised through initiatives focused on offering citizens a more personalised and efficient service based on data. For example, the city of Edmonton uses this methodology and has been recognised for it.

In this webinar you can learn more about data management in the DAMA framework. You can also watch the video of their annual event where different use cases are explained or follow their blog.

The road to data culture

We are immersed in a globalised digital world that is constantly evolving and data is no stranger to this. New data initiatives are constantly emerging and an efficient data governance capable of responding to these changes is necessary.

Therefore, the path towards a data culture is a reality that all organisations and public bodies must take in the short term. The use of a data governance methodology, such as DAMA's, will undoubtedly be a great support along the way.


Content prepared by David Puig, Graduate in Information and Documentation and head of the Master and Reference Data working group at DAMA SPAIN, and Juan Mañes, expert in Data Governance.

The contents and points of view reflected in this publication are the sole responsibility of its authors.

News

The following infographic shows the strategic, regulatory and political situation that will affect the world of open data in Spain and Europe. To deepen its content you can read the following articles:

 

News

Spain already has a new National Artificial Intelligence Strategy. The document, which includes 600 million euros for measures related to artificial intelligence (AI), was presented on December 2 at the Palacio de la Moncloa.

The National Strategy for Artificial Intelligence (known as ENIA) is component 16 of the Plan for the Recovery, Transformation and Resilience of the Spanish economy, and one of the fundamental proposals of the Digital Spain Agenda 2025 in its line 9 of action, which highlights AI as a key element for boosting the growth of our economy in the coming years. In addition, the new strategy is aligned with the European action plans developed in this area, and especially with the White Paper on Artificial Intelligence.

Objectives and lines of action

The ENIA is a dynamic and flexible framework, open to the contribution of companies, citizens, social agents and the rest of the administrations, which was created with 7 objectives: scientific excellence and innovation, the projection of the Spanish language, the creation of qualified employment, the transformation of the Spanish productive fabric, the creation of an environment of trust in relation to AI and the promotion of an inclusive and sustainable AI that takes into account humanist values.

To achieve these objectives, 6 lines of action have been created, which bring together a total of 30 measures to be developed in the period 2020-2025:

In short, the aim is to create a national ecosystem of innovative, competitive and ethical artificial intelligence. And to do this, it is essential to have large volumes of quality and interoperable data and metadata, which are accessible, complete, secure and respectful of privacy.

Open data in the National Strategy of Artificial Intelligence

The availability of open data is essential for the proper functioning of artificial intelligence, since the algorithms must be fed and trained with data whose quality and availability allow continuous improvement. In this way we can create value-added services that improve society and the economy.

The National Strategy for Artificial Intelligence highlights how, thanks to the various initiatives undertaken in recent years, Spain has become a European benchmark for open data, highlighting the role of the Aporta Initiative in promoting the openness and reuse of public information.

In strategic axis 3 of the document, several key areas for action are highlighted, linked to data platforms and technological infrastructures for AI:

  • Develop the regulatory framework for open data, to define a strategy for publication and access to public data from administrations in multilingual formats, and to ensure the correct and safe use of the data.
  • Promote actions in the field of data platforms, models, algorithms, inference engines and cyber security, with the focus on boosting research and innovation. Reference is made to the need to promote Digital Enabling Technologies such as connectivity infrastructures, massive data environments (cloud) or process automation and control, paying special attention to Strategic Supercomputing Capabilities (HPC).
  • Promote the specific development of AI technologies in the field of natural language processing, promoting the use of Spanish in the world. In this sense, the National Plan of Language Technologies will be promoted and the LEIA project, developed by the Royal Spanish Academy for the defense, projection and good use of the Spanish language in the digital universe, will be supported.

In the specific case of open data, one of the first measures highlighted is the creation of the Data Office at the state level that will coordinate all public administrations in order to homogenize the storage, access and processing of data. To strengthen this action, a Chief Data Officer will be appointed.  In addition, a multidisciplinary working group on open data in the state public sector will be set up to highlight the efforts that have been made in the field of data in Spain and to continue promoting the openness and reuse of public sector information.

The strategy also considers the private sector, and highlights the need to promote the development of accessible repositories and to guide companies in the definition of open or shared data strategies. In this sense, shared spaces of sectorial and industrial data will be created, which will facilitate the creation of AI applications. Furthermore, mention is made of the need to offer data disaggregated by sex, age, nationality and territory, in such a way as to eliminate biases linked to these aspects.

In order to stimulate the use and governance of public and citizen data, the creation of the Data for Social Welfare Project is established as an objective, where open and citizen-generated data will play a key role in promoting accountability and public participation in government.

                                                                                                             

Other highlights of the ENIA

In addition to actions related to open data, the National Strategy of Artificial Intelligence includes more transversal actions, for example:

  • The incorporation of AI in the public administration will be promoted, improving everything from transparency and effective decision-making to productivity and quality of service (making management and the relationship with citizens more efficient). Here the Aporta Initiative has been playing a key role with its support to public sector bodies in the publication of quality data and the promotion of its reuse. Open data repositories will be created to allow optimal access to the information needed to develop new services and applications for the public and private sectors. In this sense, an innovation laboratory (GobTechLab) will be created and training programs will be carried out.
  • The aim is to promote scientific research through the creation of a Spanish Network of Excellence in AI with research and training programs and the setting up of new technological development centers. Special attention will be given to closing the gender gap.
  • A program of aid to companies for the development of AI and data solutions will be launched, and the network of Digital Innovation Hubs will be reinforced. A NextTech Fund for public-private venture capital will be created.
  • Talent will be promoted through the National Digital Skills Plan. AI-related elements will be introduced in schools and the university and professional AI training offer will be boosted. The SpAIn Talent Hub program will be developed in coordination with ICEX Invest to attract foreign investment and talent.
  • A novelty of the strategy is that it takes into account ethical and social regulation to fight discrimination. Observatories for ethical and legal evaluation of algorithmic systems will be created and the Digital Rights Charter, currently under revision, will be developed.

In short, we are facing a necessary strategy to boost the growth of AI in Spain, promoting our society and economy, and improving our international competitiveness.
