News

As the year comes to an end, it is the perfect time to pause and reflect on all that we have experienced and shared at the Aporta Initiative. This year has been full of challenges, learning and achievements that deserve to be celebrated.

One of the milestones we want to share is that the platform has received almost two million visits, 15% more than in 2023. Interest in data and related technologies has also been evident on social networks: we have exceeded 14,000 followers on LinkedIn (+56%) and 21,000 on X, formerly Twitter (+1.5%). In addition, we have reached out to new audiences with the launch of our Instagram and Threads profiles and the redesign of our YouTube channel.

One of our objectives is to promote the openness of data generated by the public sector so that it can be reused by businesses and citizens. The Aporta Initiative provides qualified technical support to help public bodies overcome their challenges and make quality data available to users, through audits, training sessions and advice. This work has borne fruit, with over 90,000 datasets published in the National Catalogue, 18% more than in 2023. These datasets are federated with the European Open Data Portal, data.europa.eu.

But publishing data is not enough; its use must also be promoted. To spread knowledge about open data and stimulate a market around the reuse of public sector information, the Aporta Initiative has published more than 120 articles, 1,400 tweets and 250 LinkedIn posts with news, events and analysis of the sector. We have tried to capture the latest trends on data-related topics such as artificial intelligence, data spaces and open science. In addition:

  • Spain is among the EU countries setting open data trends in 2024.
  • We have launched a new content format: the datos.gob.es podcasts. The aim is to give you the opportunity to learn more about different topics through audio programmes that you can listen to anytime, anywhere.
  • We have strengthened the infographics section, with new content summarising complex data-related issues, such as legislation or strategic documents. Each infographic presents detailed information in a visually appealing way, making it easy to grasp important concepts and allowing you to quickly access key points.
  • We have created new data science exercises, designed to guide you step-by-step through key concepts and various analysis techniques so you can learn effectively and practically. In addition, each exercise includes the full code available on GitHub, allowing you to replicate and experiment on your own.
  • We have published new guides and reports focusing on how to harness the potential of open data to drive innovation and transparency. Each document includes clear explanations and practical examples to keep you up to date with best practices and tools, ensuring that you are always at the forefront in the use of emerging data-related technologies.
  • We have expanded the list of examples of applications and companies that reuse open data. The catalogue now includes 470 applications (37 more than in 2023) and 96 companies (6 more than in 2023).

Thank you for a good year! In 2025 we will continue to work to drive the data culture in public bodies, businesses and citizens.

You can see more about our activity in the following infographic:

Link to the infographic

News

We are in the last days of the year, those hours that we all use to mentally review what the past 12 months have given us. At the Aporta Initiative we are no exception: with just over 72 hours to go before we eat our New Year's grapes, we want to take stock of what we have done and what is yet to come.

2023 has been a great year for the entire community of data publishers and users. Artificial intelligence has been in the news on multiple occasions, gaining greater prominence not only at the business level: more and more citizens are beginning to understand the challenges and opportunities that lie ahead. In this context, the quantity and quality of available data has become a pressing need, as a driver of increasingly intelligent applications that help us to make progress as a society. 

In this sense, Spain continues to do its homework and achieve good results in international rankings. In the last month we have learned the results of two indexes that place Spain among the leaders in the openness and reuse of public information: the European Data Portal ranks Spain fourth in open data in the European Union, while the OECD ranks it fifth worldwide. To these must be added the Report on the State of the Digital Decade, whose scope is broader as it covers many other factors that influence digital transformation, and which also places Spain above the average in digital infrastructures and capabilities.

datos.gob.es consolidates its position as the meeting point for Spain's open data community   

1,700,000. That is the number of visits that datos.gob.es has received during the last year, 21% more than in the same period of 2022, a figure that highlights the growing interest in open data in our country. This increase has also been reflected on social networks. The Aporta Initiative's Twitter profile has consolidated its position as a channel for keeping up to date with news and trends in data-driven innovation, growing by 6% to close to 21,000 followers. Meanwhile, the community of data professionals around datos.gob.es has grown on LinkedIn, which has attracted 51% more users and reached a total of 9,000.

This growth is marked by the incessant activity in favor of data sharing, openness and reuse carried out by the Aporta Initiative and reflected in the datos.gob.es platform:  

  • The number of datasets in the National Data Catalog, hosted at datos.gob.es, has grown by 19%. As of today, users have at their disposal more than 76,000 datasets published by various organizations at national, regional and autonomous community level. Specifically, 77 new publishing organizations have been added. In addition, the datasets already published have been enriched, with 85,000 more distributions available (i.e., the files in various formats in which the data are presented). To ensure their quality, the datos.gob.es advisory team has handled more than 600 queries from 140 public institutions. Audits have also been carried out, as well as new surveys to promote the opening of new valuable data.

  • The platform has also continued to publish content prepared by various data experts, including aspects related to trends, regulation, success stories, best practices and technical specifications, among others. Specifically, more than 100 articles have been published, 40 examples of solutions and business models based on data (currently the catalog exceeds 500), as well as a multitude of new practical exercises, guides, reports and audiovisual content, such as infographics and videos. 

New data trends  

2024 looks set to be a very promising year in terms of data-related developments. In recent years we have seen great progress at the regulatory level, with various regulations that promote the opening and sharing of data. The most recent of these are the Data Governance Act (DGA), which became fully applicable in September, and the Data Act (DA), which was passed in November. This growing legal landscape means that during 2024 we face the challenge of achieving harmonized implementation to drive a European Digital Single Market.   

This year will also see a major focus on building data spaces and on developments in high-value data. Regarding the latter, June is the deadline for making available to citizens the datasets considered of high value and detailed in the implementing regulation published a year ago, in line with a series of technical requirements that facilitate their reuse. In addition, the European Commission is already exploring categories that could be included as high-value data in the future.

In short, we are facing an exciting year that will bring many new developments in the field of data, promoting not only the data economy but also advances that will have an impact on society as a whole.

Blog

We live in a constantly evolving environment in which data is growing exponentially and is also a fundamental component of the digital economy. In this context, it is necessary to unlock its potential to maximize its value by creating opportunities for its reuse. However, it is important to bear in mind that this increase in speed, scale and variety of data means that ensuring its quality is more complicated.

In this scenario, the need arises to establish common processes applicable to the data assets of all organizations throughout their lifecycle. All types of institutions must have well-governed, well-managed data with adequate levels of quality, and a common evaluation methodology is needed that can help to continuously improve these processes and allow the maturity of an organization to be evaluated in a standardized way.

The Data Office has sponsored, promoted and participated in the generation of the UNE specifications, normative resources that allow the implementation of common processes in data management and that also provide a reference framework to establish an organizational data culture.

On the one hand, we find the specifications UNE 0077:2023 Data Governance, UNE 0078:2023 Data Management and UNE 0079:2023 Data Quality Management, which are designed to be applied jointly, enabling a solid reference framework that encourages the adoption of sustainable and effective practices around data.

In addition, a common assessment methodology is needed to enable continuous improvement of data governance, data management and data quality management processes, as well as the measurement of the maturity of organizations in a standardized way. The UNE 0080 specification has been developed to provide a homogeneous framework for evaluating an organization's treatment of data.

With the aim of offering a process based on international standards that helps organizations to use a quality model and to define appropriate quality characteristics and metrics, the UNE 0081 Data Quality Assessment specification has been generated, which complements the UNE 0079 Data Quality Management.

The following infographic summarizes the key points of the UNE Specifications on data and the main advantages of their application (click on the image to access the infographic).

Accessible version in Word

Blog

Data quality plays a key role in today's world, where information is a valuable asset. Ensuring that data is accurate, complete and reliable has become essential to the success of organisations and underpins informed decision making.

Data quality has a direct impact not only on the exchange and use within each organisation, but also on the sharing of data between different entities, being a key variable in the success of the new paradigm of data spaces. When data is of high quality, it creates an environment conducive to the exchange of accurate and consistent information, enabling organisations to collaborate more effectively, fostering innovation and the joint development of solutions.

Good data quality facilitates the reuse of information in different contexts, generating value beyond the system that creates it. High-quality data are more reliable and accessible, and can be used by multiple systems and applications, which increases their value and usefulness. By significantly reducing the need for constant corrections and adjustments, time and resources are saved, allowing for greater efficiency in the implementation of projects and the creation of new products and services.

Data quality also plays a key role in the advancement of artificial intelligence and machine learning. AI models rely on large volumes of data to produce accurate and reliable results. If the data used is contaminated or of poor quality, the results of AI algorithms will be unreliable or even erroneous. Ensuring data quality is therefore essential to maximise the performance of AI applications, reduce or eliminate biases and realise their full potential.

With the aim of offering a process based on international standards that can help organisations to use a quality model and to define appropriate quality characteristics and metrics, the Data Office has sponsored, promoted and participated in the generation of the specification UNE 0081 Data Quality Assessment that complements the already existing specification UNE 0079 Data Quality Management, focused more on the definition of data quality management processes than on data quality as such.

UNE Specification - Guide to Data Quality Assessment

The UNE 0081 specification, aligned with the ISO/IEC 25000 family of international standards, makes it possible to know and evaluate the quality of the data of any organisation, to establish a plan for its improvement, and even to formally certify its quality. The target audience for this specification, which is applicable to any type of organisation regardless of size or activity, is data quality officers, as well as consultants and auditors who need to assess data sets as part of their functions.

The specification first sets out the data quality model, detailing the quality characteristics that data can have, as well as some applicable metrics, and once this framework is defined, goes on to define the process to be followed to assess the quality of a dataset. Finally, the specification ends by detailing how to interpret the results obtained from the evaluation by showing some concrete examples of application.

Data quality model

The guide proposes a series of quality characteristics following those present in the ISO/IEC 25012 standard, classifying them as inherent to the data, dependent on the system where the data is hosted, or dependent on both. The choice of these characteristics is justified because they encompass those present in other frameworks such as DAMA, FAIR, EHDS, the AI Act and GDPR.

Based on the defined characteristics, the guide uses ISO/IEC 25024 to propose a set of metrics to measure the properties of the characteristics, understanding these properties as "sub-characteristics" of the characteristics.

Thus, as an example, following the dependency scheme, the guide shows the properties and metrics for the specific characteristic of "consistency of data format", describing one of them in detail.
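To make the idea of a property metric more concrete, the following is a minimal sketch of how a format-consistency measurement could be computed as a ratio of conforming values to total values, expressed on a 0 to 100 scale. It is an illustration only and not the metric defined in the guide or in ISO/IEC 25024; the function name `format_consistency`, the expected date format and the sample values are all assumptions made for this example.

```python
from datetime import datetime


def format_consistency(values, date_format="%Y-%m-%d"):
    """Return the percentage (0-100) of values that parse as valid dates.

    Hypothetical helper: the expected format is an assumption chosen for
    illustration, not a requirement taken from the UNE 0081 guide.
    """
    if not values:
        raise ValueError("no values to evaluate")

    def conforms(value):
        # strptime raises ValueError for a wrong layout or an impossible date.
        try:
            datetime.strptime(value, date_format)
            return True
        except ValueError:
            return False

    conforming = sum(conforms(v) for v in values)
    return 100 * conforming / len(values)


# Invented sample: one impossible date and one value in a different layout.
sample = ["2023-01-15", "2023-02-30", "15/03/2023", "2023-04-01"]
score = format_consistency(sample)
print(f"Format consistency = {score:.1f}%")
```

In this sketch two of the four sample values conform, so the property would score 50 on the 0 to 100 scale. A real assessment would take the expected formats from the organisation's own requirements.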

Process for assessing the quality of a data set

For the actual assessment of data quality, the guide proposes to follow the ISO/IEC 25040 standard, which establishes an assessment model that takes into account both the requirements and constraints defined by the organisation, as well as the necessary resources, both material and human. With these requirements, an evaluation plan is established through specific metrics and decision criteria based on business requirements, which allows the correct measurement of properties and characteristics and interpretation of the results.

Below is an outline of the steps in the process and its main activities:

Results of the quality assessment

The outcome of the assessment will depend directly on the requirements set by the organisation and the criteria for compliance. The properties of the characteristics are usually scored from 0 to 100 based on the values obtained in the metrics defined for each of them. The characteristics are in turn scored by aggregating their properties, either from 0 to 100 or as a discrete value from 1 to 5 (1 poor quality, 5 excellent quality), depending on the calculation and weighting rules that have been established. Just as the measurements of the properties are combined to score each characteristic, the characteristics themselves can be combined in a weighted sum, following the defined rules (which may give more weight to some characteristics than to others), to obtain a final result for the quality of the data. For example, if we want to calculate the quality of data based on a weighted sum of their intrinsic characteristics and, because of the type of business, we are interested in giving more weight to accuracy, we could define a formula such as the following:

Data quality = 0.4*Accuracy + 0.15*Completeness + 0.15*Consistency + 0.15*Credibility + 0.15*Currentness

Assume that each of the quality characteristics has been similarly calculated on the basis of the weighted sum of its properties, resulting in the following values: Accuracy=50%, Completeness=45%, Consistency=35%, Credibility=100% and Currentness=50%. This would result in data quality:

Data quality = 0.4*50% + 0.15*45% + 0.15*35% + 0.15*100% + 0.15*50% = 54.5%
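The calculation above can be reproduced with a short script. This is a minimal sketch that uses the weights and characteristic scores from the example; the names `WEIGHTS` and `data_quality` are illustrative choices and do not come from the specification.

```python
# Weights taken from the example formula in the text.
WEIGHTS = {
    "accuracy": 0.40,
    "completeness": 0.15,
    "consistency": 0.15,
    "credibility": 0.15,
    "currentness": 0.15,
}


def data_quality(scores, weights=WEIGHTS):
    """Weighted sum of characteristic scores, each on a 0-100 scale."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same characteristics")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * scores[name] for name in weights)


# Characteristic scores from the worked example.
scores = {
    "accuracy": 50,
    "completeness": 45,
    "consistency": 35,
    "credibility": 100,
    "currentness": 50,
}
overall = data_quality(scores)
print(f"Data quality = {overall:.1f}%")
```

Keeping the weights in a single mapping makes it easy to adapt the formula to a different business priority, as long as the weights still sum to 1.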

Assuming that the organisation has established requirements as shown in the following table:

It could be concluded that the organisation as a whole has a data score of "3= Good Quality".
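The conversion from a 0 to 100 score to a discrete level from 1 to 5 can be sketched as a simple threshold lookup. The organisation's actual requirements table is not reproduced in this text, so the equal 20-point bands below are a hypothetical assumption, chosen only because they are consistent with the example in which a score of 54.5% corresponds to level 3. The names `BAND_LIMITS` and `quality_level` are likewise illustrative.

```python
import bisect

# HYPOTHETICAL band limits on the 0-100 scale (equal 20-point bands).
# The real thresholds are set by each organisation and are not given here.
BAND_LIMITS = [20, 40, 60, 80]


def quality_level(score):
    """Map a 0-100 score to a discrete level from 1 (poor) to 5 (excellent)."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # bisect_right counts how many band limits are <= score.
    return bisect.bisect_right(BAND_LIMITS, score) + 1


level = quality_level(54.5)
print(f"Quality level = {level}")
```

With these assumed bands, a score that falls exactly on a limit is assigned to the higher level; an organisation could equally choose the opposite convention or unequal bands.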

In summary, the assessment and improvement of the quality of a dataset can be as thorough and rigorous as necessary, and should be carried out iteratively and continuously so that the data steadily increases in quality and a minimum data quality is ensured or can even be certified. This minimum data quality can apply to data sets internal to an organisation, i.e. those that the organisation manages and exploits for the operation of its business processes, or it can support the sharing of data sets through the new paradigm of data spaces, generating new market opportunities. In the latter case, when an organisation wants to integrate its data into a data space for future brokering, it is desirable to carry out a quality assessment and label the dataset appropriately with reference to its quality (perhaps through metadata). Data of proven quality has a different utility and value than data that lacks it, which gives the former a preferential position in a competitive market.

The content of this guide, as well as the rest of the UNE specifications mentioned, can be viewed freely and free of charge from the AENOR portal through the link below by accessing the purchase section and marking “read” in the dropdown where “pdf” is pre-selected. Access to this family of UNE data specifications is sponsored by the Secretary of State for Digitalization and Artificial Intelligence, Directorate General for Data. Although viewing requires prior registration, a 100% discount on the total price is applied at the time of finalizing the purchase. After finalizing the purchase, the selected standard or standards can be accessed from the customer area in the my products section.

UNE 0081:2023 | AENOR

https://tienda.aenor.com/norma-une-especificacion-une-0080-2023-n0071383

https://tienda.aenor.com/norma-une-especificacion-une-0079-2023-n0071118

https://tienda.aenor.com/norma-une-especificacion-une-0078-2023-n0071117

https://tienda.aenor.com/norma-une-especificacion-une-0077-2023-n0071116

News

There are only a few hours left until the end of 2021, a year marked once again by the health situation, which has highlighted the importance of updated, accurate and complete information for decision making.

With regard to the Aporta Initiative, this year we have focused not only on the opening of data by public bodies, but also on bringing open data closer to citizens, facilitating its reuse by companies, developers or any user who wishes to do so. To this end, we have launched several actions:

In addition, we have continued to bring users the latest developments in the sector. More than 130 new pieces of content have been published in the news, events, innovation blog and interviews sections, and more than 50 new use cases made possible by open data have been showcased in the sections on applications and reusing companies.

Another focus has been on listening to and supporting the open data community. The support team has dealt with more than 750 queries from 120 public bodies, which have been helped to resolve their doubts related to open data. We have also approached several data communities, such as R-Hispano, R-Ladies or Hackathon Lovers, to learn first-hand about their work with open data, as well as the challenges and needs they face.

We would like to thank the open data community once again for its support. Thanks to this, during 2021, we have reached 160 publishers, whose efforts have made the National Catalog exceed 50,000 published datasets. In addition, datos.gob.es has received more than one million visits, and datos.gob.es profiles on LinkedIn and Twitter have grown by 9% and 12% respectively. Not to mention that Spain is among the leading countries in open data in Europe for another year, according to the European Data Portal.

Datos.gob.es is the meeting point of the open data ecosystem in Spain, a place where the different voices of public administrations and data users converge. In 2022 we will continue working with all publishers and reusers so that open data continues to advance in Spain.

Happy new year and here's to a successful 2022!

 

Infographic with the review of the year

 

(You can download the accessible version in Word here)
