High-value statistical datasets foreseen in the National Statistical Plan 2021-2024

News date: 20-03-2024

The year 2023 was undoubtedly the year of artificial intelligence. This has brought data, and therefore open data, back to the forefront, as it is the raw material that fuels this technology, which is key to value creation in our increasingly digital economy.

Perhaps that is why 2023 also brought a number of new developments in the drive to open data, many of which could lead to the creation of significant economic and social value through re-use. One of these developments is the obligation for public sector bodies to open, in the first half of 2024, a number of high-value datasets already specified in a regulation that was published in the last few days of 2022 in order to implement the provisions of the Open Data Directive (2019). Specifically, there are six high-value thematic categories: geospatial, earth observation and environment, meteorology, statistics, companies and company ownership, and mobility.

In order to comply with this obligation and with the rest of the obligations set out in Directive 2019/1024, Spain amended Law 37/2007 on the re-use of public sector information in 2023. This amendment emphasises the duty to encourage the openness of high-value data published under an open data attribution licence (CC BY 4.0 or equivalent), in machine-readable format and accompanied by metadata describing the characteristics of the datasets.
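As a rough illustration of the three conditions mentioned above (open attribution licence, machine-readable format, descriptive metadata), the following minimal sketch checks a dataset record against them. The `DatasetRecord` class, the `unmet_conditions` function and the lists of accepted licences and formats are hypothetical names and values chosen for the example; they are not taken from the law or from the regulation.

```python
from dataclasses import dataclass, field

# Assumption: illustrative lists only, not the legal definition of
# "equivalent licence" or "machine-readable format".
OPEN_LICENCES = {"CC-BY-4.0", "CC0-1.0"}
MACHINE_READABLE_FORMATS = {"csv", "json", "xml", "rdf"}


@dataclass
class DatasetRecord:
    """Hypothetical description of one published dataset."""
    title: str
    licence: str
    file_format: str
    metadata: dict = field(default_factory=dict)


def unmet_conditions(record: DatasetRecord) -> list[str]:
    """Return the names of the conditions the record does not satisfy."""
    problems = []
    if record.licence not in OPEN_LICENCES:
        problems.append("licence")
    if record.file_format.lower() not in MACHINE_READABLE_FORMATS:
        problems.append("format")
    if not record.metadata:
        problems.append("metadata")
    return problems
```

An empty list means the record passes all three checks of this sketch; a real compliance check would need the precise criteria of the regulation.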

The European Statistical System and the National Statistical Plan 2021-2024

Of the six thematic categories, the fourth, Statistics, is dedicated to statistical datasets, which are characterised by their broad definition and specification. It is based on the European Statistical System, which ensures that European statistics produced in all Member States are reliable, follow common criteria and definitions and treat data in an appropriate way, so that they are always comparable between EU countries. Specifically, the regulation defines 21 statistical datasets as high-value (it actually includes 22, but one of them is redundant as it is broken down into three components: population, fertility and mortality).

The National Statistical Institute is part of the European Statistical System and is in charge of producing the harmonised national statistics that Eurostat then compiles and analyses to provide comparable figures, so that Community policies can be defined, implemented and analysed.

In Spain, the National Statistical Plan is the main instrument that organises the statistical activity of the General State Administration, the backbone of statistics for state purposes. The current plan was published at the end of 2020, covering the 2021-2024 period.

The National Statistical Plan 2021-2024 includes new strategic lines such as the use of new sources of information, including, for example, Big Data and massive databases. It also promotes new production models, such as experimental statistics, and incorporates a special focus on the inclusion of gender, disability, age and nationality perspectives, as well as improvements in real estate market information, especially on rentals.

High-value statistical datasets

These strategic lines do not yet mention high-value datasets. However, the plan is developed and implemented through specific annual programmes detailing the statistical operations to be carried out, their objectives, the bodies involved and the budget appropriations needed to finance them, so it is possible to get an idea of which of these statistical operations are aligned with the 21 high-value statistical datasets of the regulation.

The following table shows the possible equivalences:

| High-value statistical datasets | Equivalence in the Inventory of Statistical Operations (IOE) |
| --- | --- |
| Industrial production | IOE 30050 data sheet, Industrial Production Indices. |
| Industrial producer price index breakdowns by activity | IOE 30051 data sheet, Industrial Price Indices. |
| Volume of sales by activity | Partially covered by IOE 32092 data sheet, Statistics on Sales, Employment and Wages in Large Enterprises and SMEs, and 32096 data sheet, Daily Domestic Sales. |
| EU International trade in goods statistics | There does not seem to be a clear correspondence in the plan, as the planned statistical operations on international trade are focused on services, while trade in goods is worked out in terms of trade between EU Member States. However, part of the specified data could be found in the IOE 30029 data sheet, Annual National Accounts of Spain: Main Aggregates, although perhaps at a higher level of aggregation than required. |
| Tourism flows in Europe | Many similarities with what is defined in the IOE 16028 data sheet, Statistics on Tourist Movements at Borders (FRONTUR), and 16023 data sheet, Residents' Tourism Survey (ETR/FAMILITUR). |
| Harmonised Indices of consumer prices | IOE 30180 data sheet, Harmonised Index of Consumer Prices (HICP). |
| National Accounts - key indicators on GDP | IOE 30029 data sheet, Annual National Accounts of Spain: Main Aggregates. |
| National accounts - key indicators on corporations | |
| National accounts - key indicators on households | |
| Government expenditure and revenue | Reflected in the three IOE data sheets on the settlement of budgets of the different levels of public administration: 31125 data sheet, Budget Settlement Statistics of the State and its Public Bodies, Companies and Foundations; 31030 data sheet, Budgets Settlement of the Autonomous Communities (MHAC); and 31026 data sheet, Budgets Settlement of Local Entities (MHAC). |
| Consolidated government gross debt | |
| Environmental accounts and statistics | Reflected in the eight data sheets (from 30084 to 30095) of the Inventory of Statistical Operations regarding Environmental Accounts. |
| Population | IOE 30264 data sheet, Basic Demographic Indicators. |
| Fertility | |
| Mortality | IOE 30271 data sheet, Mortality Tables. |
| Current healthcare expenditure | IOE 54012 data sheet, Satellite Accounts of Public Health Expenditure. |
| Poverty | IOE 30453 data sheet, Living Conditions Survey (LCS). |
| Inequality | |
| Employment | There are quite a few statistical operations that study the labour market, of which the IOE 30308 data sheet, Labour Force Survey, stands out. |
| Unemployment | |
| Potential labour force | IOE 30308 data sheet, Labour Force Survey, which also contains the 30309 data sheet, Community Labour Force Survey (CLFS). |
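For readers who want to work with these equivalences programmatically, the rows of the table that cite specific IOE codes can be sketched as a simple lookup. This is only an illustrative transcription: the names `IOE_EQUIVALENCES` and `datasets_for` are hypothetical, rows without a listed code or with only a code range are left out, and the entry for trade in goods is the tentative match described in the table rather than a confirmed one.

```python
# IOE codes copied from the table rows that cite them explicitly.
# Rows with no listed code, or with only a code range, are omitted.
IOE_EQUIVALENCES = {
    "Industrial production": ["30050"],
    "Industrial producer price index breakdowns by activity": ["30051"],
    "Volume of sales by activity": ["32092", "32096"],  # partial coverage
    "EU International trade in goods statistics": ["30029"],  # tentative
    "Tourism flows in Europe": ["16028", "16023"],
    "Harmonised Indices of consumer prices": ["30180"],
    "National Accounts - key indicators on GDP": ["30029"],
    "Government expenditure and revenue": ["31125", "31030", "31026"],
    "Population": ["30264"],
    "Mortality": ["30271"],
    "Current healthcare expenditure": ["54012"],
    "Poverty": ["30453"],
    "Potential labour force": ["30308", "30309"],
}


def datasets_for(ioe_code: str) -> list[str]:
    """Return, in alphabetical order, the high-value datasets linked to an IOE code."""
    return sorted(
        name for name, codes in IOE_EQUIVALENCES.items() if ioe_code in codes
    )
```

Looking up a code in this way shows, for instance, that the 30029 data sheet is cited for more than one high-value dataset.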

 

All in all, it seems that most of the key variables that the European regulation has foreseen for high-value statistical datasets are already being produced under the current National Statistical Plan. The plan that will succeed it will start in 2025 and will most likely be published in the course of 2024. This year will see intense work in Europe to comply with the obligations of the regulation, as the European Commission has also recently published the report "Identification of data themes for the extensions of public sector High-Value Datasets", which includes seven new categories that are being considered as high-value data and will foreseeably end up being included in the regulation.


Content prepared by Jose Luis Marín, Senior Consultant in Data, Strategy, Innovation & Digitalization. The contents and views reflected in this publication are the sole responsibility of the author.