Quality Open Data

News date: 09-11-2017

open data, quality

For government data to be easy to use and valuable, it must also be detailed, accurate and of high quality. However, the Web Foundation and Open Knowledge, two organizations that actively promote the adoption of open data internationally, warn that, according to their latest studies, government data is generally incomplete, outdated, fragmented and of low quality.

Everything seems to indicate that governments have indeed been responding to the widespread call for greater data availability, but they have sometimes done so while creating new barriers to its use: making it difficult to access, or publishing it in formats and encodings that are ultimately unintelligible.

All of the above gives rise to a paradox: there is now more data available than at any other time in history, yet it is frequently published in such diverse and complex ways that establishing connections among datasets is often very difficult, resulting in a new Tower of Babel of data. The open data principles are also relevant here, since they address the accessibility and usability of data. They establish a set of properties that high-quality data should always satisfy (illustrated in the sketch after this list), such as:

  • Completeness and comprehensiveness.
  • Punctuality and timeliness.
  • Comparability and interoperability.
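
To make these principles concrete, here is a minimal Python sketch that flags a dataset violating each of them. The metadata fields, formats and thresholds are hypothetical, loosely inspired by DCAT-style catalog records, not taken from any particular portal:

```python
from datetime import date

# A hypothetical metadata record for a published dataset.
# Field names are illustrative assumptions, not a real schema.
dataset = {
    "title": "Municipal budget 2017",
    "format": "CSV",                  # open, machine-readable format
    "encoding": "UTF-8",
    "last_modified": date(2017, 10, 1),
    "update_frequency_days": 30,      # declared update cadence
    "fields_documented": 12,
    "fields_total": 12,
}

def check_principles(meta, today=date(2017, 11, 9)):
    """Flag violations of the three principles listed above."""
    issues = []
    # Completeness: every field of the dataset should be documented.
    if meta["fields_documented"] < meta["fields_total"]:
        issues.append("incomplete: undocumented fields")
    # Timeliness: data should be fresher than its declared cadence.
    age = (today - meta["last_modified"]).days
    if age > meta["update_frequency_days"]:
        issues.append(f"outdated: last update {age} days ago")
    # Interoperability: prefer open, non-proprietary formats.
    if meta["format"].upper() not in {"CSV", "JSON", "XML", "RDF"}:
        issues.append(f"non-interoperable format: {meta['format']}")
    return issues

print(check_principles(dataset) or "all checks passed")
```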

However, many other aspects also define the (good or bad) quality of data and, although they are not directly contemplated in the open data principles, they should also be taken into account when producing any type of data. One example is the set of inherent data quality characteristics defined by ISO/IEC 25012, which could easily be adapted since, as we can see, they overlap to some extent with the principles above (a rough illustration follows the list):

  • Accuracy: within the range of valid values defined for the domain of application
  • Completeness: providing the corresponding values for all available attributes
  • Consistency: without contradictions and coherent with the other data in context
  • Credibility: both for the data itself and for the information source
  • Currentness: provided at just the right time to maintain its value
  • Accessibility: ease of access in its context
  • Compliance: with respect to current standards and regulations
  • Confidentiality: respecting the privacy and security of the data
  • Efficiency: so that it can be processed with reasonable resources
  • Precision: exact enough for the context it belongs to
  • Traceability: regarding the source or origin of the data
  • Understandability: with adequate coding for its subsequent interpretation
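
As a rough illustration of how a publishing agency might operationalize some of these attributes, here is a minimal Python sketch. The dataset, field names and thresholds are all hypothetical; it applies only three of the ISO/IEC 25012 checks (completeness, accuracy and currentness) to a handful of records:

```python
from datetime import date

# Hypothetical records from an open dataset; field names are assumptions.
rows = [
    {"municipality": "A", "population": 12000, "updated": date(2017, 10, 2)},
    {"municipality": "B", "population": -50,   "updated": date(2015, 1, 10)},
    {"municipality": "C", "population": None,  "updated": date(2017, 9, 30)},
]

def quality_report(rows, today=date(2017, 11, 9), max_age_days=365):
    """Apply a few ISO/IEC 25012 inherent quality checks to each record."""
    report = []
    for r in rows:
        problems = []
        # Completeness: values provided for all available attributes.
        if any(v is None for v in r.values()):
            problems.append("completeness")
        # Accuracy: values within the valid range for the domain.
        if r["population"] is not None and r["population"] < 0:
            problems.append("accuracy")
        # Currentness: recent enough to maintain its value.
        if (today - r["updated"]).days > max_age_days:
            problems.append("currentness")
        if problems:
            report.append((r["municipality"], problems))
    return report

for municipality, problems in quality_report(rows):
    print(municipality, "fails:", ", ".join(problems))
```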

In areas such as statistics, where managing data quality is a long-established culture, there are many references in this regard, such as the manuals produced and published by Eurostat or the United Nations. However, data is now a global asset that is no longer centralized only in statistical agencies, but is instead managed throughout all levels and areas of the administration. Therefore, what we really need is for each and every agency to define its own data quality requirements, or to adopt existing ones such as those proposed by the European Commission's open data project, and to start applying them when publishing open data.