Publication date 14/10/2016
Description

The European Commission has taken an important step by completing the DCAT Application Profile (DCAT-AP), a joint initiative of the ISA Programme, the EU Publications Office and DG CONNECT. This specification extends the W3C Data Catalog Vocabulary (DCAT) and defines the rules for applying it to the description of public sector datasets in Europe.

The goal of DCAT is quite simple and, for that reason, it has been warmly received by the open data community since W3C published the specification as a W3C Recommendation in January 2014. The aim is to provide an RDF vocabulary (a set of classes and properties) for describing, in a structured manner, the content of datasets and data catalogues on the Web. For example, imagine an organization that wishes to publish a set of CSV files containing economic indicators on a given topic. With DCAT, this organization can provide machine-processable descriptions (in RDF) that present the collection as a catalogue (dcat:Catalog) in which each file is a dataset (dcat:Dataset) whose topic is identified by concepts (skos:Concept) from a controlled vocabulary, a thesaurus, a taxonomy, etc.
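
The scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every IRI under http://example.org/, the titles and the file name are invented, and the triples are written out by hand as N-Triples so that no RDF library is needed. Note that DCAT usually models the downloadable file itself as a dcat:Distribution attached to the dataset, which the sketch follows.

```python
# Namespace IRIs of the vocabularies used in the description.
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical publisher namespace: every IRI below it is invented.
BASE = "http://example.org/"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text, lang="en"):
    """Build a language-tagged literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def describe_catalogue():
    """Return the triples describing one catalogue with one CSV dataset."""
    catalogue = iri(BASE + "catalogue")
    dataset = iri(BASE + "dataset/unemployment")
    distribution = iri(BASE + "dataset/unemployment/csv")
    theme = iri(BASE + "themes/economy")
    rdf_type = iri(RDF + "type")
    return [
        (catalogue, rdf_type, iri(DCAT + "Catalog")),
        (catalogue, iri(DCT + "title"), literal("Economic indicators")),
        (catalogue, iri(DCAT + "dataset"), dataset),
        (dataset, rdf_type, iri(DCAT + "Dataset")),
        (dataset, iri(DCT + "title"), literal("Unemployment rate")),
        (dataset, iri(DCAT + "theme"), theme),
        (dataset, iri(DCAT + "distribution"), distribution),
        (theme, rdf_type, iri(SKOS + "Concept")),
        (distribution, rdf_type, iri(DCAT + "Distribution")),
        (distribution, iri(DCAT + "downloadURL"), iri(BASE + "files/unemployment.csv")),
    ]


def to_ntriples(triples):
    """Serialise (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(describe_catalogue()))
```

In practice a publisher would more likely use an RDF library and a richer serialisation such as Turtle or RDF/XML; the hand-written form is used here only to keep the classes and properties visible.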

DCAT-AP’s main contributions can be summarised as follows:

  1. It does not introduce a new vocabulary. On the contrary, its aim is to define precisely how certain DCAT classes and properties are to be used for publishing data in the European Union. Where catalogues and datasets need descriptions that DCAT does not cover, other existing vocabularies are reused (for example, foaf:Document identifies the web portals where catalogues are published).
  2. It defines a complete policy for the use of DCAT-AP, specifying which classes and properties are compulsory, recommended or optional in the application of the vocabulary within the European Union.
  3. It establishes regulatory principles of conformance for publishing and using DCAT-AP documents (section 6 of the specification).
  4. It explains how to use controlled vocabularies (in SKOS) to describe the topic of datasets, as in the example above; of particular importance is the explicit recommendation to reuse European vocabularies such as EuroVoc. This opens the door to potential applications of DCAT-AP in the field of public procurement.
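
The distinction between compulsory, recommended and optional properties in point 2 lends itself to a simple automated check. The following is a minimal sketch, not the conformance procedure defined by the specification: the property lists are an assumption based on a reading of DCAT-AP v1.1 and should be verified against the specification itself, and the function name, the dictionary layout and the sample dataset are hypothetical.

```python
# Requirement levels for a dataset description. These lists are an assumption
# based on DCAT-AP v1.1; verify them against the specification before use.
DATASET_PROFILE = {
    "mandatory": ["dct:title", "dct:description"],
    "recommended": [
        "dcat:contactPoint",
        "dcat:distribution",
        "dcat:keyword",
        "dct:publisher",
        "dcat:theme",
    ],
}


def check_dataset(description):
    """Return (errors, warnings) for one dataset description.

    `description` maps a prefixed property name to a list of values.
    A missing mandatory property is an error; a missing recommended
    property is only a warning. Properties with empty lists count as missing.
    """
    present = {prop for prop, values in description.items() if values}
    errors = [p for p in DATASET_PROFILE["mandatory"] if p not in present]
    warnings = [p for p in DATASET_PROFILE["recommended"] if p not in present]
    return errors, warnings


if __name__ == "__main__":
    # Hypothetical dataset description with an empty keyword list.
    sample = {
        "dct:title": ["Unemployment rate"],
        "dcat:theme": ["http://example.org/themes/economy"],
        "dcat:keyword": [],
    }
    errors, warnings = check_dataset(sample)
    print("Missing mandatory properties:", errors)
    print("Missing recommended properties:", warnings)
```

Optional properties need no check of this kind, since their absence does not affect conformance.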

Additionally, certain new developments in DCAT-AP are taking place:

  1. An extension of DCAT-AP for the exchange of descriptions of geospatial datasets and services: GeoDCAT-AP. The working group has already published a first version (v1.0), still in the working draft phase. Its main aim is to provide an RDF syntax that combines the metadata frameworks of the INSPIRE initiative and ISO 19115:2003, in accordance with the conformance principles laid down by DCAT-AP.
  2. An extension of DCAT-AP for publishing statistical datasets: StatDCAT-AP. A working group was recently created for this purpose. In this first phase, the work will concentrate on identifying the significant metadata shared by the different portals that publish statistical data, such as Eurostat. By building on these similarities, StatDCAT-AP aims to play the same role for the RDF Data Cube vocabulary that the SDMX/EMS metadata framework plays for the SDMX specification.

Finally, the European Commission has opened a line of work to define guidelines for DCAT-AP implementation. European organizations are invited to participate and to share real cases of DCAT-AP application, as well as the problems and challenges they faced during implementation. This new specification and its early adoption by the European Data Portal project are a good sign for the open data sector. The specification (v1.1) is published within the JoinUp project.