Technology

The technology behind datos.gob.es is based on an open source architecture that combines several components to meet the functional needs of the project.

The datos.gob.es platform consists primarily of a content management tool, a multi-language module that provides symmetrical navigation across all languages, a data catalog manager, an aggregation engine that integrates the data catalogs of all publishing entities, a Linked Data API solution and a SPARQL query point.

Architecture diagram

Below is a high-level architecture diagram and a brief description of each of the layers.

[Figure: architecture diagram]

  1. Reverse proxy: Varnish routes requests to the four services of the project: content portal, data catalog, SPARQL query point and Linked Data API.
  2. Content portal: developed on the content management software Drupal; manages the editorial content of the project.
  3. Data catalog: implemented on CKAN; holds the single metadata repository and integrates the aggregator, which synchronizes the catalogs of all publishing entities.
  4. Persistence layer: data repository for the content portal and data catalog; implemented on MySQL and PostgreSQL.
  5. Indexing engine: implemented on Apache Solr; provides search capabilities for the content portal as well as the data catalog.
  6. SPARQL query point: implemented on Virtuoso by OpenLink Software, used as the RDF triple store; it stores the complete graph of the data catalog and exposes a SPARQL query point.
  7. Linked Data API: developed on ELDA by Epimorphics; provides a RESTful API, acting as a gateway to the SPARQL query point.
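
As an illustration of how a client might address two of these services, the sketch below builds request URLs for the SPARQL query point (using the `query` parameter defined by the SPARQL Protocol) and for the data catalog (using CKAN's `package_search` action). The base URL, both path prefixes and the use of the DCAT vocabulary in the catalog graph are assumptions made for the example, not values stated on this page. The code only builds the URLs and sends nothing.

```python
from urllib.parse import urlencode

# Assumed values for illustration only; the real host and paths may differ.
BASE_URL = "https://datos.gob.es"
SPARQL_PATH = "/virtuoso/sparql"   # hypothetical path to the SPARQL query point
CKAN_API_PATH = "/api/3/action"    # CKAN's usual Action API prefix; mount point assumed

# List a few datasets, assuming the catalog graph is described with DCAT.
DATASET_QUERY = """\
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?dataset ?title
WHERE { ?dataset a dcat:Dataset ; dct:title ?title . }
LIMIT 5"""


def sparql_url(query: str) -> str:
    """Build a SPARQL Protocol GET URL that carries the query text."""
    return f"{BASE_URL}{SPARQL_PATH}?{urlencode({'query': query})}"


def ckan_search_url(text: str, rows: int = 5) -> str:
    """Build a CKAN package_search URL for a free-text dataset search."""
    params = urlencode({"q": text, "rows": rows})
    return f"{BASE_URL}{CKAN_API_PATH}/package_search?{params}"


if __name__ == "__main__":
    print(sparql_url(DATASET_QUERY))
    print(ckan_search_url("transporte"))
```

Keeping the host and path prefixes in constants mirrors the role of the reverse proxy: clients see a single host, and only the path decides which backend service answers.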

Source code

The solutions implemented for the two most complex technical requirements are:

  • Aggregator: developed as a CKAN extension, using the ckanext-harvest aggregation engine as a reference.
  • Multi-language: a solution for Drupal that replicates the navigation architecture in every language and allows symmetrical navigation between them, ensuring compliance with level AA accessibility.
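
The synchronization performed by an aggregator can be pictured with a minimal sketch. It is not the code of the project's CKAN extension or of ckanext-harvest; the `DatasetRecord` type, its fields and the `aggregate` function are hypothetical names chosen for the example. It assumes each publishing entity exposes records with an identifier and a modification date, and that the central catalog keeps the most recent version of each record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Minimal dataset metadata; the field names are hypothetical."""
    publisher: str
    identifier: str
    title: str
    modified: str  # ISO 8601 date, so plain string comparison orders correctly


def aggregate(central: dict, harvested: list) -> dict:
    """Merge harvested records into a copy of the central catalog.

    A record is added if it is new, and replaces the stored one only
    when its modification date is more recent.
    """
    merged = dict(central)
    for record in harvested:
        # Identifiers are only unique within one publisher's catalog.
        key = (record.publisher, record.identifier)
        current = merged.get(key)
        if current is None or record.modified > current.modified:
            merged[key] = record
    return merged


if __name__ == "__main__":
    catalog = aggregate({}, [
        DatasetRecord("ministry-a", "ds-1", "Budget 2014", "2014-01-10"),
        DatasetRecord("city-b", "ds-1", "Bus stops", "2014-02-01"),
    ])
    catalog = aggregate(catalog, [
        DatasetRecord("ministry-a", "ds-1", "Budget 2014 (revised)", "2014-03-05"),
    ])
    for record in catalog.values():
        print(record.publisher, record.identifier, record.title)
```

Keying records by publisher and identifier together lets two entities reuse the same local identifier without overwriting each other in the central catalog.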

The source code for the project is available on GitHub. All extensions developed for CKAN have been published, along with the Drupal contrib modules, features and theme.

For any questions regarding the platform’s technology, please contact us using the query point.