How to implement CKAN: real case of the Aragon Open Data portal
Document date: 25-11-2020

Aragon Open Data is one of the most active open data initiatives on the Spanish scene. In addition to implementing, managing and maintaining an interoperable data catalogue, Aragon Open Data has worked since its inception to bring the culture of open data closer to citizens, companies and all types of organisations. These initiatives include services that offer the data and make it simple to reuse, such as Aragopedia, Open Social Data or the recent Aragón Open Data Focus (more information available in this interview).
Given the expertise the team has built up, it is not surprising that they have begun to produce educational materials and technical articles explaining how they have deployed different solutions to meet the need to locate, access and reuse the different datasets.
One such resource is presented below. It explains how they have implemented the CKAN software solution to improve the availability of data on the portal.
CKAN as an open data management software solution in a real case for the Aragon Open Data portal
CKAN is a free, open-source platform developed by the Open Knowledge Foundation for publishing and cataloguing data collections. Due to its free and open nature, as well as its rapid implementation, it has become a worldwide reference for the opening of data.
Since its birth in 2012, Aragon Open Data has bet on CKAN technology for the management of its open data system. The document "CKAN, cornerstone for the management of an open data system" shows us how its architecture works and serves as an example for other initiatives that want to implement a platform of this type.
The document describes the challenges they encountered when migrating the original platform to a higher version and how they solved them by building a client application. This process resulted in the current architecture of the portal, which is shown in the figure below:
The CKAN backend is developed entirely in Python and has its own JavaScript front end. It allows the deployment of a layer of services that can be managed through an API, as well as the use of base plugins or extensions that add functionality to the platform. CKAN is backed by a PostgreSQL database, which stores the datasets it hosts, their resources and the other metadata required for the platform to operate. It also uses Solr, a search engine that speeds up locating and retrieving the datasets.
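As an illustration of the API layer mentioned above, the following minimal sketch queries the `package_search` action of CKAN's Action API, which is the call that relies on the Solr index. The base URL `https://ckan.example.org` and the helper names are assumptions made for this example and do not come from the Aragon Open Data document.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_search_url(base_url, query, rows=5):
    """Build the URL of a CKAN Action API dataset search (package_search)."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def extract_titles(payload):
    """Return the dataset titles from a decoded package_search response."""
    if not payload.get("success"):
        raise RuntimeError("CKAN reported an unsuccessful action call")
    # Fall back to the dataset's URL name when no title is set.
    return [d.get("title") or d.get("name", "") for d in payload["result"]["results"]]


def search_titles(base_url, query, rows=5):
    """Run the search against a live CKAN instance (needs network access)."""
    with urlopen(build_search_url(base_url, query, rows), timeout=10) as response:
        return extract_titles(json.load(response))


if __name__ == "__main__":
    # Hypothetical base URL; replace it with a real CKAN instance to try this.
    print(build_search_url("https://ckan.example.org", "presupuestos"))
```

Separating URL construction and response parsing from the network call keeps the first two functions usable offline, for instance against a saved JSON response.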
In addition to explaining this architecture, the document discusses the functionalities and extensions used in the customised CKAN instance. It also describes how the components integrated into the platform (Angular, NodeJS, PostgreSQL and Solr) coexist to provide the datasets that underpin open data services and solutions such as Presupuestos de Aragón or the already mentioned Aragón Open Data Focus.
CKAN incorporates an extension that supports RDF data serialisation. In addition to exposing linked data in formats such as RDF/XML or Turtle, the extension is used to federate datasets whose metadata follow the DCAT specification. This makes CKAN a more versatile and appropriate platform for publishing Linked Data, something that Aragon Open Data has also done, as we can see in this other document.
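To give an idea of what a DCAT description serialised as Turtle looks like, here is a minimal hand-written sketch using only the Python standard library. It is not the code of the CKAN extension. The function names, the dataset URI and the sample values are hypothetical, and only a few DCAT and Dublin Core properties are covered.

```python
def turtle_literal(text):
    """Quote a string as a Turtle literal, escaping the characters that need it."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def dataset_to_turtle(uri, title, description, keywords=()):
    """Serialise basic dataset metadata as a dcat:Dataset in Turtle syntax."""
    statements = [
        "a dcat:Dataset",
        f"dct:title {turtle_literal(title)}",
        f"dct:description {turtle_literal(description)}",
    ]
    if keywords:
        # Several objects for the same predicate are separated by commas.
        joined = ", ".join(turtle_literal(k) for k in keywords)
        statements.append(f"dcat:keyword {joined}")
    body = " ;\n    ".join(statements)
    return (
        "@prefix dcat: <http://www.w3.org/ns/dcat#> .\n"
        "@prefix dct: <http://purl.org/dc/terms/> .\n\n"
        f"<{uri}>\n    {body} .\n"
    )


if __name__ == "__main__":
    # Hypothetical dataset, used only to show the shape of the output.
    print(dataset_to_turtle(
        "https://ckan.example.org/dataset/sample",
        "Sample dataset",
        "An illustrative dataset description.",
        keywords=["budget", "open data"],
    ))
```

In a real deployment the serialisation would be produced by an RDF library or by the CKAN extension itself. The sketch only shows the kind of metadata that a DCAT-based federation exchanges.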
You can download the document "CKAN as an open data management software solution in a real case for the Aragon Open Data portal" below (only available in Spanish). You can also complement your reading with these two additional articles:
- Automatic collection of open data: Explains how Aragon Open Data federates open data using a CKAN plugin.
- ELK architecture as an open data tool: Explains how Aragon Open Data uses an architecture based on the ELK technology stack (Elasticsearch, Logstash and Kibana).
Documentation
- CKAN as an open data management software solution in a real case for the Aragón Open Data portal.pdf (371.52 KB)