Tools for the Management of Controlled Vocabularies with SKOS Support

News date: 19-10-2017

controlled vocabularies, SKOS

Several tools exist for organizing (classifying, describing, indexing) knowledge. Ordered from the simplest (least formalized, with the fewest rules) to the most complex (most formalized, with the most rules), they are: controlled vocabularies, taxonomies, thesauri and ontologies.

For the three simpler options (controlled vocabulary, taxonomy and thesaurus), the W3C has developed a model for representing their basic structure and content: SKOS, the Simple Knowledge Organization System. Because it is based on RDF (Resource Description Framework), SKOS allows concepts to be created and published on the Web, linked with other concept schemes or other data, and reused by third parties, which is driving its rapid adoption by the Linked Open Data community. SKOS-XL was later defined as an extension of SKOS to support the description of the lexical entities linked to concepts; in other words, with SKOS-XL “more can be said about a lexical label”. For example, in plain SKOS the bicycle concept (ex:Bicycle) has a literal as its lexical label (ex:Bicycle skos:prefLabel “bicycle”@en). With SKOS-XL, the label can instead be a resource identified by a URI (ex:Bicycle skosxl:prefLabel ex:Bicycle_Label_EN), which allows the label itself to be described in more detail, indicating for example:

  • Date of issue (ex:Bicycle_Label_EN dct:issued 2013-12-15)
  • Creator of the label (ex:Bicycle_Label_EN dc:creator ex:ConBici_Association)
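
The difference between a plain SKOS label and a SKOS-XL label can be sketched in a few lines of Python. This is only an illustration, not a real RDF implementation: triples are held in a plain set, the `ex:` namespace (`http://example.org/`), the `Literal` helper and the functions `objects` and `describe_xl_labels` are assumptions made for the example, and a real project would more likely use an RDF library.

```python
from typing import NamedTuple, Optional

# Namespace URIs; EX is a hypothetical namespace used only for illustration.
SKOS = "http://www.w3.org/2004/02/skos/core#"
SKOSXL = "http://www.w3.org/2008/05/skos-xl#"
DCT = "http://purl.org/dc/terms/"
DC = "http://purl.org/dc/elements/1.1/"
EX = "http://example.org/"


class Literal(NamedTuple):
    """A literal value with an optional language tag or datatype."""
    value: str
    lang: Optional[str] = None
    datatype: Optional[str] = None


# Resources are plain URI strings; literals are Literal tuples.
triples = {
    # Plain SKOS: the preferred label is a literal.
    (EX + "Bicycle", SKOS + "prefLabel", Literal("bicycle", lang="en")),
    # SKOS-XL: the preferred label is a resource that can itself be described.
    (EX + "Bicycle", SKOSXL + "prefLabel", EX + "Bicycle_Label_EN"),
    (EX + "Bicycle_Label_EN", SKOSXL + "literalForm", Literal("bicycle", lang="en")),
    (EX + "Bicycle_Label_EN", DCT + "issued", Literal("2013-12-15", datatype="xsd:date")),
    (EX + "Bicycle_Label_EN", DC + "creator", EX + "ConBici_Association"),
}


def objects(subject, predicate):
    """Return every object of the triples matching subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def describe_xl_labels(concept):
    """Collect the literal form and metadata of each SKOS-XL preferred label."""
    return [
        {
            "label_uri": label,
            "literal_form": objects(label, SKOSXL + "literalForm"),
            "issued": objects(label, DCT + "issued"),
            "creator": objects(label, DC + "creator"),
        }
        for label in objects(concept, SKOSXL + "prefLabel")
    ]
```

Calling `describe_xl_labels(EX + "Bicycle")` follows the extra hop from the concept to its label resource, which is where the issue date and creator are attached; the plain `skos:prefLabel` literal has no place to carry such metadata.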

In this context, we list some of the tools for managing controlled vocabularies, taxonomies and thesauri that support SKOS (tools for visualization and quality control have been left out of this review).

iQvoc is a SKOS(-XL) vocabulary manager, available as open source (Apache License, Version 2.0). Its main features are its ease of use and the ability to increase its functionality through extensions (for example, to support SKOS-XL).

SKOSEd is an open source plugin for managing thesauri within the well-known Protégé ontology editor (available as desktop software). Because it runs within Protégé, it can be combined with external reasoners that check whether the SKOS model is consistent.

ThManager is a multiplatform desktop tool (Unix, Windows) for creating and visualizing SKOS thesauri. It is open source, released under the GNU Lesser General Public License (LGPL), and developed by the University of Zaragoza. Judging by the dates of its latest changes, its development seems to have been discontinued.

PoolParty is a proprietary web-based thesaurus manager with SKOS support. It includes text analysis and data linking functionalities, and it can be extended through add-ons, such as those providing SKOS-XL support or workflow management and concept approval based on user roles. It also supports publishing thesauri as LOD, with dereferenceable URIs or a SPARQL endpoint for queries.

TemaTres is an open source web tool, distributed under the GNU General Public License (GPL), for the management and exploitation of controlled vocabularies, thesauri, taxonomies and other models of formal knowledge representation. Internally it uses a term-based model rather than a concept-based one, which causes some confusion (all the more so when the model is exported to SKOS). Its multilingual support for thesauri is also somewhat confusing. On the other hand, it has features specifically aimed at providing traceability data and quality control for the models created. It is an easy tool to use and a possible starting point.

TopBraid Enterprise Vocabulary Net (TopBraid EVN) is a proprietary web platform for the creation and management of semantic structures (including vocabularies, taxonomies, thesauri and ontologies). Its version control and traceability system is noteworthy, as is its native SKOS support.

VocBench is an open source web platform (under the Mozilla Public License, MPL) for the collaborative editing of multilingual SKOS(-XL) thesauri. It is developed mainly by the University of Rome Tor Vergata and is closely related to the AGROVOC thesaurus, whose management was the reason VocBench was created. It is possibly one of the most complete tools available, with native SKOS(-XL) support, role and workflow management, and multilingualism. A completely new version, VocBench 3, is expected in the coming weeks; a few hints about it have already been given in a webinar by the Agricultural Information Management Standards (AIMS) portal of the FAO.

In the following table, we gather up a few answers for each of the reviewed tools.

When deciding between one system and another, you should ask a series of questions about needs and sustainability:

  • What is your budget?
  • Do you have systems staff who can deploy a platform, or do you have to opt for a cloud service?
  • Does the tool have technical and non-technical support? Is it reliably maintained?
  • Do you want to use open source tools and have your work be based on Open movements?
  • Do you only need a platform for the internal management of the knowledge organization tool or do you also need one to visualize it externally?
  • For how long will you need to manage the knowledge organization tool? Could it be fully built within a defined timeframe, after which it would only need to be published (that is, a knowledge organization tool that does not change once created)?
  • Do you need to control the versions and/or workflows for the changes made to the knowledge organization tool?
  • Do you need to store “extra” information (SKOS-XL) on the lexical labels associated with the concepts?
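
Some of these questions can be turned into a simple shortlist filter. The sketch below is only illustrative: the `Tool` record and the `shortlist` function are hypothetical, and the data is limited to what this review states about each tool. A value of `None` means the review does not say, and the filter conservatively treats it as "not confirmed".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tool:
    name: str
    open_source: bool
    skos_xl: Optional[str]     # "native", "extension", or None if not stated
    workflows: Optional[bool]  # None if the review does not mention workflows


# Facts taken only from the tool descriptions in this review.
TOOLS = [
    Tool("iQvoc", True, "extension", None),
    Tool("SKOSEd", True, None, None),
    Tool("ThManager", True, None, None),
    Tool("PoolParty", False, "extension", True),
    Tool("TemaTres", True, None, None),
    Tool("TopBraid EVN", False, None, None),
    Tool("VocBench", True, "native", True),
]


def shortlist(tools, need_open_source=False, need_skos_xl=False, need_workflows=False):
    """Keep the tools for which every required feature is explicitly confirmed."""
    selected = []
    for tool in tools:
        if need_open_source and not tool.open_source:
            continue
        if need_skos_xl and tool.skos_xl is None:
            continue
        if need_workflows and not tool.workflows:
            continue
        selected.append(tool.name)
    return selected


# Example: open source tools with confirmed SKOS-XL support.
candidates = shortlist(TOOLS, need_open_source=True, need_skos_xl=True)
```

A tool excluded by this filter is not necessarily unsuitable; it only means the feature is not confirmed here, so the remaining questions (budget, support, deployment) still need to be answered case by case.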

To conclude, when choosing a tool for managing a controlled vocabulary, no tool is better or worse “per se”; each case is unique. Following the KISS principle (Keep It Simple, Stupid!), you should select the simplest tool that meets your needs and thus avoid unnecessary complexity.