Open data in real time

News date: 21-05-2019

Our world is dynamic. Things happen when they happen. Our life is in real time and the digital traces that we leave in the form of data are also in real time.

Introduction

We commonly describe real-time data as data that are updated or changed at a relatively high frequency. The term is more contentious than it seems: there is no exact, unambiguous definition of what real time means, and quantifying it depends entirely on the use case. For example, VISA has a processing capacity of more than 24,000 transactions per second and registers an average of 150 million transactions per day. A real-time data processing system for VISA is therefore one capable of reacting to changes at a rate of 24 transactions per millisecond. By contrast, a real-time system for a non-critical weather forecast application (as opposed to applications that do depend critically on weather forecasting, such as air traffic control or sports competitions like F1) updates its data every 15 to 60 minutes. In short, the definition of real time depends on the use case or application that consumes the data.
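To make the contrast concrete, the following minimal Python sketch converts the figures quoted above into comparable units. The function names are illustrative, and the only inputs are the 24,000 transactions per second and the 15 to 60 minute refresh interval mentioned in the text.

```python
def transactions_per_millisecond(tps: float) -> float:
    """Convert a rate in transactions per second to transactions per millisecond."""
    return tps / 1_000


def transactions_in_interval(tps: int, minutes: int) -> int:
    """Transactions that could be processed at `tps` during `minutes` minutes."""
    return tps * minutes * 60


PEAK_TPS = 24_000  # processing capacity quoted in the text

print(transactions_per_millisecond(PEAK_TPS))  # 24.0

# How many card transactions fit into one refresh of the weather application?
for refresh_minutes in (15, 60):
    capacity = transactions_in_interval(PEAK_TPS, refresh_minutes)
    print(f"{refresh_minutes} min refresh: up to {capacity:,} transactions")
```

At that capacity, a single 15-minute weather refresh spans the time needed for up to 21.6 million card transactions, which is why both systems can reasonably be called real time only relative to their own use case.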

Let's look at two practical examples of querying and consuming real-time data published in open data repositories.

Example of real-time data consumption. Bike-sharing system of Santander city.

In the open data repository datos.gob.es we find many open and reusable datasets. Some of them have a high refresh rate and can be considered real-time data for the purpose of this example.

In this case, we will look at data on the status of the public bicycle rental service in the city of Santander. A user might want to consult this data through a mobile or web application to find a free bike nearby. The dataset is updated every 2 minutes. Using the search engine of the datos.gob.es catalogue, we obtain the two datasets needed to develop an application that shows the availability of free bicycles in real time.

Once we are ready to access these two datasets, we can use our favourite programming language (in my case R) to read and process the data, and combine both datasets into one that gives the number of free bicycles at each station, identified by its geographical location within the city. Superimposing this data on a map, which we can get from any of the free and open services available on the Internet, we obtain our map.
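The combination step can be sketched as follows. The author worked in R; this is a Python illustration only, and the field names (`id`, `name`, `lat`, `lon`, `free_bikes`) and sample records are hypothetical placeholders, not the real schema or values of the Santander datasets.

```python
# Hypothetical sample records: names, coordinates and counts are made up
# for illustration and do not come from the published datasets.
stations = [
    {"id": "1", "name": "Station A", "lat": 43.46, "lon": -3.81},
    {"id": "2", "name": "Station B", "lat": 43.47, "lon": -3.80},
    {"id": "3", "name": "Station C", "lat": 43.45, "lon": -3.82},
]
availability = [
    {"id": "1", "free_bikes": 3},
    {"id": "2", "free_bikes": 12},
]


def merge_station_data(stations, availability):
    """Attach the free-bike count to each station, matching on the station id.

    Stations without an availability record are left out (an inner join).
    """
    free_by_id = {row["id"]: row["free_bikes"] for row in availability}
    return [
        {**station, "free_bikes": free_by_id[station["id"]]}
        for station in stations
        if station["id"] in free_by_id
    ]


merged = merge_station_data(stations, availability)
for row in merged:
    print(row["name"], row["lat"], row["lon"], row["free_bikes"])
```

Each merged record carries both a location and a count, which is what a mapping library needs to place and colour a marker.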

The colour scale indicates the number of free bicycles at each station. Red means there are 5 or fewer free bicycles at that station, orange means between 6 and 10, and green means more than 10.
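The colour scale described above amounts to a simple threshold function. A minimal Python sketch, with a hypothetical function name:

```python
def station_colour(free_bikes: int) -> str:
    """Map a free-bike count to the colour used on the map."""
    if free_bikes <= 5:
        return "red"
    if free_bikes <= 10:
        return "orange"
    return "green"


for count in (0, 5, 6, 10, 11):
    print(count, station_colour(count))
```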

Each time this simple program is executed, we obtain the most recent information published by the City Council of Santander.
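Since the dataset is refreshed every 2 minutes, an application could simply re-run the download on that schedule. The sketch below is one possible way to do this in Python. `poll` and its parameters are hypothetical, and `fetch` stands in for whatever function actually downloads the data; it is not part of the author's program.

```python
import time
from typing import Callable, Iterator

REFRESH_SECONDS = 120  # update frequency stated for the dataset (2 minutes)


def poll(
    fetch: Callable[[], object],
    rounds: int,
    interval: float = REFRESH_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[object]:
    """Call `fetch` `rounds` times, waiting `interval` seconds between calls."""
    for i in range(rounds):
        yield fetch()
        if i < rounds - 1:  # no need to wait after the last round
            sleep(interval)


# Usage with a stand-in fetch function and no real waiting:
snapshots = list(poll(lambda: "latest data", rounds=2, sleep=lambda s: None))
print(snapshots)
```

Passing `sleep` as a parameter keeps the loop easy to try out without waiting; polling more often than the publisher updates would only return the same data.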

Example of international real-time data consumption. Air Quality in Europe.

The European Environment Agency (EEA) makes its entire environmental database available to users. In addition to publishing the raw data in different formats, the Agency also provides graphical tools for visualization and analysis. The following example is an application that shows air quality monitoring stations on a map. The colour scale shows the concentration of a pollutant in the air, in this case PM10 (particulate matter less than 10 micrometers in diameter), in µg/m³. The data is updated hourly, although publication can lag by up to several hours.

https://www.eea.europa.eu/data-and-maps/explore-interactive-maps/up-to-date-air-quality-data#tab-based-on-data
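Because publication can lag behind the hourly update, a consumer of this data may want to check how old a reading is before presenting it as current. A minimal Python sketch, assuming each reading carries a timezone-aware timestamp; the function names are hypothetical and are not part of any EEA tool.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

EXPECTED_INTERVAL = timedelta(hours=1)  # update frequency stated in the text


def reading_age(measured_at: datetime, now: Optional[datetime] = None) -> timedelta:
    """Return how long ago a reading was taken (timezone-aware datetimes)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - measured_at


def is_delayed(measured_at: datetime, now: Optional[datetime] = None) -> bool:
    """True if the reading is older than the expected update interval."""
    return reading_age(measured_at, now) > EXPECTED_INTERVAL


# Example with fixed, made-up timestamps:
now = datetime(2019, 5, 21, 12, 0, tzinfo=timezone.utc)
reading = datetime(2019, 5, 21, 9, 0, tzinfo=timezone.utc)
print(reading_age(reading, now), is_delayed(reading, now))
```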

The applications that use real-time data are countless and keep growing as society, public bodies and companies digitize their processes. The digital skills needed to query, extract and analyse data in real time are among the most important. From scientists to journalists, artists and technologists, none of these professions can do without specialists in these skills on their teams.

 


Content prepared by Alejandro Alija, expert in Digital Transformation and innovation.

Contents and points of view expressed in this publication are the exclusive responsibility of its author.