Interview with Adolfo Antón, data journalist

Date: 26-05-2021

Name: Adolfo Antón

Sector: Society and welfare

Data journalism is one of the disciplines that has grown the most in the media in recent years, both within and outside our borders. This form of journalism consists of collecting accurate data, analysing it and making it available to the public through articles and/or graphic and interactive resources, facilitating the understanding of complex issues. 

At datos.gob.es we have interviewed Adolfo Antón, designer, journalist, trainer and enthusiast of Free Software and open knowledge. Adolfo has been president of Open Knowledge Foundation Spain, coordinator of School of Data Spain, head of the Datalab at Medialab Prado, coordinator of the Data Journalism working group and of the Data Journalism Conference, and curator of the Data Journalism (2014-2019) and Visualizar (2015-2018) workshops. He is currently coordinator of the Master's Degree in Journalism and Data Visualisation at the University of Alcalá and Professor of the Master's Degree in Digital and Data Journalism at the Nebrija University.

Full interview:

1. What does a data journalist do?

Thank you, first of all, for your interest in journalism and data visualisation and for what I can contribute to these fields. I appreciate and welcome the existence of datos.gob.es, an essential project that will soon be ten years old and which, paradoxically, I think is not sufficiently known, used and recognised in these fields. 

To answer your first question, I am going to focus on what you have defined as data journalism, which begins with the collection of accurate data. There are currently many projects in the field of news verification, and I see this as a reaction to an exaggerated proliferation of false or manipulated news, hoaxes, lies and the rest of the flora and fauna of bad practice, not only journalistic but also communicative. The data we work with must be verified, certified, accredited and/or curated, providing context, source and methodology. Otherwise, we will develop flawed analyses and distorted stories.

There is a journalistic saying that goes "don't let a bad piece of information spoil a good headline", and in the face of this temptation strong journalistic ethics are essential in journalists, editors and the media themselves.

Data journalism is essentially the use of computer applications to work with data, whether there is a little, a lot or a great deal of it. Statistics, infographics and data visualisation are also important in data journalism.

With these IT tools, what Paul Bradshaw called the inverted pyramid of data journalism is realised: 

  • Compile (gather, collect).
  • Clean (scrub, dig, investigate, interrogate).
  • Contextualise (data and story context, methodology), and
  • Combine (data, visualisations, infographics, maps, texts, interactives...).

It is therefore necessary to use IT tools and computer languages that cover one, several or all of the tasks in the work process. It is not mandatory, but it is advisable not to be locked into proprietary software, as this will determine the use we make of it. Third-party services can also be of great help, but it is preferable to use your own services, over which you have full control.

2. Why is data journalism important, and can you point us to any success stories or investigations where it has been key?

Data journalism is journalism that investigates with data and is therefore as important, necessary and fundamental as journalism itself, if by this we mean critical and independent journalism, a fourth estate in today's society. Not knowing how to work with data using IT tools keeps us from doing good journalism, be it economic, political or sports journalism. Increasingly, data journalism is no longer the exceptional success story of generalist journalism, but the methodology of journalism in general.

The first success stories in Spain centre on Civio, an organisation that carries out data journalism in projects such as España en llamas, among others. Also relevant are the projects that emerged from the conjunction of three elements that made data journalism grow in Spain:

  • The Data Journalism group at Medialab-Prado, which awakens public interest in this discipline and enables the creation of an incipient community; 
  • The Unidad Editorial/URJC Master's Degree in Investigative Journalism, Data and Visualisation, which trains a first generation of data journalists; 
  • The media that are committed to it to a greater or lesser extent, such as El Español, El Mundo, El Confidencial, RTVE, El Diario de Navarra, eldiario.es or Ara.cat.  

A high point in international data journalism, and also in Spain, was undoubtedly the investigation into the Panama Papers by the International Consortium of Investigative Journalists (ICIJ) in 2016, in which 109 media from 76 countries took part. In Spain, LaSexta and El Confidencial were the participating media; the investigation had a wide impact and led to the resignation of the Minister of Industry, Energy and Tourism.

Unfortunately, the Medialab-Prado Data Lab (the continuation of the data journalism group between 2016 and 2019) no longer exists, nor have all of these media maintained or strengthened their teams. But, in general, the discipline has spread in terms of community, universities and practices. This process has accelerated so much with the COVID-19 crisis that the current period is already considered the second golden age of data visualisation.

3. What challenges does data journalism face today?

It is a difficult question to answer because I believe that, in addition to the traditional challenges of journalism, there are also those produced, as I said at the beginning, by the abundance of false, manipulated and biased news disseminated through social networks, where issues such as ethics, privacy, authorship, anonymity or the automatic, mass-replicated production of content generate a deafening noise. In addition, intensive polarisation is used to collect data on people in order to create consumer profiles. This undermines the rational, reflective, discursive and nuanced process that good journalism can foster.

If I focus on data journalism as a methodology, the main challenge I see is to train journalists in the use of computer applications to work with data and thereby, little by little, improve journalistic output so that good data journalism products are valued by the general public.

4. Is there a firm commitment by the traditional media to data journalism?

The Panama Papers were a hopeful moment for data journalism, and also because a generalist television station committed itself to this discipline. This has not happened in general terms, but it is true that the coronavirus crisis has produced an increase in work involving some data analysis and visualisation, which can be seen on the front pages of media websites, for example. Without an in-depth analysis, I would say that most of these pieces are showcases of easy products rather than complete data journalism works, in the sense that not all the stages of the journalistic project are carried out, only the fragments that cover the demand.

It is worth highlighting the work in data analysis and visualisation being done by El País, RTVE.es and eldiario.es. At the same time, media specialised in news verification such as Newtral and Maldita are constantly producing news with innovative formats that also include data analysis and visualisation. 

On the other hand, there are people who do not work in the media but who have come together since the beginning of the pandemic to work on COVID-19 data in a commendable effort that combines data collection, analysis and visualisation and leaves the work practically ready for the media to take it and finish it off, but that magical connection has not yet been made.

From the experience of Medialab-Prado's data journalism workshops, I would say that working with data takes time and requires professionals, equipment, ideas, etc., but these are not investments beyond the reach of any newsroom, regardless of the size of the media outlet. The fact that such a firm commitment has not been made also leaves the field open for other proposals to position themselves better, as has happened with news verification.

5. Open data is essential for data journalists to have accurate information from official sources. What types of data are most in demand by data journalists for their research?

My impression is that journalists do not normally take advantage of the open data that are available, either because the data are very complex, because using them requires extensive data processing skills or a lot of work, because they are unknown or, finally, because they are not "attractive" or fashionable.

In other words, having an open data portal and an open data publication policy does not ensure that the data will be used. That does not mean that publishing quality open data should not be the default policy of any self-respecting public administration and source of information.

There are many different cases, and citing them all would require a more careful exercise in collecting them. Let us take two examples. INE data, in addition to their complexity (microdata), come in different formats. There are search engines with which to create your own dataset, but their interfaces are very old and not very usable. Another case is the Zaragoza city council data portal. It is one of the best, but it requires registration to work with the API, and the data can be extracted in JSON... I put an ellipsis because, although it is one of the most widely used and manageable data formats, not everyone knows how to use it, as with microdata. Ultimately, not all the problems in data journalism come from the absence of data but also from the formats and the skills needed to handle them.
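To make the point about JSON concrete, here is a minimal Python sketch of how a list of JSON records might be turned into CSV, a format that opens in any spreadsheet. The payload, the field names and the `result` key are invented for illustration; they are not taken from the Zaragoza portal or any other real API, whose responses may be nested or shaped quite differently.

```python
import csv
import io
import json

# Hypothetical payload: field names and figures are invented for this example
# and do not come from any real open data portal.
SAMPLE_JSON = """
{
  "result": [
    {"district": "District A", "year": 2020, "population": 52000},
    {"district": "District B", "year": 2020, "population": 101000}
  ]
}
"""


def json_records_to_csv(raw: str, key: str = "result") -> str:
    """Convert a JSON document holding a list of flat records into CSV text."""
    records = json.loads(raw)[key]
    # Use the keys of the first record as the column headers.
    fieldnames = list(records[0].keys())
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    print(json_records_to_csv(SAMPLE_JSON))
```

The sketch assumes every record is flat and has the same keys; real responses often need an extra step to unpack nested objects before they fit a table.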

In this sense, that of skills, I remember that lately I have seen more than one media outlet embedding visualisations made by third parties. It could be one more example of the niches that data journalism is creating, such as news agencies specialised in data. This need not be a negative thing, but it seems to me that third-party tools are being used in limited formats. At the other extreme is the BBC, which has produced a style guide on how to make graphics with R and created a library so that its graphics have a distinctive style. That's betting on data too.

In the data journalism or visualisation workshops we always found that we lacked the magic dataset to work with; we had to create it. But we also encountered surprises, and I think we certainly don't use most of the available data because we don't know it exists. So, in addition to demanding data, I would push in parallel for learning how to use existing data or create it.

6. How important are visualisations in data reporting? What technologies and tools do you use?

If I go by the usual story, visualisations are used in two main phases:

  • On the one hand, at the data analysis stage. Data is visualised more easily with all kinds of graphical tools or charts that help us to find outliers, patterns, averages, etc. 
  • On the other hand, in the final part of the project, the journalistic product. The visualisation(s) can be just another part or the main piece of the journalistic story.  
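As a rough illustration of the analysis stage, the following Python sketch computes an average and a median, flags outliers and prints a quick text chart. The figures are invented, and the 1.5 × IQR rule is just one common convention for flagging outliers, not a method described in this interview.

```python
import statistics


def summarise(values):
    """Return the mean, the median and the values outside the 1.5 * IQR fences."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "outliers": [v for v in values if v < low or v > high],
    }


def text_bars(values, width=40):
    """Draw one line of '#' per value, scaled so the largest value fills `width`."""
    top = max(values)
    return [f"{v:>4} " + "#" * round(v / top * width) for v in values]


if __name__ == "__main__":
    # Invented weekly counts, used only to show the idea.
    weekly_counts = [12, 15, 14, 13, 16, 15, 14, 90]
    print(summarise(weekly_counts))
    print("\n".join(text_bars(weekly_counts)))
```

Even in this crude chart the last bar stands out at a glance, which is the kind of signal an exploratory visualisation is meant to surface before any story is written.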

Lately I have been trying to explain what data visualisation is in computer terms. On paper, visualisation is done with manual tools and manual or digital printing. But on the screen, on the Web, you can do any visualisation you want! With characters, text, images, video, audio, interaction, etc. If we understand the language and languages of this medium, we will be able to produce, in a more integrated way, data journalism works where any element has a place.

For this we do not need hardware other than what we already have, desktop or laptop computers, but we do need a set of free or open source software tools.

It goes without saying that there is a very wide spectrum of possibilities in the field of proprietary and/or closed-source software, but the use of free or open source software is essential to make a leap in the use of technologies in journalism and data visualisation.

7. You are currently involved in two master's programmes on data journalism. Why should journalists have knowledge of data analysis and visualisation?

In addition to the Medialab-Prado experience, I have received or given courses in media and universities. I have been part of the Master in Data Journalism at Centro Universitario Villanueva in its three editions; I made the teaching guide for two modules of the Master in Data Journalism at UNIR and started teaching, although at that time I was not convinced by online training, possibly due to the use and abuse of proprietary software; I have taught a data module in the Master in Agency Journalism at Agencia EFE-UC3M. Now I am teaching in the Data Journalism and Visualisation module of the Master's in Digital and Data Journalism at Nebrija University where I try to transmit this basic knowledge about journalism and data visualisation. 

I never stop learning and practising every day. I have created this Master's Degree in Journalism and Data Visualisation at the University of Alcalá because I understand that there is no training programme that addresses these issues in a comprehensive way, based on free or open source software, and because since I first got involved in this world I have seen that data analysis and visualisation are essential for data journalism, but they have not been addressed in this way in the different university programmes.

It is true that, from the beginning, I have also heard or read that data journalism is collaboration, that there are many profiles in the newsroom, that one person can't have them all, and that the virtue lies in cooperation. That is true, but in order to cooperate you have to know how to cooperate on the one hand and know what you want to cooperate about on the other. In classical journalism, cooperation is commonplace - let's hope it is not lost - so all that is missing is the skills. Training almost always tends to cater for different profiles, so you also need to have an overview: to know what others do, what might interest you, and which strengths you should develop or compensate for. And then, with a good base and with practice, you can apply one skill or another in one role or another.