Interview with Emilio López Cano, R Hispano
Date: 28-04-2021
Name: Emilio López Cano
Sector: Science and technology
Organization, institution or company: R Hispano

R Hispano is a community of users and developers that was born in 2011, during the III Conference of R Users, with the aim of advancing knowledge and use of the R programming language. At datos.gob.es we spoke with them to learn more about the activities they carry out and the role open data plays in them.
Full interview.
1. Can you briefly explain what the R-Hispano Community is?
It is an association created in Spain whose objective is to promote the use of R among a Spanish-speaking audience. There are many R users worldwide, and we try to serve as a meeting point for all those whose primary language is Spanish. Having a smaller group as a reference within such a large community makes it easier to build relationships and to meet people you can turn to when you want to learn more or share what you have learned.
2. R was born as a language linked to the statistical exploitation of data, but it has become an essential tool in data science. Why has the community embraced this language so widely?
It's true that many data science and engineering professionals tend to use more general-purpose languages such as Python. However, there are several reasons why R is essential in the stack of teams that work with data. First, R has its origin in the S language, which was designed in the 1970s at Bell Laboratories specifically for data analysis. This allows people with different computing backgrounds to participate in complex projects while focusing on the analysis methods. Second, R has aged very well, and a broad community of users, developers, and companies contributes packages and tools that quickly extend its functionality to the most innovative methods with (relative) simplicity and rigor.
3. R Hispano works through numerous local initiatives. What advantages does this form of organization bring?
In day-to-day activities, especially when we still held face-to-face meetings more than a year ago, it is easier to coordinate people who are as close to each other as possible. It makes no sense for a person in Madrid to organize monthly meetings in Malaga, Seville or the Canary Islands. What makes these events worthwhile is attending regularly, getting to know the attendees, and understanding what the public demands and what can be offered. That takes care and dedication, and it also requires being nearby, because otherwise there is no way to establish that bond. That is why we felt this relationship has to be maintained day to day from the cities themselves. It is also the way the R community has organized itself around the world, with the success that we all know.
4. Do you consider open data initiatives a valuable source of information for the development of your projects? Any notable reuse examples? What aspects of the current initiatives do you consider could be improved?
The first thing to say is that R Hispano as such has no projects. However, many R Hispano members work with open data in their professional field, be it academic or business. It is, of course, a very valuable source of information, with many examples: analysis of data from the pandemic that we are still suffering, data from sports competitions and athletes' performance, environmental data, socio-economic data, and more. We cannot single out any of them, because there are many interesting ones that would deserve it equally. As for improvements, many public data repositories still do not publish their data in a format that analysts can readily process. A PDF report can be open data, but it certainly does not help with its dissemination, analysis, and exploitation for the good of society.
5. Can you tell us about some of the activities carried out by these local initiatives?
Several local R groups, both in Spain and Latin America, recently collaborated with the technology training company UTad on the event “Encounters in the R phase”, held online over two days. The R user conference that we hold each year is usually organized by one of the local groups at its own venue. The Córdoba group is organizing the next one, which was postponed due to the pandemic and for which we hope to announce dates soon.
The Madrid R User Group began to function as a local group linked to the R Hispano Community more than fifteen years ago. Since its origin, it has held monthly meetings announced on the social network Meetup (sponsored by the R Consortium, an entity founded and funded by large companies to promote the use of R). The activity has been interrupted by the COVID-19 restrictions, but the full history of the presentations has been compiled in this portal.
The R Canarias group has been involved in the TabularConf conference, which took place online on January 30, with an agenda of a dozen presentations on data science and artificial intelligence. In the past, the Canarian group held an R user meeting with talks on various topics, including modeling, geographic data processing, and queries to public data APIs, such as datos.gob.es, with the library opendataes. Other libraries presented at a meetup they held in 2020 are istacr and inebaseR, always with a focus on access to public data.
In the Seville local group, the hackathons held in recent years have led to the development of several packages closely linked to open data:
- Air: To get air quality data in Andalusia (works, but needs some adjustments).
- Aemet: R package to interact with the AEMET API (climatic data). We took the first steps in a hackathon, then Manuel Pizarro made a fully functional package.
- Andaclima: Package to obtain climatic data from the agroclimatic stations of the Junta de Andalucía.
- Data.gob.es.r: An embryonic package to interact with http://datos.gob.es. Really just an exploration of ideas, nothing functional for now.
Regarding COVID-19, it is worth highlighting this COVID-19 analysis dashboard, developed by the UCLM with the collaboration of a former member of the Board of Directors of the R Hispano Community. It uses the cases reported by municipality by the regional government of Castilla-La Mancha (Junta de Comunidades de Castilla-La Mancha). It is an interactive tool for consulting incidence figures and rates per 100,000 inhabitants.
6. In addition, you also collaborate with other groups and initiatives.
Yes, we collaborate with other groups and initiatives focused on data, such as the UNED (Faculty of Sciences), which for a long time hosted us as our permanent headquarters. I would also highlight our activities with:
- Data Journalism Group. Joint presentations with the Data Journalism group, sharing the benefits of R for their analysis.
- A collaboration with the Machine Learning Spain group that resulted in a joint presentation at the Google Campus in Madrid.
- With groups of other data languages, such as Python.
- Collaborations with companies. At this point we highlight having participated in two Advanced Analytics events organized by Microsoft, as well as having received small financial aid from companies such as Kabel or Kernel Analytics (recently acquired by Boston Consulting Group).
These are some examples of presentations in the Madrid group based on open data:
- Madrid Air Quality Analysis (first and second version)
- ENRON data analysis
In addition, various R-Hispano members also collaborate with academic institutions, where they teach courses related to data analysis and especially promote the use and analysis of open data. These include the Faculty of Economics of the UNED, the Faculties of Statistics and of Tourism and Commerce of the UCM, the University of Castilla-La Mancha, the EOI (with a specific subject on open data), the Francisco de Vitoria University, the Higher School of Telecommunications Engineering, ESIC and K-School.
Finally, we would like to highlight the constant link we maintain with relevant entities of the R ecosystem: R-Consortium (https://www.r-consortium.org/) and RStudio (https://rstudio.com/). It is through the R-Consortium that the Madrid group has been recognized as a stable group, and it also sponsors our Meetup fees. At RStudio we maintain various contacts that have also allowed us to obtain sponsorships for the R Conference, as well as speakers of the stature of Javier Luraschi (author of the package and book on “sparklyr”) or Max Kuhn (author of packages such as "caret" and its evolution "tidymodels").
7. Through ROpenSpain, some R Hispano members have collaborated in the creation of R packages that facilitate the use of open data.
ROpenSpain is a community of R, open data and reproducibility enthusiasts who come together and organize to create high-quality R packages for the exploitation of Spanish data of general interest. Inspired by ROpenSci, it was born in February 2018 as an organization on GitHub and has a collaboration channel on Slack. As of January 2021, ROpenSpain brings together the following R packages:
- opendataes: Easily interact with the datos.gob.es API, which provides data from public administrations throughout Spain.
- MicroDatosEs: Imports various types of INE microdata files into R: EPA, Census, etc.
- caRtociudad: Queries the Cartociudad API, which provides geolocation services, routes, maps, etc.
- Siane: To represent statistical information on the maps of the National Geographic Institute.
- airqualityES: Air quality data in Spain from 2011 to 2018.
- mapSpain: To load maps of municipalities, provinces and Autonomous Communities. Includes a plugin for leaflet.
- MorbiditySpainR: Read and manipulate data from the Hospital Morbidity Survey.
- spanish: For the processing of certain types of Spanish information: numbers, cadastral geocoding, etc.
- BOE: For the processing of the Official State Gazette and the Official Gazette of the Mercantile Registry.
- istacbaser: To consult the API of the Canary Institute of Statistics.
- CatastRo: Queries the API of the Spanish Cadastre (Catastro).
Some of these packages have been featured at events organized by the R Hispano Community.
8. Finally, how can interested people follow R-Hispano and collaborate with you?
An important element linking the entire community of Spanish-speaking R users is the R-Help-es help list:
- Search: https://r-help-es.r-project.narkive.com/
- Subscription: https://stat.ethz.ch/mailman/listinfo/r-help-es
It is one of the few active R-Help lists independent of the main English-language R-Help, and it has generated more than 12,800 entries in its more than 12 years of history.
In addition, we maintain a high level of activity on social networks, which serve as a loudspeaker for announcing future events and news about data of interest to the community.
We can highlight the following initiatives on each platform:
- Twitter: The R-Hispano association has its own account (https://twitter.com/R_Hisp), and R contributors from across the country take part in the hashtag #rstatsES (R in Spanish).
- LinkedIn: On this professional network, R-Hispano has a presence through the company page https://www.linkedin.com/company/comunidad-r-hispano/. In addition, many R-Hispano members from both Spain and Latin America are part of this network, sharing open resources.
- Telegram channel (https://t.me/rhispano): A Telegram channel where news of interest to the community is shared periodically.
Finally, on the association's website, http://r-es.org, you can find information about the association, as well as how to become a member (the fee is, like R, free).