Formulas for accelerating data collaboration
News date: 24-11-2021

After a period in which efforts were focused on releasing data, mainly from the public sector, in conditions in which it could be reused to create value in its different forms (economic, social, cultural, etc.), we are finding increasing activity around collaboration between organizations to solve big problems using data. There is not yet a consensus around these diverse initiatives, but we are seeing the popularization of a number of concepts among which there are many similarities and which seek to characterize a reality that has been identified as being of great importance for the development of the data strategies of governments and institutions.
For example, the European Commission has recently introduced, along with the concepts of data spaces and data intermediaries, the concept of "data altruism organisation recognised in the Union" in its proposed Data Governance Act. This type of entity, still to be developed, would be tied to the mechanisms envisaged to regulate the altruistic sharing of data, and such organisations could register voluntarily as a way to strengthen user trust.
The Digital Spain 2025 plan, although it does not give them a specific name, includes among its objectives for the Data Economy and Artificial Intelligence the creation of strong collaboration mechanisms between the public and private sectors, public promotion of data sharing, and the development of lighthouse projects that use both public and private data for the common good. These objectives form part of the measures intended to make Spain a reference in the transformation towards a Data Economy.
Data management organizations
Although the term may be new, and has yet to be consolidated, "data altruism organizations" have existed in the public and private sectors and in the third sector for a long time. In the absence of a better name, in this article we will refer to them as "data institutions", defined as "organisations that steward data on behalf of others, often towards public, educational or charitable aims". This definition is the one proposed by the Open Data Institute (ODI), which is currently leading the most important efforts to characterize these types of organizations and establish a common framework for talking about them.
In the definition, the term "organization" could be interpreted in different ways: foundation, association, institution, public body or similar, since there are multiple formulas that can be adopted depending on the country and legal framework. The notion of stewarding, or managing, data refers in turn to the activities of collecting, maintaining and sharing data and, of course, to determining who has access to the data, how the data is accessed, for what purpose and for whose benefit. Data management organizations would therefore sit in the first of the three main activities that create value from raw data, which according to the ODI are:
- Managing data: collecting, maintaining and sharing data, i.e., the activities of creating datasets, storing, curating or enriching them, and managing governance and access to them.
- Creating information from that data, in the form of products and services, analysis and discovery, or stories and visualizations. Here we enter the first layers of data analytics, that is, exploratory, descriptive and diagnostic analytics of datasets.
- Deciding what to do: making decisions based on that information, together with experience and one's own understanding of the context. In other words, what other frameworks characterize as the most complex and valuable phases of data analytics, i.e. predictive and prescriptive analytics.
Data collaboratives
GovLab, for its part, defines the better-known concept of "data collaboratives" as a set of new forms of collaboration, beyond the classic public-private partnership model, in which participants from different sectors (companies, research institutions or government agencies) share their data to solve problems of public interest. GovLab also aims to accelerate the creation and use of data collaboratives in order to harness the potential of data to improve people's lives. An example of such a collaborative is Global Fishing Watch, in which Google, Oceana and SkyTruth pool data, efforts and resources with the goal of stopping illegal fishing by tracking the movement of more than 35,000 boats.
Some models of collaboration and governance
Regardless of the different nuances in the definitions, and without wishing to be exhaustive, we can identify some patterns in the way in which the different collaboration formulas interact with each other, with people and their rights over data. These patterns do not refer to the business model, but rather to the different ways in which data governance is established:
- Individuals contribute their data to the organization and, on a case-by-case basis, individuals can choose to allow third parties to access that data. An example of this model is HealthBank, which allows individuals to upload their medical records to share them with doctors or "loved ones", or the information banks in Japan that aim to allow users themselves to monetize their data.
- Individuals contribute their data to the organization and, on a case-by-case basis, individuals can choose whether that data is shared with third parties as part of aggregated datasets. Another variant is where decisions about which third parties can access the data are made collectively. A good example of the latter case would be The Good Data, a kind of cooperative that aims to sell web browsing data generated by its users, who can also take part in deciding the rules.
- The organization provides a platform to collect or create new datasets with the voluntary work of people. In this category we find some well-established platforms such as OpenStreetMap, which collaboratively maintains free maps of the world, or Wikipedia.
- The organization combines or links data from multiple sources and provides information and other services to those who have contributed data. In the maritime sector, HiLo aggregates data generated by about 3,500 ships worldwide to generate risk and safety analyses related to maritime accidents.
- The organization acts as a custodian of data held by other organizations. For example, Harvard University's Social Science One seeks to liberate data generated by Facebook for the public good by making it available for new social science research.
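To make the first two patterns more concrete, the following is a minimal sketch in Python of how a data institution might record individual consent. It is an illustration only and does not describe how any of the organizations named above actually work: the class `DataInstitution`, its methods and the sample names and values are all hypothetical. It covers case-by-case grants to named third parties and opt-in to aggregated use.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataInstitution:
    """Hypothetical steward holding records on behalf of individuals."""

    records: dict[str, dict] = field(default_factory=dict)    # owner -> record
    grants: dict[str, set[str]] = field(default_factory=dict)  # owner -> allowed third parties
    aggregate_opt_in: set[str] = field(default_factory=set)    # owners allowing aggregate use

    def contribute(self, owner: str, record: dict) -> None:
        """An individual hands a record over to the institution."""
        self.records[owner] = record

    def grant_access(self, owner: str, third_party: str) -> None:
        """Pattern 1: the individual approves one named third party."""
        self.grants.setdefault(owner, set()).add(third_party)

    def read(self, owner: str, third_party: str) -> dict:
        """Return the raw record only if the owner granted access."""
        if third_party not in self.grants.get(owner, set()):
            raise PermissionError(f"{third_party} has no grant from {owner}")
        return self.records[owner]

    def opt_in_aggregate(self, owner: str) -> None:
        """Pattern 2: the individual allows use in aggregated datasets."""
        self.aggregate_opt_in.add(owner)

    def aggregate_mean(self, key: str) -> float:
        """Mean of a numeric field across opted-in records only."""
        values = [
            self.records[o][key]
            for o in self.aggregate_opt_in
            if key in self.records.get(o, {})
        ]
        if not values:
            raise ValueError("no opted-in records contain this field")
        return sum(values) / len(values)


# Illustrative usage with made-up participants and values.
inst = DataInstitution()
inst.contribute("alice", {"steps": 8000})
inst.contribute("bob", {"steps": 4000})
inst.grant_access("alice", "dr_lee")   # alice shares her raw record with one doctor
inst.opt_in_aggregate("alice")
inst.opt_in_aggregate("bob")           # bob shares only through aggregates
print(inst.read("alice", "dr_lee"))
print(inst.aggregate_mean("steps"))
```

In this sketch the third party never sees Bob's individual record, only the aggregate. A real institution would also need revocation, audit trails and safeguards against re-identification from small aggregates, none of which are modelled here.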
Although the boundaries between the different approaches are in many cases unclear and some of the models may overlap, what seems clear is that new formulas are being explored to allow data to be shared in more flexible and innovative ways, respecting individual autonomy and generating wider societal benefits. This is excellent news at a time when the world's largest companies by market capitalization are technology companies whose services are based on the collection, use and sharing of data. Thanks to data management organizations, it is possible to move towards a future in which data contributes more decisively to solving humanity's great challenges and not just to creating private value.
Content prepared by Jose Luis Marín, Senior Consultant in Data, Strategy, Innovation & Digitalization.
The contents and views reflected in this publication are the sole responsibility of the author.