As technology and connectivity have advanced in recent years, we have entered a new era in which data never sleeps and the amount of data circulating is greater than ever. Today, we could say that we live enclosed in a sphere surrounded by data and this has made us more and more dependent on it. On the other hand, we have also gradually become both producers and collectors of data.
The term datasphere has historically been used to define the set of all the information existing in digital spaces, also including other related concepts such as data flows and the platforms involved. But this concept has been developing and gaining more and more relevance in parallel with the growing weight of data in our society today, becoming an important concept in defining the future of the relationship between technology and society.
In the early days of the digital era we could consider that we lived in our own data bubbles that we fed little by little throughout our lives until we ended up totally immersed in the data of the online world, where the distinction between the real and the virtual is increasingly irrelevant. Today we live in a society that is interconnected through data and also through algorithms that link us and establish relationships between us. All that data we share more or less consciously no longer affects only ourselves as individuals, but can also have its effect on the rest of society, even in sometimes totally unpredictable ways - like a digital version of the butterfly effect.
Governance models that treat data and its relationship to people as isolated instances that can be handled individually will therefore no longer serve us well in this new environment.
The need for a systems-based approach to data
Today, that relatively simple concept of the data sphere has evolved into a complete, highly interconnected and complex digital ecosystem - made up of a wide range of data and technologies - that we inhabit and that affects the way we live our lives. It is a system in which data has value only in the context of its relationship with other data, with people and with the rules that govern those relationships.
Effective management of this new ecosystem will therefore require a better understanding of how the different components of the datasphere relate to each other, how data flows through these components, and what rules will be needed to make this interconnected system work.
Data as an active component of the system
In a systems-based approach, data is considered as an active component within the ecosystem. This means that data is no longer just static information, but also has the capacity to influence the functioning of the ecosystem itself and will therefore be an additional component to be considered for the effective management of the ecosystem.
For example, data can be used to fine-tune the functioning of algorithms, improving the accuracy and efficiency of artificial intelligence and machine learning systems. Similarly, it could also be used to adjust the way decisions are made and policies implemented in different sectors, such as healthcare, education and security.
The data sphere and the evolution of data governance
It will therefore be necessary to explore new collective data governance frameworks that consider all elements of the ecosystem in their design, controlling how information is accessed, used and protected across the data sphere.
This could ensure that data is used securely, ethically and responsibly for the whole ecosystem and not just in individual or isolated cases. For example, some of the new data governance tools that have been experimented with for some time now and can help us to manage the data sphere collectively are data commons or digital data assets, data trusts, data cooperatives, data collaboratives or data collaborations, among others.
The future of the data sphere
The data sphere will continue to grow and evolve in the coming years, driven once again by new technological advances and the increasing connectivity and ubiquity of systems. It will be important for governments and organisations to keep abreast of these changes and adapt their data governance and management strategies accordingly through robust regulatory frameworks, accompanied by ethical guidelines and responsible practices that ensure that the benefits that data exploitation promises us can finally be realised while minimising risks.
In order to adequately address these challenges, and thus harness the full potential of the data sphere for positive change and for the common good, it will be essential to move away from thinking of data as something we can treat in isolation and to adopt a systems-based approach that recognises the interconnected nature of data and its impact on society as a whole.
Today, we could consider data spaces, which the European Commission has been developing for some time now as a key part of its new data strategy, as precisely a logical evolution of the data sphere concept adapted to the particular needs of our time and acting on all components of the ecosystem simultaneously: technical, functional, operational, legal and business.
Content prepared by Carlos Iglesias, Open data Researcher and consultant, World Wide Web Foundation.
The contents and views reflected in this publication are the sole responsibility of the author.
The final impact of an open data initiative will ultimately depend on multiple interrelated factors that are present (or absent) in that initiative. For this reason, the GovLab at New York University has analyzed these factors through the use cases collected by its project on open data impact around the world, and has even elaborated a periodic table of the elements that enable impact.
These elements have been classified into five main categories, which are reviewed in the sections below.
Definition of the problem and the associated data demand
Gaining a better understanding in advance of the problems we wish to solve, and of the data needed to solve them, is a logical first step toward the desired impact. The elements that come into play in this category are:
- In-depth analysis of future users and optimization for their needs from the beginning of the project.
- Definition of causes and context, clearly distinguishing between the root causes of the problems we intend to address and the mere symptoms of those problems.
- Refinement through the decomposition of the problem into each of the factors that define it.
- Definition of the expected benefits and objectives, so that their degree of achievement can be measured later.
- Audit of the data necessary to deliver the proposed value proposition and inventory of the data that actually exists.
Capacity and civil and institutional culture
A lack of knowledge, or of the minimum technological and management capacities, could become a barrier that is difficult to overcome when trying to achieve the expected impact. The elements that are part of this category include:
- The minimum elements of hardware and software that constitute the data infrastructures necessary to provide access to data and enable their use.
- Human capital, public services and elements of civil society that constitute the essential public infrastructure to guarantee the availability of data in a healthy ecosystem.
- The level of digital literacy and the degree of internet penetration necessary to take advantage of the available data.
- Cultural or institutional barriers as regards openness that could act as a brake on the publication or expansion of open data.
- The existence of the necessary technical knowledge and skills to take advantage of the data.
- The feedback channels enabled to collect the experiences of the users and final beneficiaries of the data.
- Availability of the necessary resources to guarantee the sustainability and availability in the long term of the data already shared.
Data governance
The diversity among governance models with regard to publication standards and policies is another clearly differentiating variable when talking about impact. The elements that are part of this category include:
- Development of performance metrics that inform the decisions to be taken in open data projects across their different iterations.
- Control of risks that could affect the privacy of the data or sensitive information to prevent unwanted disclosure.
- Open data by default as a guiding principle of the existing strategy and policies to guarantee political commitment at the highest level.
- Free access to information and other policies that work as necessary pillars on which to build open data projects.
- Measures to ensure a minimum quality of the published data, so that they are sufficiently accurate and up to date to be useful.
- Authentic ability to respond to the changing reactions and needs of data users.
Collaboration with other ecosystem agents
Collaboration with all types of organizations and individuals that are part of the data ecosystem plays a fundamental role in a successful open data process. The elements that are part of this category include:
- Close connections with data managers, both public and private, as a strategy to address gaps in the data with their help.
- Domain experts that provide the specific knowledge required when working in specific and well-defined sectors.
- Collaborations with other individuals and organizations aligned with the philosophy of openness.
Risk management
An open data initiative will always be exposed to a certain level of risk that must be identified and adequately addressed. The elements that are part of this category include:
- Privacy problems, for which it will be necessary to guarantee that data are anonymized against the different techniques for identifying individuals.
- Non-intrusive data security techniques to protect sensitive information against unwanted exposure but without compromising the opening up of other data.
- Problems in decision making caused by relying on incorrect or incomplete information.
- Deepening of power asymmetries when some marginalized groups are unable to access data, to the benefit of a privileged minority.
- Use of open data as mere image-washing rather than in pursuit of genuine transformative change.
Although there are obviously other contextual variables that will affect our chances of success in each specific case, working on the elements described above should have a positive effect on the final impact of our open data initiatives.