At the Data Spaces Discovery Day in Barcelona, Mercè Crosas vividly described what it takes to create a functional, just, and successful world of data sharing. She named three prerequisites to consider when sharing data among different stakeholders.
1. The importance of co-creation within data spaces
Mercè Crosas emphasized that the COVID crisis taught us a lot about co-creation. A common purpose, the goal of finding solutions together, is the strongest driver for collaboration. Joint innovation efforts by research, industry, governments and, very importantly, civil society produce the best results. Government involvement as a network builder can help speed up solutions by linking research and industry. Additionally, user engagement can provide important feedback.
2. The importance of FAIR data
The FAIR guiding principles for data (findable, accessible, interoperable, and reusable) are central to enabling secure and effective sharing. Findable means that the data can be found by humans and by machines: a globally unique identifier is assigned so that the data can be searched for. To be accessible, digital resources must be retrievable via open and free protocols. The hardest principle, and also the most important one, is interoperability. It allows data sets to be merged, resulting in richer information, and it requires standards to make this possible. Finally, we all want data to be reusable as widely and as often as possible to create more value. This depends on clear answers to two questions: what are the terms for accessing and using the data, and who should be credited? It is important to recognize all individual contributions, as this provides an additional incentive for sharing and collaboration.
How do we implement these principles when building data spaces? The FAIR principles mirror the efforts of the IDSA to ensure sovereign data sharing in a trusted environment.
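One way to make the four principles tangible is to check a dataset's metadata against them. The following is only a minimal sketch under stated assumptions: the `DatasetRecord` class, its field names, and the `fair_gaps` function are hypothetical illustrations, not part of any FAIR or IDSA specification, and treating HTTP(S) as the only open protocol is a simplification.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical metadata record; all field names are illustrative."""
    identifier: str = ""         # findable: globally unique identifier
    access_url: str = ""         # accessible: where the data can be retrieved
    metadata_standard: str = ""  # interoperable: shared standard or vocabulary
    license: str = ""            # reusable: terms of access and use
    contributors: list = field(default_factory=list)  # reusable: who to credit


# Assumption for this sketch: HTTP(S) stands in for "open and free protocols".
OPEN_PROTOCOL_PREFIXES = ("https://", "http://")


def fair_gaps(record):
    """Return a list of FAIR principles the record does not yet satisfy."""
    gaps = []
    if not record.identifier:
        gaps.append("findable: no globally unique identifier")
    if not record.access_url.startswith(OPEN_PROTOCOL_PREFIXES):
        gaps.append("accessible: not retrievable via an open protocol")
    if not record.metadata_standard:
        gaps.append("interoperable: no metadata standard declared")
    if not record.license:
        gaps.append("reusable: no terms of use")
    if not record.contributors:
        gaps.append("reusable: no contributors to credit")
    return gaps


# Example record with a placeholder identifier and URL.
example = DatasetRecord(
    identifier="doi:10.1234/example",
    access_url="https://example.org/data.csv",
    license="CC-BY-4.0",
    contributors=["A. Researcher"],
)
print(fair_gaps(example))
```

In this example only the interoperability check fails, because no metadata standard is declared. That matches the point made above: identifiers and licenses are comparatively easy to supply, while agreeing on shared standards is the hard part.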
3. The importance of providing levels of openness
How can we make data more accessible while still protecting private and sensitive information? Defining different degrees of openness for datasets helps organize the various requirements. These range from completely open (public; just download what you want, no registration or tracking), to mainly open (registration required only to understand who is using the data), to different levels of restriction (for sensitive and protected data). Researchers at Harvard proposed and developed a set of six data tags that specify security features and access requirements, with increasing levels of restriction to support security for sensitive data where necessary.
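The idea of graded openness can be sketched as an ordered set of levels with an access check. This is an illustrative sketch only: it collapses the spectrum into the three broad levels named above, it does not reproduce the six Harvard data tags, and the names `Openness` and `can_access` as well as the `registered` and `approved` flags are assumptions made for the example.

```python
from enum import IntEnum


class Openness(IntEnum):
    """Hypothetical levels, ordered from least to most restricted."""
    OPEN = 0        # public: download freely, no registration or tracking
    REGISTERED = 1  # mainly open: registration required
    RESTRICTED = 2  # sensitive or protected data: explicit approval needed


def can_access(level, *, registered=False, approved=False):
    """Decide whether a user may access a dataset at the given level."""
    if level == Openness.OPEN:
        return True
    if level == Openness.REGISTERED:
        return registered
    # RESTRICTED: assume the user must be both registered and approved.
    return registered and approved


print(can_access(Openness.OPEN))
print(can_access(Openness.REGISTERED, registered=True))
print(can_access(Openness.RESTRICTED, registered=True))
```

Using an ordered enumeration means levels can be compared directly (for example `Openness.REGISTERED < Openness.RESTRICTED`), which would make it straightforward to add finer-grained levels between them.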
Let’s summarize: Today’s complex problems cannot be solved alone. We want to maximize collaboration and co-creation while still incentivizing competition among the individual contributors. We want to maximize data sharing and openness while respecting privacy and ethical considerations. We want to help make data usable for AI and machine learning while ensuring that it cannot be used in a wrong or biased way.