Most data spaces start with a connector implementation derived from an existing one, but then add their own “secret sauce” to customize it. Peter talks about the pain of missing interoperability: “We saw an entire zoo of different connectors and connector projects emerge. And these projects were partially not interoperable with each other within the same data space.” So how can data space interoperability be ensured?
Three guiding principles for data spaces
Peter describes the objectives of the working groups in creating a protocol for data spaces; this required establishing a set of guiding principles as a foundation. The protocol’s design decisions aim to adhere to these principles.
First, the data sovereignty of all participants is of the utmost importance. The protocol’s design must enable participants to act with choice and to have agency and autonomy over their actions in the data space. That is the absolute guiding principle of the protocol. Individual connector implementations may deviate from these principles in places, but in the protocol itself, enabling the data sovereignty of a participant must be the highest good.
Second, a data space is a context of trust; it is not a physical environment or a marketplace. For the protocol, a data space is primarily a space to establish trust between two sovereign participants.
Third, the rules for a data space are like a lawbook provided by a government: parts of the protocol likewise need to be provided in a specific implementation by a Dataspace Authority. Certain capabilities and services of the Dataspace Authority may therefore need to be provided centrally. This authority should act as a governance body that defines the laws and rules of the data space without interfering with the data sovereignty of the participants.
Why the Dataspace Protocol is needed
There are many data space projects with very different requirements in Europe but also in Asia, the Americas, and beyond. “This is truly an international phenomenon, and there is a need for trusted data sharing to generate value from data,” Peter states. “That is why we want to build a protocol that has global validity.”
Various types and architectures of data spaces can be seen worldwide. Many people talk about big data spaces like Catena-X or the European Health Data Space, but there is also a need for smaller, agile ones. Data spaces vary widely: some may operate for decades with thousands of participants, while others may exist for only a day with a small number of actors. Some have a centralized structure built around an organization or a government body; others are decentralized, adhering to a common rulebook but not bound to a central association. The protocol should ensure a minimum viable interoperability between these different frameworks, products, and services.
Two interoperability models
Another point to consider is that there are two interoperability models: intra-data space interoperability, between different connectors from different participants within one data space, and inter-data space interoperability, between data spaces themselves.
The latter absolutely requires protocol-based interoperability at the connector level, because a data space does not act on behalf of its participants; it enables them to trust each other and to share data with each other. If a participant needs interoperability between two data spaces, that participant must actually participate in both. This would become extremely complicated if those data spaces used different protocols, connectors, and architectures. The protocol itself is therefore the basis for a much broader and more complex discussion on how data spaces can be made interoperable with each other.
Interoperability standards
Peter said, “We also looked at the existing interoperability standards.” One of them is ISO/IEC 19941 (Cloud Computing Interoperability and Portability), which is referenced in the EU Data Act. The other is the European Interoperability Framework (EIF), which aims to create a digital single market in Europe by improving interoperability for data sharing, boosting trust and security, and encouraging investment. The two models are almost identical.
Their structures are similar: both have a ‘legal’/‘policies’ layer; the ‘technical’ layer of the EIF is divided into a ‘syntactic’ and a ‘transport’ layer in the ISO standard; the ‘organizational’ layer of the EIF is called ‘behavior’ in the ISO model; and both have a ‘semantics’ layer.
The Dataspace Protocol is aimed at the technical layer: am I technically able to communicate with the other party? In the ISO model, the technical layer is split into two, a transport and a syntactic layer: do I have the same transport protocol, and do I have the same syntactic model on top of it? Peter states, “We are referring to the EIF model here because it bundles the two into one technical layer, and that is where we see the Dataspace Protocol.”
Layers of interoperability
At IDSA, work is mapped to these interoperability layers. For the legal layer, the questions to be answered are: Do I have legal building blocks I could use for data contract policies and usage policies? Are contractual statements legally equivalent? These questions are worked on in the IDSA Taskforce Legal. The organizational layer is about business procedures: How does a participant join a data space? What are the responsibilities of the Dataspace Authority within the data space? What is the process of agreeing on trust? These topics are worked on in the IDSA Working Group Rulebook.
At the semantic layer, the question is whether the policies and attributes used to establish trust in a data space are interoperable between two Dataspace Authorities. This also covers the semantic models used in those data spaces: looking at Catena-X and SCSN, for example, there are industrial semantic models that need to be interoperable between the two.
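To make the policy side of these layers concrete, the following is a minimal sketch of a machine-readable usage policy in the W3C ODRL vocabulary, which the Dataspace Protocol reuses for contract offers. The policy ID, asset identifier, and constraint values here are hypothetical placeholders, not taken from any real data space.

```java
// Minimal sketch of an ODRL usage policy as it might appear in a contract
// offer. The vocabulary (permission, action, constraint) is the W3C ODRL
// model; all identifiers and values below are hypothetical.
public class UsagePolicySketch {
    public static void main(String[] args) {
        String policy = """
            {
              "@context": "http://www.w3.org/ns/odrl.jsonld",
              "@type": "Offer",
              "uid": "urn:policy:example:42",
              "permission": [{
                "target": "urn:asset:example:sensor-data",
                "action": "use",
                "constraint": [{
                  "leftOperand": "purpose",
                  "operator": "eq",
                  "rightOperand": "research"
                }]
              }]
            }
            """;
        System.out.println(policy);
    }
}
```

Whether two data spaces interpret `purpose` or `research` the same way is exactly the semantic-layer question described above: the syntax can match while the meaning still diverges.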
Minimum standard of communication
As noted above, the layer addressed by the protocol is the technical layer: how do we communicate with different connectors from different frameworks and services? “That is where the Dataspace Protocol is created, to define the minimum standard of communication so that everybody who is proficient in the Dataspace Protocol is able to communicate with other connectors, even if those other connectors are adding features, semantic models, or business procedures,” Peter emphasizes. “The Dataspace Protocol will always be the minimum layer for data discovery and data contract negotiation.”
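As an illustration of that minimum layer, here is a sketch of the data-discovery step: a consumer connector asking a provider connector for its catalog with a Dataspace Protocol message. The provider URL and bearer token are hypothetical placeholders, and the exact context version and endpoint path should be checked against the DSP specification release in use.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch of a DSP catalog request: the consumer POSTs a JSON-LD
// CatalogRequestMessage to the provider's catalog endpoint and receives
// a DCAT catalog of data offers in return.
public class CatalogRequestSketch {
    public static void main(String[] args) throws Exception {
        String message = """
            {
              "@context": "https://w3id.org/dspace/v0.8/context.json",
              "@type": "dspace:CatalogRequestMessage"
            }
            """;

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://provider.example.com/catalog/request")) // hypothetical provider
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer <token>") // trust is established separately
                .POST(HttpRequest.BodyPublishers.ofString(message))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // a dcat:Catalog in JSON-LD
    }
}
```

Any connector that speaks this exchange can discover offers from any other, regardless of what features or semantic models each side layers on top.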
Peter also contributes to an open source project, the Eclipse Dataspace Components (EDC), which aims to provide a framework that acts as a reference implementation of the protocol. Products and services built on top of the EDC framework thereby implement the Dataspace Protocol correctly and are compatible with each other out of the box. It is a great achievement and a big step towards more and easier data sharing for many.
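For a sense of what building on EDC looks like, here is a minimal launcher sketch following the pattern used in the EDC samples. It assumes the BaseRuntime entry point from EDC’s boot module; package names and the extension modules placed on the classpath vary between EDC versions.

```java
import org.eclipse.edc.boot.system.runtime.BaseRuntime;

// Minimal connector launcher, as in the EDC samples: BaseRuntime discovers
// the extensions available on the classpath (the Dataspace Protocol endpoints
// among them, if the corresponding modules are included) and wires them up.
public class ConnectorRuntime {
    public static void main(String[] args) {
        BaseRuntime.main(args);
    }
}
```

The design intent matches what the article describes: the wire protocol comes from the framework, so product teams add their own features as extensions without breaking interoperability.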