In the data economy, a common language is crucial: participants in a data space need a standardized semantic framework so that they can interpret each other’s data consistently. However, this can be tricky, because different organizations and systems often use their own terms and structures to describe data. That’s why semantic interoperability is an essential requirement for federated data networks such as International Data Spaces (IDS).
Semantic interoperability refers to the ability to share data in a way that ensures mutual understanding and clarity of the meaning of that data, essentially preserving data semantics consistently across disparate systems. In other words, it’s about making sure that data from one IT system can be interpreted accurately and consistently by another IT system, even if they were developed independently and might use different technical implementations.
A dictionary for the data economy
Data becomes meaningful when it is described with common sets of generally accepted, standardized terms, so-called vocabularies, which are derived from standardized ontologies. Vocabularies help establish a common language that everyone within the data ecosystem can understand. Like dictionaries, vocabularies help eliminate ambiguity and ensure that the information conveyed remains consistent across different systems and contexts.
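To make the dictionary idea concrete, here is a minimal sketch in Python of two independently built systems that map their local field names onto one shared vocabulary. The `ex:` term identifiers, the field names, and the helper `to_shared_terms` are all hypothetical and chosen only for illustration; they are not taken from any IDS specification.

```python
# Hypothetical shared vocabulary: term identifier -> human-readable definition.
SHARED_VOCABULARY = {
    "ex:temperatureCelsius": "Temperature measured in degrees Celsius",
    "ex:measuredAt": "Timestamp at which the measurement was taken",
}

# Each system keeps its own field names and maps them to the shared terms.
SYSTEM_A_MAPPING = {"temp_c": "ex:temperatureCelsius", "ts": "ex:measuredAt"}
SYSTEM_B_MAPPING = {"Temperatur": "ex:temperatureCelsius", "Zeitstempel": "ex:measuredAt"}


def to_shared_terms(record, mapping):
    """Rename the local field names of a record to shared vocabulary terms."""
    unknown = [name for name in record if name not in mapping]
    if unknown:
        # A field without a vocabulary term has no agreed meaning.
        raise KeyError(f"No vocabulary term for local fields: {unknown}")
    return {mapping[name]: value for name, value in record.items()}


record_a = to_shared_terms(
    {"temp_c": 21.5, "ts": "2024-01-01T12:00:00Z"}, SYSTEM_A_MAPPING
)
record_b = to_shared_terms(
    {"Temperatur": 21.5, "Zeitstempel": "2024-01-01T12:00:00Z"}, SYSTEM_B_MAPPING
)
```

After the mapping, both records use the same keys, so a consumer needs to understand only the shared vocabulary and not the naming habits of each source system.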
The importance of semantics in data spaces is clearly reflected in the work of IDSA. According to the IDS RAM (Reference Architecture Model), the main responsibility for this common language lies with an intermediary role called a vocabulary provider. This role manages and offers vocabularies such as ontologies, data models, and schemata. Within IDS, vocabularies must be machine-readable, and so, to some extent, must their titles and descriptions. New terms should also be easy to find through search.
Standardization across the ecosystem
The IDS Information Model acts as the universal language that all IDS components share. It defines common concepts, data structures, and standards that enable interoperability and data exchange among different components and systems within the IDS framework. The information model provides a standardized way for IDS components to communicate and understand the data being exchanged, promoting consistency and compatibility across the ecosystem. Its primary objective is to facilitate the description, publication, and discovery of data products and data processing software within the IDS.
However, the IDS Information Model covers only the fundamental terminology applicable to all IDS use cases. Specific domains require more detailed, context-specific terms, which leads to the use of domain-specific vocabularies. A recommended practice is to extend the basic information model with supplementary vocabularies, which are made available in the same way as the core model.
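The extension pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the actual IDS Information Model: the `core:` and `mobility:` prefixes, the term names, and the functions `extend_model` and `ancestors` are hypothetical. The idea shown is that domain terms are added next to the core terms without redefining them, and each domain term points back to a term that already exists.

```python
# Hypothetical core model: term -> label and parent term (None for root terms).
CORE_MODEL = {
    "core:Resource": {"label": "Resource", "parent": None},
    "core:DataApp": {"label": "Data App", "parent": None},
}

# Hypothetical domain vocabulary that specializes a core term.
DOMAIN_VOCABULARY = {
    "mobility:TrafficFeed": {"label": "Traffic Feed", "parent": "core:Resource"},
}


def extend_model(core, domain):
    """Return a combined model; the core terms themselves stay untouched."""
    overlap = set(core) & set(domain)
    if overlap:
        raise ValueError(f"Domain vocabulary redefines core terms: {sorted(overlap)}")
    combined = {**core, **domain}
    for term, entry in domain.items():
        parent = entry["parent"]
        if parent is not None and parent not in combined:
            raise ValueError(f"{term} refers to unknown parent {parent}")
    return combined


def ancestors(model, term):
    """List the parent chain of a term, nearest parent first."""
    chain = []
    parent = model[term]["parent"]
    while parent is not None:
        chain.append(parent)
        parent = model[parent]["parent"]
    return chain


model = extend_model(CORE_MODEL, DOMAIN_VOCABULARY)
```

Because every domain term resolves to a core term through `ancestors`, a component that knows only the core model can still treat a `mobility:TrafficFeed` as a generic resource.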
The vocabulary hub is a collaborative space
This requires a dedicated service, the IDS vocabulary hub. The service hosts, maintains, publishes, and documents additional vocabularies. It provides IDS-compliant endpoints that allow seamless communication with data connectors and infrastructure components. Vocabulary hubs give access to defined terms, their descriptions, their changes, and their various versions. They function as management platforms for the data schemas used in IDS scenarios.
The vocabulary hub serves as a collaborative space for vocabulary users and providers, fostering semantic interoperability. It enables users to learn about available vocabularies and how to use them, to edit or customize their own vocabulary based on an existing (standardized) vocabulary, and to publish these customized vocabularies so that others can reuse them. Vocabulary providers receive support during implementation, for example in the form of documentation, feedback from users, and help with organizing the maintenance and further development of their vocabularies.
To register vocabularies in the IDS vocabulary hub, the data provider must first create a self-description. Once the data provider has published the vocabulary, it becomes part of the IDS vocabulary hub after it has been validated and its quality has been checked. Vocabularies are the result of parties working together and reaching a shared understanding. This is an ongoing process that requires an open community to ensure these technical artifacts become and remain fit for their purpose.
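The registration steps above (create a self-description, publish, validate, check quality) can be sketched as a small in-memory model. Everything here is an assumption made for illustration: the `SelfDescription` fields, the `VocabularyHub` class, and the specific checks are hypothetical and do not reproduce the real IDS vocabulary hub interface or its actual validation rules.

```python
from dataclasses import dataclass, field


@dataclass
class SelfDescription:
    """Hypothetical self-description that a provider submits with a vocabulary."""
    title: str
    description: str
    version: str
    terms: dict  # term identifier -> definition text


@dataclass
class VocabularyHub:
    """Hypothetical hub that accepts a vocabulary only after both checks pass."""
    registry: dict = field(default_factory=dict)

    def validate(self, sd):
        # Structural validation: the required metadata must be present.
        problems = []
        if not sd.title.strip():
            problems.append("missing title")
        if not sd.description.strip():
            problems.append("missing description")
        if not sd.terms:
            problems.append("no terms defined")
        return problems

    def check_quality(self, sd):
        # Quality check (assumed rule): every term needs a definition.
        return [
            f"term '{term}' has no definition"
            for term, definition in sd.terms.items()
            if not definition.strip()
        ]

    def register(self, sd):
        """Return a list of problems; an empty list means it was registered."""
        problems = self.validate(sd) + self.check_quality(sd)
        if not problems:
            self.registry[(sd.title, sd.version)] = sd
        return problems
```

Keying the registry by title and version is one possible design choice; it lets several versions of the same vocabulary coexist, in line with the hub giving access to the various versions of a vocabulary.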
Why is semantic interoperability so important?
At the technical level, semantic interoperability makes data easier to understand. At the organizational level, this understanding increases the data’s value and quality. It also speeds up innovation: data-driven solutions can be developed more efficiently.
However, achieving semantic interoperability is not easy. How difficult it is depends on organizational processes, the market a company operates in, and the products and services it offers. The organizational context is shaped by the legal context, by the rules and regulations that organizations need to adhere to, and by the relevant concepts and terms in the information flow. For semantic interoperability, it helps if an organization participates in data spaces that operate in similar domains and share a common legal context.
By using vocabularies, data can be (re-)used more efficiently in various applications and scenarios. It becomes easier to integrate and analyze data from different sources, which is crucial in today’s data-driven world.
When the meaning of shared data is unclear, it can lead to misinterpretation, resulting in errors and poor decisions. By leveraging standardized terms and the power of technology, the IDS ecosystem advances toward a future of effective data sharing and comprehensive understanding.