FAIR Digital Objects for the Realization of Globally Aligned Data Spaces

Nicolas Blumenroehr, Philipp-Joachim Ost, Felix Kraus, Achim Streit

TL;DR

The paper tackles interoperability barriers in global data spaces by introducing FAIR Digital Objects (FDOs) and a formal data model that binds each FDO to a Kernel Information Profile (KIP) within a data-space-agnostic framework. It defines core concepts such as information records, mandatory kernel attributes, and a PID-backed entity-relationship graph to enable machine-actionable decisions without altering existing data spaces. Through two cross-domain use cases (energy research and digital humanities) and a comparative analysis against existing FDO specifications (PIDINST, DARIAH, DiSSCo), the authors demonstrate both the practical viability and current limitations of achieving uniform FAIRification. The results indicate that a formal FDO model can support globally aligned data spaces by standardizing metadata, PIDs, and interlinking while highlighting adoption challenges, governance needs, and future tooling requirements for large-scale big data contexts.

Abstract

The FAIR principles are globally accepted guidelines for improved data management practices with the potential to align data spaces on a global scale. In practice, this alignment is only marginally achieved because organizations interpret and implement these principles in different ways. The concept of FAIR Digital Objects provides a way to realize a domain-independent abstraction layer that could solve this problem, but its specifications are currently diverse, contradictory, and restricted to semantic models. In this work, we introduce a rigorously formalized data model with a set of assertions using formal expressions to provide a common baseline for the implementation of FAIR Digital Objects. The model defines how these objects enable machine-actionable decisions based on the principles of abstraction, encapsulation, and entity relationship to fulfill FAIR criteria for the digital resources they represent. We provide implementation examples in the context of two use cases and explain how our model can facilitate the (re)use of data across domains. We also compare how our model assertions are met by FAIR Digital Objects as they have been described in other projects. Finally, we discuss the adoption criteria, limitations, and perspectives of our results in the big data context. Overall, our work represents an important milestone for various communities working towards globally aligned data spaces through FAIRification.

Paper Structure

This paper contains 21 sections, 9 equations, 5 figures, and 2 tables.

Figures (5)

  • Figure 1: The current state of partially aligned FAIRified data spaces is illustrated by the Euler diagram on the left. Conversely, the diagram on the right includes an enclosed area that represents the abstraction layer which enables alignment across data spaces by providing an overarching FAIRified structure at the meta level.
  • Figure 2: The semantic FDO Data Model specification depicting the relationships between FDO components and principles adopted from other fields of computer science, i.e., Abstraction, Encapsulation, and Entity Relationship.
  • Figure 3: The conceptual model of PID triples based on the FDO's entity relationship characteristics in the spirit of RDF triples, connecting an FDOsub with an FDOobj by a typed attribute key working as predicate.
  • Figure 4: An exemplary FDO according to the formalized data model that contains a set of non-referencing and referencing typed attributes in its information record. The latter enables entity relationships, including FDO-FDO relations (pointed out with a blue arrow) by PID-triples.
  • Figure 5: Returning to the illustration of a high-level abstraction layer around individual data spaces, FDOs can be considered retrievable and operable objects within this layer that represent digital resources within data spaces. A framework that uses their type system, operations, and entity linkage ultimately enables interoperability between the data spaces the FDOs point to.
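
The captions of Figures 3 and 4 describe an information record made of typed attributes, some of which reference other FDOs and thereby form PID triples (subject FDO, typed attribute key, object FDO). The following is a minimal Python sketch of that idea, not the paper's formal model: the class names, the `references_fdo` flag, the attribute keys, and the PIDs are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class TypedAttribute:
    """One entry of an information record (hypothetical structure)."""
    key: str                      # typed attribute key; serves as the predicate of a PID triple
    value: str                    # a literal value, or the PID of another FDO
    references_fdo: bool = False  # True if `value` is the PID of another FDO


@dataclass
class FDO:
    """An FDO identified by a PID, carrying an information record."""
    pid: str
    record: List[TypedAttribute] = field(default_factory=list)

    def pid_triples(self) -> List[Tuple[str, str, str]]:
        # Only FDO-referencing attributes yield (subject PID, key, object PID) triples.
        return [(self.pid, a.key, a.value) for a in self.record if a.references_fdo]


# Placeholder PIDs and keys, assumed for this example only.
dataset = FDO(
    pid="pid:example/dataset-1",
    record=[
        TypedAttribute(key="dateCreated", value="2024-01-01"),
        TypedAttribute(key="hasMetadata", value="pid:example/metadata-1",
                       references_fdo=True),
    ],
)

triples = dataset.pid_triples()
print(triples)
```

In this sketch the non-referencing attribute (`dateCreated`) stays local to the record, while the referencing attribute (`hasMetadata`) produces one triple that links the two FDOs. Collecting such triples across many FDOs would give the kind of PID-backed entity-relationship graph the TL;DR mentions.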