Converter: Enhancing Interoperability in Research Data Management
Sefika Efeoglu, Zongxiong Chen, Sonja Schimmler, Bianca Wentzel
TL;DR
The paper addresses interoperability challenges in Research Data Management caused by heterogeneous metadata formats across Berlin University Alliance repositories and their publication vocabularies. It introduces Converter, a schema-matching and transformation pipeline that maps harvested metadata to the DCAT vocabulary (DCAT-AP 2019) and serves as a bridge to the Piveau harvester, enabling DCAT-consistent ingestion. The authors present a pluggable converter service with importers, transformers, and exporters that reduces the need for extensive Piveau adaptations and can be extended to additional data sources within the NFDI initiative, such as NFDI4Cat and NFDI4DataScience. The work improves data accessibility and interoperability by harmonizing heterogeneous resources into a unified DCAT-based framework, supporting FAIR data management and streamlined collaboration across institutions.
Abstract
Research Data Management (RDM) is essential for handling and organizing research data. The Berlin Open Science Platform (BOP) serves as a case study that exemplifies the significance of standardization within the Berlin University Alliance (BUA), where repositories employ different vocabularies when publishing their data, resulting in data heterogeneity. The meta portals of the NFDI4Cat and NFDI4DataScience projects serve as additional case studies in the context of the NFDI initiative. To establish consistency among the harvested repositories in the respective systems, this study develops a novel component, the converter, that breaks down the barriers between data collection and the various schemas. Together with a minor modification of the existing Piveau framework, the converter contributes to enhanced data accessibility, streamlined collaboration, and improved interoperability within the research community.
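
To make the idea of a converter between harvested metadata and a DCAT-based schema more concrete, the following is a minimal sketch of an importer, transformer, and exporter chain. It is not the authors' implementation: the field names in `FIELD_MAP`, the function names, and the JSON input and JSON-LD-style output are all assumptions chosen for illustration. Only the target terms (`dcat:Dataset`, `dct:title`, `dct:description`, `dcat:keyword`, `dct:issued`) are taken from the DCAT and Dublin Core vocabularies.

```python
import json
from typing import Dict, Iterable, List

Record = Dict[str, object]

# Hypothetical mapping from source field names to DCAT / Dublin Core terms.
# A real repository schema would need its own, much larger mapping.
FIELD_MAP = {
    "title": "dct:title",
    "abstract": "dct:description",
    "tags": "dcat:keyword",
    "created": "dct:issued",
}


def import_records(raw: str) -> List[Record]:
    """Importer: parse harvested metadata (assumed here to be a JSON array)."""
    return json.loads(raw)


def transform(record: Record) -> Record:
    """Transformer: rename mapped fields to DCAT terms and drop unmapped ones."""
    out: Record = {"@type": "dcat:Dataset"}
    for source_key, value in record.items():
        target = FIELD_MAP.get(source_key)
        if target is not None:
            out[target] = value
    return out


def export(records: Iterable[Record]) -> str:
    """Exporter: serialize the transformed records as a JSON-LD-style document."""
    return json.dumps({"@graph": list(records)}, indent=2, sort_keys=True)


def run_pipeline(raw: str) -> str:
    """Chain the three stages: import, transform each record, export."""
    return export(transform(record) for record in import_records(raw))


if __name__ == "__main__":
    # Illustrative input; "internal_id" has no mapping and is dropped.
    harvested = '[{"title": "Catalysis runs", "tags": ["catalysis"], "internal_id": 7}]'
    print(run_pipeline(harvested))
```

Keeping the three stages as separate functions means that, in this sketch, supporting another source format would only require a different importer and mapping table, while the exporter that feeds the downstream harvester stays unchanged.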
