AI4DiTraRe: Building the BFO-Compliant Chemotion Knowledge Graph
Ebrahim Norouzi, Nicole Jung, Anna M. Jacyszyn, Jörg Waitelonis, Harald Sack
TL;DR
This work presents a pipeline that transforms Chemotion metadata into a BFO-compliant Chemotion Knowledge Graph (Chemotion-KG) by harvesting JSON-LD, converting it to RDF, and semantically enriching the data through SPARQL CONSTRUCT queries that use NFDICore and ChEBI within Ontology Design Patterns. The approach preserves provenance via named graphs and supports AI-driven reasoning and interoperability, with daily ingestion and a public SPARQL endpoint. As of July 2025, the KG comprises over 1.46 million triples and tens of thousands of instantiated entities, demonstrating scalable semantification of chemical research data. Future work targets broader data inclusion, cross-resource linking (e.g., PubChem, ChemSpider, NFDI4Chem), SHACL validation, competency questions, and integration with AI methods, including LLM-assisted curation and the bridging of symbolic and statistical AI.
Abstract
Chemistry is a discipline in which technological advances have led to multi-level and often entangled laboratory processes. These complex workflows are combined with information on chemical structures, which is essential for understanding the scientific process. An important tool for many chemists is Chemotion, which consists of an electronic lab notebook and a repository. This paper introduces a semantic pipeline for constructing the BFO-compliant Chemotion Knowledge Graph (Chemotion-KG), which provides an integrated, ontology-driven representation of chemical research data. The Chemotion-KG has been developed to adhere to the FAIR (Findable, Accessible, Interoperable, Reusable) principles and to support AI-driven discovery and reasoning in chemistry. Experimental metadata were harvested from the Chemotion API in JSON-LD format, converted into RDF, and subsequently transformed into a graph aligned with the Basic Formal Ontology (BFO) through SPARQL CONSTRUCT queries. The source code and datasets are publicly available via GitHub. The Chemotion Knowledge Graph is hosted by FIZ Karlsruhe Information Service Engineering. The outcomes presented in this work were achieved within the Leibniz Science Campus "Digital Transformation of Research" (DiTraRe) and are part of an ongoing interdisciplinary collaboration.
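The three pipeline stages named in the abstract (harvest JSON-LD, convert to RDF, rewrite with CONSTRUCT-style rules) can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the Python standard library, so the SPARQL CONSTRUCT step is replaced by a plain Python rule that plays the same role. The example.org IRIs, the sample record, and the mapping from a schema.org MolecularEntity to the BFO class "material entity" (BFO_0000040) are assumptions made for illustration only. Source and derived triples are kept in separate named graphs, which mirrors the provenance approach mentioned in the TL;DR.

```python
import json

# Assumed vocabulary; the actual IRIs used by the pipeline are not given here.
SCHEMA = "http://schema.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BFO_MATERIAL_ENTITY = "http://purl.obolibrary.org/obo/BFO_0000040"

# Hypothetical named-graph IRIs used to keep provenance apart.
SOURCE_GRAPH = "https://example.org/graph/harvested"
ENRICHED_GRAPH = "https://example.org/graph/enriched"

# Hypothetical harvested record (a single, non-nested JSON-LD-like node).
RECORD = json.loads("""
{
  "@id": "https://example.org/sample/1",
  "@type": "MolecularEntity",
  "name": "ethanol"
}
""")


def jsonld_to_triples(doc, vocab=SCHEMA):
    """Flatten one non-nested node into a set of (subject, predicate, object)."""
    subject = doc["@id"]
    triples = set()
    for key, value in doc.items():
        if key == "@id":
            continue
        if key == "@type":
            triples.add((subject, RDF_TYPE, vocab + value))
        else:
            triples.add((subject, vocab + key, value))
    return triples


def construct_bfo_types(triples):
    """Python stand-in for a rule of the form
    CONSTRUCT { ?s a obo:BFO_0000040 } WHERE { ?s a schema:MolecularEntity }.
    """
    return {
        (s, RDF_TYPE, BFO_MATERIAL_ENTITY)
        for (s, p, o) in triples
        if p == RDF_TYPE and o == SCHEMA + "MolecularEntity"
    }


# The dataset maps each named graph IRI to its own set of triples.
dataset = {SOURCE_GRAPH: jsonld_to_triples(RECORD)}
dataset[ENRICHED_GRAPH] = construct_bfo_types(dataset[SOURCE_GRAPH])

for graph_iri, triples in dataset.items():
    for s, p, o in sorted(triples):
        print(graph_iri, s, p, o)
```

Because the harvested and enriched triples live in different graphs, the derived BFO typing can be regenerated or revised without touching the source metadata. In a real setting, an RDF library and an actual SPARQL engine would replace both helper functions.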
