SHACL-SKOS Based Knowledge Representation of Material Safety Data Sheet (SDS) for the Pharmaceutical Industry
Brian Lu, Dennis Pham, Ti-Chiun Chang, Michael Lovette, Terri Bui, Stephen Ma
TL;DR
The paper addresses the challenge of integrating and exchanging Safety Data Sheet (SDS) information across diverse vendors and regulatory regimes in the pharmaceutical industry. It proposes a SHACL-SKOS hybrid knowledge representation, termed DeepPharmGraph (DPG), that decouples data shapes from taxonomies to support multiple, coexisting viewpoints on SDS data. By leveraging SKOS and SKOS-XL for taxonomy and label management and SHACL for structural constraints, the approach enables cross-standard mappings (e.g., GHS, HazCom) and scalable data inference. The implemented Crucible platform demonstrates automated SDS extraction, annotation, and integration into a knowledge graph, culminating in the automated generation of composite packing sheets and a streamlined user experience, significantly reducing manual processing time and enabling broader data sharing across stakeholders.
Abstract
We report the development of a knowledge representation and reasoning (KRR) system built on hybrid SHACL-SKOS ontologies for Globally Harmonized System (GHS) material Safety Data Sheets (SDS) to enhance chemical safety communication and regulatory compliance. SDS are comprehensive documents containing safety and handling information for chemical substances, which makes them an essential part of workplace safety and risk management. However, the vast number of SDS issued by the many organizations, manufacturers, and suppliers that produce and distribute chemicals makes it challenging to centralize and access these documents through a single repository. To address the underlying issues of data exchange in chemical shipping and handling, we construct an SDS-related controlled vocabulary and conditions validated by SHACL, and link knowledge systems of similar domains via SKOS. The resulting hybrid ontologies aim to provide standardized yet adaptable representations of SDS information, facilitating better data sharing, retrieval, and integration across platforms. This paper outlines the architectural design of our SHACL-SKOS system and showcases our implementation in an industrial application that streamlines the generation of a composite shipping cover sheet.
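To make the separation of concerns concrete, the following is a minimal Python sketch of the idea, not the DPG or Crucible implementation. It keeps a SKOS-like taxonomy (concepts, preferred labels, cross-scheme matches) apart from a SHACL-like shape (cardinality and value constraints on a record). Every identifier here (`CONCEPTS`, `PropertyConstraint`, `SDS_SHAPE`, the `ghs:` and `hazcom:` concept IDs, the field names) is a hypothetical name chosen for illustration, and plain dictionaries stand in for RDF graphs.

```python
from dataclasses import dataclass
from typing import Optional

# Taxonomy layer (SKOS-like). Concept IDs and labels are illustrative
# assumptions, not entries from an actual GHS or HazCom vocabulary file.
CONCEPTS = {
    "ghs:FlammableLiquidCat2": {
        "prefLabel": "Flammable liquids, Category 2",
        "exactMatch": {"hazcom:FlammableLiquidCat2"},
    },
    "hazcom:FlammableLiquidCat2": {
        "prefLabel": "Flammable liquid, Category 2",
        "exactMatch": {"ghs:FlammableLiquidCat2"},
    },
}


@dataclass(frozen=True)
class PropertyConstraint:
    """Shape layer (SHACL-like): one constraint on one record field."""
    path: str                        # field the constraint applies to
    min_count: int = 0               # analogous to sh:minCount
    max_count: Optional[int] = None  # analogous to sh:maxCount
    in_scheme: Optional[str] = None  # values must be concepts of this scheme


# Hypothetical shape for a simplified SDS record.
SDS_SHAPE = [
    PropertyConstraint("productName", min_count=1, max_count=1),
    PropertyConstraint("hazardClass", min_count=1, in_scheme="ghs"),
]


def validate(record, shape):
    """Return a list of human-readable violations (empty if conformant)."""
    violations = []
    for c in shape:
        values = record.get(c.path, [])
        if len(values) < c.min_count:
            violations.append(f"{c.path}: fewer than {c.min_count} value(s)")
        if c.max_count is not None and len(values) > c.max_count:
            violations.append(f"{c.path}: more than {c.max_count} value(s)")
        if c.in_scheme is not None:
            for v in values:
                known = v in CONCEPTS and v.startswith(c.in_scheme + ":")
                if not known:
                    violations.append(f"{c.path}: unknown concept {v}")
    return violations


def map_concept(concept_id, target_scheme):
    """Follow an exactMatch link into another scheme, or return None."""
    matches = CONCEPTS.get(concept_id, {}).get("exactMatch", set())
    for m in sorted(matches):
        if m.startswith(target_scheme + ":"):
            return m
    return None


record = {
    "productName": ["Example Solvent"],
    "hazardClass": ["ghs:FlammableLiquidCat2"],
}
violations = validate(record, SDS_SHAPE)
mapped = map_concept(record["hazardClass"][0], "hazcom")
```

In this sketch the shape never mentions a specific hazard class and the taxonomy never mentions record structure, so either layer could be revised or replaced (for example, by adding another regulatory scheme) without touching the other. A production system would presumably express the same split as SHACL shape graphs and SKOS concept schemes in RDF rather than Python dictionaries.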
