SIDEKICK: A Semantically Integrated Resource for Drug Effects, Indications, and Contraindications
Mohammad Ashhad, Olga Mashkova, Ricardo Henao, Robert Hoehndorf
TL;DR
Pharmacovigilance datasets often rely on MedDRA, which constrains semantic reasoning and interoperability. The authors present SIDEKICK, a knowledge graph that semantically integrates drug indications, contraindications, and adverse reactions by mapping FDA Structured Product Labels (SPLs) to HPO, MONDO, and RxNorm, and serializing the result as RDF with SIO as an upper-level ontology. The workflow combines LLM extraction with Graph RAG ontology mapping, followed by schema validation with ShEx and reasoning with ELK. In an evaluation based on side-effect similarity, SIDEKICK outperforms OnSIDES at drug target prediction. SIDEKICK enables automated safety surveillance and phenotype-based drug repurposing, and supports complex, federated SPARQL queries in the Semantic Web ecosystem. The resource is openly available with a web interface, a SPARQL endpoint, and accompanying code and tutorials.
Abstract
Pharmacovigilance and clinical decision support systems rely on structured drug safety data to guide medical practice. However, existing datasets frequently depend on terminologies such as MedDRA, which limits their semantic reasoning capabilities and their interoperability with Semantic Web ontologies and knowledge graphs. To address this gap, we developed SIDEKICK, a knowledge graph that standardizes drug indications, contraindications, and adverse reactions from FDA Structured Product Labels. We built a workflow that combines Large Language Model (LLM) extraction with Graph Retrieval-Augmented Generation (Graph RAG) for ontology mapping. We processed over 50,000 drug labels and mapped terms to the Human Phenotype Ontology (HPO), the MONDO Disease Ontology, and RxNorm. Our semantically integrated resource outperforms the SIDER and OnSIDES databases when applied to the task of drug repurposing by side effect similarity. We serialized the dataset as a Resource Description Framework (RDF) graph and employed the Semanticscience Integrated Ontology (SIO) as an upper-level ontology to further improve interoperability. Consequently, SIDEKICK enables automated safety surveillance and phenotype-based similarity analysis for drug repurposing.
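To make the idea of side effect similarity concrete, the following is a minimal sketch that compares drugs by the overlap of their phenotype annotations. It uses plain Jaccard similarity over sets of HPO identifiers, which is an assumption chosen for brevity and not necessarily the measure used to evaluate SIDEKICK; an ontology-aware semantic similarity would also exploit the HPO class hierarchy. The drug names (`drug_a`, `drug_b`, `drug_c`), their annotation sets, and the helper names `jaccard` and `rank_pairs` are hypothetical.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two annotation sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical side effect profiles, keyed by placeholder drug names.
# Values are sets of HPO identifiers; the assignments are illustrative only.
side_effects = {
    "drug_a": {"HP:0002018", "HP:0002315", "HP:0002014"},
    "drug_b": {"HP:0002018", "HP:0002315", "HP:0000988"},
    "drug_c": {"HP:0002321"},
}


def rank_pairs(profiles: dict[str, set[str]]) -> list[tuple[float, str, str]]:
    """Score every unordered drug pair and sort from most to least similar."""
    scored = [
        (jaccard(profiles[x], profiles[y]), x, y)
        for x, y in combinations(sorted(profiles), 2)
    ]
    return sorted(scored, reverse=True)


ranked = rank_pairs(side_effects)
for score, x, y in ranked:
    print(f"{x} vs {y}: {score:.2f}")
```

In a repurposing or target prediction setting, highly ranked pairs would be treated as candidates that may share a mechanism. Mapping side effects to ontology classes rather than to flat terminology codes is what allows such a set-based score to be replaced by a hierarchy-aware one.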
