MuCoS: Efficient Drug Target Discovery via Multi Context Aware Sampling in Knowledge Graphs
Haji Gul, Abdul Ghani Naim, Ajaz Ahmad Bhat
TL;DR
MuCoS addresses drug target discovery by reframing KG completion as a link prediction task on heterogeneous biomedical knowledge graphs (KGs) and introduces density-based multi-context sampling to extract informative neighborhood patterns. By combining this structural context with textual semantics in a BERT-based predictor, MuCoS eliminates the need for negative sampling and improves generalization to unseen drug-target pairs. On KEGG50k, MuCoS achieves up to a 13% MRR improvement for general relation prediction and 6% for drug-target relation prediction, while training approximately 175x faster than MuCo-KGC. The method is also competitive on standard KG benchmarks, and ablations highlight the importance of head context, making it a scalable, practical tool for large-scale biomedical KG-driven drug discovery.
Abstract
Accurate prediction of drug-target interactions is critical for accelerating drug discovery and elucidating complex biological mechanisms. In this work, we frame drug target prediction as a link prediction task on heterogeneous biomedical knowledge graphs (KGs) that integrate drugs, proteins, diseases, pathways, and other relevant entities. Conventional KG embedding methods such as TransE and ComplEx-SE are hindered by their reliance on computationally intensive negative sampling and their limited generalization to unseen drug-target pairs. To address these challenges, we propose Multi Context Aware Sampling (MuCoS), a novel framework that prioritizes high-density neighbours to capture salient structural patterns and integrates these with contextual embeddings derived from BERT. By unifying structural and textual modalities and selectively sampling highly informative patterns, MuCoS circumvents the need for negative sampling, significantly reducing computational overhead while enhancing predictive accuracy for novel drug-target associations. Extensive experiments on the KEGG50k dataset demonstrate that MuCoS outperforms state-of-the-art baselines, achieving up to a 13% improvement in mean reciprocal rank (MRR) when predicting any relation in the dataset and a 6% improvement in dedicated drug-target relation prediction.
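The idea of prioritizing high-density neighbours can be illustrated with a minimal sketch. The abstract does not define density, so the code below assumes a simple proxy: the number of triples an entity appears in. The function name `top_k_dense_neighbours`, the parameter `k`, and the toy triples are all hypothetical and are not taken from the MuCoS implementation.

```python
from collections import Counter


def top_k_dense_neighbours(triples, head, k=2):
    """Return up to k one-hop neighbours of `head`, densest first.

    Assumption (not from the paper): the density of an entity is the
    number of triples in which it appears as head or tail.
    """
    # Count how often each entity occurs across all triples.
    density = Counter()
    for h, _, t in triples:
        density[h] += 1
        density[t] += 1

    # Collect distinct one-hop neighbours of the head, in either direction.
    neighbours = set()
    for h, _, t in triples:
        if h == head:
            neighbours.add(t)
        elif t == head:
            neighbours.add(h)
    neighbours.discard(head)  # ignore self-loops

    # Rank by density (descending); break ties by name for determinism.
    ranked = sorted(neighbours, key=lambda e: (-density[e], e))
    return ranked[:k]


# Hypothetical toy graph with (head, relation, tail) triples.
toy_triples = [
    ("drug_A", "targets", "protein_X"),
    ("drug_A", "treats", "disease_D"),
    ("drug_A", "targets", "protein_Y"),
    ("drug_B", "targets", "protein_X"),
    ("protein_X", "participates_in", "pathway_P"),
    ("drug_C", "treats", "disease_D"),
]

context = top_k_dense_neighbours(toy_triples, "drug_A", k=2)
```

In a pipeline of the kind the abstract describes, the selected neighbours would presumably be combined with the textual description of the query before being passed to the BERT-based predictor; the serialization format is left open here because the abstract does not specify it.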
