Augmenting generative models with biomedical knowledge graphs improves targeted drug discovery
Aditya Malusare, Vineet Punyamoorty, Vaneet Aggarwal
TL;DR
K-DREAM is a knowledge-graph-guided diffusion framework for molecule generation. It embeds biomedical knowledge from PrimeKG with TransE and steers the diffusion process with a Context Regressor Network and interpolation in the embedding space. The approach yields biologically relevant drug candidates and supports multi-target design, achieving state-of-the-art docking performance across five targets, with mean top-5% docking scores of roughly -12 to -10 kcal/mol, and outperforming multiple baselines. The strength of the guidance is controlled by the hyperparameter lambda_X, which sets the trade-off between exploration and target-focused optimization. Ablation studies confirm the critical role of KG guidance, and interpolation in the KG embedding space produces balanced dual-target compounds. By integrating knowledge graphs into generative chemistry, K-DREAM offers a scalable route to biologically informed discovery that could accelerate preclinical development, although the generated candidates still require experimental validation and the underlying knowledge graph may carry biases.
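To make the three ingredients named above concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The TransE score is the standard negative distance ||h + r - t||. The function names, the toy two-dimensional embeddings, and in particular the additive form of `guided_score` are assumptions made for illustration; this summary does not specify how lambda_X enters the sampler.

```python
import math


def transe_score(head, relation, tail):
    """Standard TransE plausibility: -||h + r - t||_2 (closer to 0 is more plausible)."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def interpolate(emb_a, emb_b, alpha):
    """Linear interpolation between two embeddings.

    alpha = 0 returns emb_a, alpha = 1 returns emb_b, and intermediate values
    give a point between the two targets in embedding space.
    """
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(emb_a, emb_b)]


def guided_score(base_score, regressor_grad, lambda_x):
    """ASSUMED guidance form: unconditional score plus lambda_x times a regressor gradient.

    lambda_x = 0 recovers unguided sampling; larger values pull samples harder
    toward the target context.
    """
    return [s + lambda_x * g for s, g in zip(base_score, regressor_grad)]


# Hypothetical toy embeddings (real TransE embeddings are learned and much larger).
drug = [0.0, 1.0]
targets_relation = [1.0, 0.0]
protein_a = [1.0, 1.0]
protein_b = [3.0, 1.0]

# Midpoint context for a dual-target design between protein_a and protein_b.
dual_context = interpolate(protein_a, protein_b, 0.5)
```

In this toy setup `drug + targets_relation` lands exactly on `protein_a`, so that triple gets the best possible TransE score, while the triple with `protein_b` is penalized by its distance. The interpolated `dual_context` stands in for the kind of blended conditioning vector that a dual-target run would feed to the guidance network.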
Abstract
Recent breakthroughs in generative modeling have demonstrated remarkable capabilities in molecular generation, yet the integration of comprehensive biomedical knowledge into these models has remained an untapped frontier. In this study, we introduce K-DREAM (Knowledge-Driven Embedding-Augmented Model), a novel framework that leverages knowledge graphs to augment diffusion-based generative models for drug discovery. By embedding structured information from large-scale knowledge graphs, K-DREAM directs molecular generation toward candidates with higher biological relevance and therapeutic suitability. This integration ensures that the generated molecules are aligned with specific therapeutic targets, moving beyond traditional heuristic-driven approaches. In targeted drug design tasks, K-DREAM generates drug candidates with improved binding affinities and predicted efficacy, surpassing current state-of-the-art generative models. It also demonstrates flexibility by producing molecules designed for multiple targets, enabling applications to complex disease mechanisms. These results highlight the utility of knowledge-enhanced generative models in rational drug design and their relevance to practical therapeutic development.
