Medical Knowledge Graph QA for Drug-Drug Interaction Prediction based on Multi-hop Machine Reading Comprehension
Peng Gao, Feng Gao, Jian-Cheng Ni, Yu Wang, Fei Wang
TL;DR
Drug-drug interaction (DDI) prediction benefits from integrating cross-document biomedical knowledge with structured reasoning. MedKGQA combines a knowledge-fusion stage that embeds DrugBank and Reactome relations with a graph-reasoning stage that operates on a four-type heterogeneous graph using co-attention and multi-relational Graph Attention Networks. The approach achieves state-of-the-art accuracy on the MedHop dataset (64.8% on the test set, up from 60.3%), and ablations show that both the reasoning graph and the external knowledge are critical. The work demonstrates the feasibility and value of fusing knowledge-base embeddings with graph-based machine reading comprehension for biomedical QA, with visualizations supporting interpretability. It enables more accurate, knowledge-aware DDI prediction and lays groundwork for extension to other closed domains such as law and physics.
Abstract
Drug-drug interaction prediction is a crucial issue in molecular biology. Traditional methods of observing drug-drug interactions through medical experiments require significant resources and labor. This paper presents a medical knowledge graph question answering model, dubbed MedKGQA, that predicts drug-drug interactions by applying machine reading comprehension to closed-domain literature and constructing a knowledge graph of drug-protein triplets from open-domain documents. The model vectorizes the drug-protein target attributes in the graph using entity embeddings and establishes directed connections between drug and protein entities based on the metabolic interaction pathways of protein targets in the human body. This aligns multiple sources of external knowledge and uses them to train the graph neural network. Without bells and whistles, the proposed model improves drug-drug interaction prediction accuracy by 4.5 percentage points over previous state-of-the-art models on the QAngaroo MedHop dataset. Experimental results demonstrate the efficiency and effectiveness of the model and verify the feasibility of integrating external knowledge in machine reading comprehension tasks.
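To make the idea of a directed graph built from drug-protein triplets concrete, the following is a minimal sketch and not the authors' implementation. The entity names, relation labels, and the helper functions `build_graph` and `find_path` are all hypothetical placeholders chosen for illustration; the actual model additionally attaches entity embeddings to nodes and reasons over the graph with a graph neural network, which this sketch does not attempt.

```python
from collections import defaultdict

# Hypothetical (head, relation, tail) triplets. These are placeholders,
# not real DrugBank or Reactome records.
TRIPLETS = [
    ("drug_A", "targets", "protein_P1"),
    ("drug_B", "targets", "protein_P2"),
    ("protein_P1", "pathway_precedes", "protein_P2"),
]


def build_graph(triplets):
    """Build a directed adjacency map: node -> list of (relation, neighbor)."""
    graph = defaultdict(list)
    for head, relation, tail in triplets:
        graph[head].append((relation, tail))
    return dict(graph)


def find_path(graph, source, target, max_hops=3):
    """Breadth-first search for a directed path of at most max_hops edges.

    Returns the list of nodes on the first path found, or None.
    """
    frontier = [[source]]
    for _ in range(max_hops):
        next_frontier = []
        for path in frontier:
            for _relation, neighbor in graph.get(path[-1], []):
                if neighbor in path:
                    continue  # avoid revisiting nodes on the same path
                new_path = path + [neighbor]
                if neighbor == target:
                    return new_path
                next_frontier.append(new_path)
        frontier = next_frontier
    return None
```

In this toy graph, `find_path(build_graph(TRIPLETS), "drug_A", "protein_P2")` follows the drug-target edge and then the pathway edge, so `drug_A` reaches a protein that `drug_B` targets directly. A multi-hop chain of this kind is the sort of structural signal a graph-based reasoning model could exploit; because the edges are directed, no path exists from `drug_B` back to `protein_P1`.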
