KITE-DDI: A Knowledge graph Integrated Transformer Model for accurately predicting Drug-Drug Interaction Events from Drug SMILES and Biomedical Knowledge Graph

Azwad Tamir; Jiann-Shiun Yuan

TL;DR

The paper addresses the prediction of Drug-Drug Interaction (DDI) events from both drug SMILES and a biomedical knowledge graph. It introduces KITE-DDI, an encoder-only Transformer that fuses DRKG embeddings (dimension $800$) with SMILES-derived token sequences (up to 500 tokens). The model is pretrained with a Masked Language Modeling objective and then fine-tuned with supervision, yielding an end-to-end pipeline. Key contributions include a heuristic-free, lightweight architecture that generalizes well in the inductive setting on two DrugBank-based benchmarks, outperforming state-of-the-art baselines, especially in cold-start scenarios. The approach is robust to data scarcity and to varying SMILES lengths, with practical implications for predicting DDIs for newly developed drugs and reducing reliance on wet-lab experiments.

Abstract

It is common practice in modern medicine to prescribe multiple medications simultaneously to treat diseases. However, these medications can react adversely with one another. Such reactions, known as Drug-Drug Interactions (DDI), have the potential to cause significant bodily injury and can even be fatal. Hence, it is essential to identify all DDI events before prescribing multiple drugs to a patient. Most contemporary research on predicting DDI events relies on information from either biomedical knowledge graphs (KG) or drug SMILES, with very few approaches managing to merge data from both to make predictions. Others use heuristic algorithms to extract features from SMILES and KGs, which are then fed into a deep learning framework to generate the output. In this study, we propose a KG-integrated Transformer architecture that forms an end-to-end, fully automated machine learning pipeline for predicting DDI events with high accuracy. The algorithm takes the full-scale molecular SMILES sequences of a pair of drugs and a biomedical KG as input and predicts the interaction between the two drugs with high precision. The results show superior performance on two different benchmark datasets compared to existing state-of-the-art models, especially when the test and training sets contain distinct sets of drug molecules. This demonstrates the strong generalization of the proposed model and indicates its potential for DDI event prediction for newly developed drugs. The model does not depend on heuristic models for generating embeddings and has a minimal number of hyperparameters, making it easy to use while demonstrating outstanding performance in low-data scenarios.
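To make the SMILES side of the input concrete, the following is a minimal sketch of how a drug pair might be turned into a fixed-length token sequence and masked for a Masked Language Modeling objective. It is not the authors' implementation. The character-level tokenizer, the `[SEP]` token joining the two drugs, the 15% masking rate (the convention popularized by BERT), and the names `tokenize`, `encode_pair`, and `mask_tokens` are all assumptions made for illustration; only the 500-token limit comes from the summary of the paper. The knowledge-graph embeddings are left out entirely.

```python
import random

MAX_LEN = 500  # per-drug token limit, taken from the paper summary
PAD, MASK, SEP = "[PAD]", "[MASK]", "[SEP]"  # hypothetical special tokens


def tokenize(smiles):
    """Character-level tokenization (an assumption; a real SMILES
    tokenizer would keep multi-character atoms such as 'Cl' together)."""
    return list(smiles)


def encode_pair(smiles_a, smiles_b, max_len=MAX_LEN):
    """Truncate or pad each drug to max_len tokens and join the two
    sequences with a separator token (the separator is an assumption)."""
    def fit(tokens):
        tokens = tokens[:max_len]
        return tokens + [PAD] * (max_len - len(tokens))

    return fit(tokenize(smiles_a)) + [SEP] + fit(tokenize(smiles_b))


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Replace a random subset of real tokens with [MASK].

    Returns the masked sequence and a label list holding the original
    token at masked positions and None everywhere else."""
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if tok not in (PAD, SEP) and rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels


# Aspirin and caffeine, used purely as example molecules.
pair = encode_pair("CC(=O)OC1=CC=CC=C1C(=O)O", "CN1C=NC2=C1C(=O)N(C(=O)N2C)C")
masked, labels = mask_tokens(pair)
```

During pretraining, a model would be asked to recover the entries of `labels` that are not `None` from the `masked` sequence. Padding and separator positions are never masked, so they contribute nothing to that objective.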
