PGraphDTA: Improving Drug Target Interaction Prediction using Protein Language Models and Contact Maps

Rakesh Bal; Yijia Xiao; Wei Wang

TL;DR

This study tackles the challenge of predicting drug-target binding affinity by using Protein Language Models (PLMs) in place of CNN-based protein encoders and by incorporating contact-map information as an inductive bias. It introduces two model variants, PGraphDTA and PGraphDTA-CM, and demonstrates that PLMs improve predictive accuracy over CNN baselines, with larger gains on smaller datasets. The CM2 approach, which uses protein contact maps from PconsC4, yields notable improvements, while the CM1 approach, based on DiffDock distances, is often noisy. Overall, the work highlights a practical, scalable path to improving DTI predictions and accelerating drug discovery, with publicly available code and data.

Abstract

Developing and discovering new drugs is a complex and resource-intensive endeavor that often involves substantial costs, time investment, and safety concerns. A key aspect of drug discovery involves identifying novel drug-target (DT) interactions. Existing computational methods for predicting DT interactions have primarily focused on binary classification tasks, aiming to determine whether a DT pair interacts or not. However, protein-ligand interactions exhibit a continuum of binding strengths, known as binding affinity, presenting a persistent challenge for accurate prediction. In this study, we investigate various techniques employed in Drug Target Interaction (DTI) prediction and propose novel enhancements to improve their performance. Our approaches include the integration of Protein Language Models (PLMs) and the incorporation of Contact Map information as an inductive bias within current models. Through extensive experimentation, we demonstrate that our proposed approaches outperform the baseline models considered in this study, presenting a compelling case for further development in this direction. We anticipate that the insights gained from this work will significantly narrow the search space for potential drugs targeting specific proteins, thereby accelerating drug discovery. Code and data for PGraphDTA are available at https://github.com/Yijia-Xiao/PgraphDTA/.

Paper Structure (17 sections, 1 equation, 5 figures, 4 tables)

Introduction
Related Works
Drug Target Interaction (DTI) Models
Graph Neural Networks (GNNs)
Protein Language Models (PLMs)
Materials and Methods
Drug Representation
Protein Representation
Datasets
Models
PGraphDTA (Replacing CNNs with PLMs)
PGraphDTA-CM (Adding Contact Maps information)
Evaluation Metric
Results & Discussion
PGraphDTA
...and 2 more sections

Figures (5)

Figure 1: Example of a SMILES string and its corresponding molecule
Figure 2: Baseline GraphDTA and PGraphDTA model architecture. In (b), dashed lines indicate PLM embeddings are precomputed and are not a part of the training loop.
Figure 3: Model with Contact Map information. PLM embeddings are precomputed and hence are not part of the training loop, as indicated by the dashed lines.
Figure 4: Molecular contact map prediction workflow, Source: DGraphDTA jiang2020drug
Figure 5: Protein contact map prediction workflow, Source: DGraphDTA jiang2020drug
