PGraphDTA: Improving Drug Target Interaction Prediction using Protein Language Models and Contact Maps
Rakesh Bal, Yijia Xiao, Wei Wang
TL;DR
This study tackles the challenge of predicting drug-target binding affinity by using Protein Language Models (PLMs) in place of CNN-based protein encoders and by incorporating contact-map information as an inductive bias. It introduces two model variants, PGraphDTA and PGraphDTA-CM, and demonstrates that PLMs improve predictive accuracy over CNN baselines, with larger gains on smaller datasets. The CM2 approach, which uses protein contact maps from pconsc4, yields notable improvements, while the CM1 approach, based on DiffDock distances, is often noisy. Overall, the work highlights a practical, scalable path to improving DTI predictions and accelerating drug discovery, with publicly available code and data.
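To make the contact-map idea concrete, here is a minimal sketch of how a predicted residue-residue contact-probability matrix could be turned into graph edges for a protein encoder. It is not the authors' implementation: the function name `contact_map_to_edges`, the 0.5 threshold, and the toy matrix are all assumptions chosen for illustration, and a real pipeline would read the matrix from a contact predictor's output.

```python
import numpy as np


def contact_map_to_edges(contact_prob, threshold=0.5):
    """Turn an L x L contact-probability matrix into an undirected edge list.

    Hypothetical helper for illustration; the threshold of 0.5 is an assumption.
    """
    prob = np.asarray(contact_prob, dtype=float)
    if prob.ndim != 2 or prob.shape[0] != prob.shape[1]:
        raise ValueError("contact map must be a square matrix")
    # Symmetrize, since a contact between residues i and j is undirected.
    prob = np.maximum(prob, prob.T)
    # Keep only the strict upper triangle (i < j) to drop self-contacts
    # and duplicate (j, i) entries.
    rows, cols = np.nonzero(np.triu(prob >= threshold, k=1))
    return list(zip(rows.tolist(), cols.tolist()))


# Toy 3-residue example (made-up probabilities, not real predictor output).
toy = np.array([
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.7],
    [0.1, 0.7, 1.0],
])
edges = contact_map_to_edges(toy)
print(edges)
```

The resulting edge list restricts message passing to residue pairs that are likely to be spatially close, which is one way contact information can act as an inductive bias.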
Abstract
Developing and discovering new drugs is a complex and resource-intensive endeavor that often involves substantial costs, time investment, and safety concerns. A key aspect of drug discovery involves identifying novel drug-target (DT) interactions. Existing computational methods for predicting DT interactions have primarily focused on binary classification tasks, aiming to determine whether a DT pair interacts or not. However, protein-ligand interactions exhibit a continuum of binding strengths, known as binding affinity, and predicting this affinity accurately remains a persistent challenge. In this study, we investigate various techniques employed in Drug Target Interaction (DTI) prediction and propose novel enhancements that improve their performance. Our approaches include the integration of Protein Language Models (PLMs) and the incorporation of Contact Map information as an inductive bias within current models. Through extensive experimentation, we demonstrate that our proposed approaches outperform the baseline models considered in this study, presenting a compelling case for further development in this direction. We anticipate that the insights gained from this work will significantly narrow the search space for potential drugs targeting specific proteins, thereby accelerating drug discovery. Code and data for PGraphDTA are available at https://github.com/Yijia-Xiao/PgraphDTA/.
