Enhancing Drug-Target Interaction Prediction through Transfer Learning from Activity Cliff Prediction Tasks

Regina Ibragimova, Dimitrios Iliadis, Willem Waegeman

TL;DR

The study tackles activity cliff (AC) challenges in drug-target interaction (DTI) prediction by introducing transfer learning from a universal AC predictor. A two-branch encoder model learns AC and DTI tasks, with AC information transferred to improve DTI performance, especially in structurally similar, functionally diverse regions. Experiments on KIBA and BindingDB show that warm-start transfer learning, particularly when transferring both drug and target encoders, yields consistent DTI improvements, while freezing strategies are generally less effective. The approach demonstrates that AC-awareness and protein-contextual encoding can enhance predictive robustness, especially in data-scarce or imbalanced settings, and highlights the practical potential for AC-informed models in early drug discovery.
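
The difference between warm-start transfer and freezing can be illustrated with a framework-free sketch. Everything here is an assumption made for illustration: parameters are plain dictionaries of float lists, the names `drug_encoder`, `target_encoder`, `ac_head`, and `dti_head` are hypothetical, and the update rule is plain gradient descent rather than the optimizer used in the paper.

```python
from copy import deepcopy


def transfer_encoders(dti_params, ac_params,
                      encoders=("drug_encoder", "target_encoder"),
                      freeze=False):
    """Copy AC-pretrained encoder weights into a DTI parameter dict.

    Returns the new parameters and the set of parameter names that must
    not be updated during DTI training (empty for a warm start).
    """
    new_params = deepcopy(dti_params)
    frozen = set()
    for name, value in ac_params.items():
        # Transfer only parameters of the selected encoders; task heads stay.
        if name.split(".")[0] in encoders and name in new_params:
            new_params[name] = deepcopy(value)
            if freeze:
                frozen.add(name)
    return new_params, frozen


def sgd_step(params, grads, frozen, lr=0.1):
    """One gradient-descent step that leaves frozen parameters untouched."""
    return {
        name: value if name in frozen
        else [w - lr * g for w, g in zip(value, grads[name])]
        for name, value in params.items()
    }


# Hypothetical toy parameters for the AC (source) and DTI (target) models.
ac_params = {"drug_encoder.w": [1.0, 2.0], "target_encoder.w": [3.0],
             "ac_head.w": [9.0]}
dti_params = {"drug_encoder.w": [0.0, 0.0], "target_encoder.w": [0.0],
              "dti_head.w": [0.5]}
grads = {name: [1.0] * len(value) for name, value in dti_params.items()}

# Warm start: encoders start from AC weights and keep training.
warm, no_frozen = transfer_encoders(dti_params, ac_params, freeze=False)
warm_after = sgd_step(warm, grads, no_frozen)

# Freezing: encoders keep the AC weights; only the DTI head is updated.
cold, frozen = transfer_encoders(dti_params, ac_params, freeze=True)
cold_after = sgd_step(cold, grads, frozen)
```

Passing `encoders=("drug_encoder",)` would transfer only the drug branch, which corresponds to the single-encoder variants that the TL;DR compares against transferring both encoders.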

Abstract

Recently, machine learning (ML) has gained popularity in the early stages of drug discovery. This trend is unsurprising given the increasing volume of relevant experimental data and the continuous improvement of ML algorithms. However, conventional models, which rely on the principle of molecular similarity, often fail to capture the complexities of chemical interactions, particularly those involving activity cliffs (ACs): compounds that are structurally similar but exhibit markedly different activity. In this work, we address two distinct yet related tasks: (1) AC prediction and (2) drug-target interaction (DTI) prediction. Leveraging insights gained from the AC prediction task, we aim to improve the performance of DTI prediction through transfer learning. A universal model was developed for AC prediction, capable of identifying activity cliffs across diverse targets. Insights from this model were then incorporated into DTI prediction, enabling better handling of challenging cases involving ACs while maintaining similar overall performance. This approach establishes a strong foundation for integrating AC awareness into predictive models for drug discovery.

Scientific Contribution

This study presents a novel approach that applies transfer learning from AC prediction to enhance DTI prediction, addressing limitations of traditional similarity-based models. By introducing AC-awareness, we improve DTI model performance in structurally complex regions, demonstrating the benefits of integrating compound-specific and protein-contextual information. Unlike previous studies, which treat AC and DTI prediction as separate problems, this work establishes a unified framework to address both data scarcity and prediction challenges in drug discovery.
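
The activity cliff definition above (structurally similar compounds with clearly different activity against the same target) can be made concrete with a small labeling sketch. It is only an illustration under stated assumptions: fingerprints are represented as sets of on-bit indices, similarity is the Tanimoto coefficient, and the thresholds `sim_threshold` and `diff_threshold` are hypothetical values, not the criteria used in the paper.

```python
from itertools import combinations


def tanimoto(bits_a, bits_b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    bits_a, bits_b = set(bits_a), set(bits_b)
    union = len(bits_a | bits_b)
    return len(bits_a & bits_b) / union if union else 0.0


def label_pairs(records, fingerprints, sim_threshold=0.9, diff_threshold=1.0):
    """Build (target, drug_a, drug_b, label) tuples from DTI records.

    records: iterable of (drug_id, target_id, affinity).
    label is 1 for an activity cliff and 0 for a non-cliff pair.
    Both thresholds are assumed values chosen for illustration.
    """
    # Step 1: collect the drugs measured against each target.
    by_target = {}
    for drug, target, affinity in records:
        by_target.setdefault(target, []).append((drug, affinity))

    pairs = []
    for target, drugs in by_target.items():
        # Step 2: keep only structurally similar drug pairs for this target.
        for (d1, a1), (d2, a2) in combinations(drugs, 2):
            if tanimoto(fingerprints[d1], fingerprints[d2]) < sim_threshold:
                continue
            # Step 3: a large affinity gap between similar drugs is a cliff.
            label = 1 if abs(a1 - a2) >= diff_threshold else 0
            pairs.append((target, d1, d2, label))
    return pairs


# Hypothetical toy data: d1 and d2 share 4 of 5 bits, d3 is unrelated.
fingerprints = {"d1": {1, 2, 3, 4}, "d2": {1, 2, 3, 4, 5}, "d3": {7, 8, 9}}
records = [("d1", "P1", 7.5), ("d2", "P1", 5.0), ("d3", "P1", 7.4),
           ("d1", "P2", 6.0), ("d2", "P2", 6.2)]
pairs = label_pairs(records, fingerprints, sim_threshold=0.75)
```

Because pairs are formed per target, the same two compounds can be a cliff for one protein and a non-cliff for another, which is why the label depends on the protein context as well as on the compounds.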

Paper Structure (40 sections, 4 equations, 41 figures, 3 tables)

Figures (41)

  • Figure 1: Schematic representation of the compound-based dataset splitting process for the two main tasks considered in this study: DTI and AC prediction. Initially, the dataset is split into training (green) and testing (red) sets. For the DTI task, drug-target pairs are directly generated within these sets. In contrast, the AC task involves three additional steps (shown at the bottom): first, identifying all drugs that interact with a specific protein in both the training and testing sets; second, pairing these drugs for each protein based on their structural similarity; and third, categorizing each pair as either an AC or non-AC based on predefined criteria. The labels in the figure indicate AC pairs in green, non-AC pairs in orange, and continuous affinity levels in grayscale.
  • Figure 1: UMAP of input features and hidden states of the DDC model's drug encoder (on the BindingDB dataset, random split).
  • Figure 2: Schematic representation of the model architectures used for AC (top) and DTI (bottom) tasks. Both tasks employ separate encoders for drugs (Compound Encoder) and targets (Protein Encoder). In the AC task, the model processes both drugs through the same Compound Encoder, concatenates their embeddings with the target embedding from the Protein Encoder, and predicts whether the pair represents an AC (labeled as 1) or non-AC (labeled as 0) for a given target. For the DTI task, drug and target embeddings are concatenated to predict drug-target affinity as a continuous value, leveraging transferred features from the AC task to improve prediction accuracy for novel compound-protein interactions.
  • Figure 2: Number of pairs per group in the test sets of the KIBA random (a) and compound-based (b) splits. Not all subgroups contain enough pairs, so those with fewer than 100 pairs are masked in gray. Counts cover pairs of compounds targeting the same protein, so targets with a measured affinity for only one drug are excluded.
  • Figure 3: Heatmap of $RMSE_{micro}$ for the best DTI model trained from scratch on the KIBA (left) and BindingDB (right) datasets under compound-based splits, with compound groups defined by similarity and affinity thresholds. Values are the mean ± standard deviation over 3 experiments. Groups with fewer than 100 pairs are masked in gray. The model performs well in groups containing both ACs and non-ACs, particularly in the bottom-left regions of the heatmaps. As non-ACs are progressively filtered out and the model faces harder AC cases, performance declines significantly, particularly in the upper-right areas where the affinity and similarity thresholds are high.
  • ...and 36 more figures
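
The threshold-grid evaluation described in the Figure 3 caption can be sketched as follows. This is an assumption-laden illustration, not the authors' code: "micro" is taken to mean that squared errors are pooled over every prediction in a group before the root is taken, and the field names `similarity`, `affinity_diff`, and `errors` are hypothetical. Only the 100-pair masking rule comes from the captions.

```python
import math


def rmse_micro(errors):
    """Root mean squared error over a pooled list of prediction errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def rmse_grid(pairs, sim_thresholds, diff_thresholds, min_pairs=100):
    """Pooled RMSE for every (similarity, affinity-difference) threshold cell.

    pairs: list of dicts with keys
      "similarity"    structural similarity of the two compounds,
      "affinity_diff" absolute difference of their true affinities,
      "errors"        DTI prediction errors of the two compounds.
    Cells with fewer than min_pairs pairs are masked with None.
    """
    grid = {}
    for s in sim_thresholds:
        for d in diff_thresholds:
            # Raising either threshold filters out non-cliff pairs.
            group = [p for p in pairs
                     if p["similarity"] >= s and p["affinity_diff"] >= d]
            if len(group) < min_pairs:
                grid[(s, d)] = None
                continue
            pooled = [e for p in group for e in p["errors"]]
            grid[(s, d)] = rmse_micro(pooled)
    return grid


# Hypothetical toy pairs; min_pairs is lowered so the small example is not masked.
toy_pairs = [
    {"similarity": 0.95, "affinity_diff": 2.0, "errors": (3.0, 4.0)},
    {"similarity": 0.50, "affinity_diff": 0.1, "errors": (0.0, 0.0)},
]
toy_grid = rmse_grid(toy_pairs, [0.0, 0.9], [0.0, 1.0], min_pairs=1)
```

In this sketch a compound that appears in several pairs contributes its error once per pair; whether the paper deduplicates such predictions is not stated in the captions.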