ETDock: A Novel Equivariant Transformer for Protein-Ligand Docking

Yiqiang Yi, Xu Wan, Yatao Bian, Le Ou-Yang, Peilin Zhao

TL;DR

ETDock introduces an equivariant transformer framework for protein–ligand docking. It fuses atom-level and graph-level ligand features and processes multi-level interactions through a TAMformer module composed of Triangle, Attention, and Message layers. The model predicts a protein–ligand distance matrix and iteratively refines ligand poses from it. Training is guided by two losses: a distance-matrix RMSE and a self-confidence term over candidate pockets. Key contributions include a feature fusion block for integrating ligand features, a message layer that exchanges invariant and equivariant information, and a geometry-aware optimization pipeline for generating binding poses. On the PDBbind v2020 dataset, ETDock reports state-of-the-art pose accuracy (e.g., 23.3% of predictions below 2 Å and 61.1% below 5 Å), and ablation studies support the contribution of each component.
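
The distinction between invariant and equivariant information can be illustrated with a small NumPy check. This is only a sketch of the general principle, not ETDock's message layer: the function names (`invariant_scalars`, `gated_vectors`) and the feature shapes are assumptions made for illustration. It shows that inner products of vector features do not change when the input is rotated, while vector features scaled by scalar gates rotate together with the input.

```python
import numpy as np

def random_rotation(rng):
    """Draw a random 3x3 rotation matrix via QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))      # fix column signs of the orthogonal factor
    if np.linalg.det(q) < 0:         # force a proper rotation (det = +1)
        q[:, 0] = -q[:, 0]
    return q

def invariant_scalars(vec):
    """Inner products between vector channels; vec has shape (n_atoms, C, 3)."""
    return np.einsum("ncd,nkd->nck", vec, vec)

def gated_vectors(vec, gate):
    """Scale each vector channel by a scalar gate of shape (n_atoms, C)."""
    return vec * gate[..., None]

rng = np.random.default_rng(0)
vec = rng.normal(size=(5, 4, 3))     # hypothetical per-atom vector features
gate = rng.normal(size=(5, 4))       # hypothetical per-atom scalar features
R = random_rotation(rng)
vec_rot = vec @ R.T                  # rotate every vector feature

# Invariance: scalars computed before and after rotation agree.
scalars_match = np.allclose(invariant_scalars(vec), invariant_scalars(vec_rot))

# Equivariance: gating then rotating equals rotating then gating.
vectors_match = np.allclose(gated_vectors(vec, gate) @ R.T,
                            gated_vectors(vec_rot, gate))
print(scalars_match, vectors_match)
```

Building scalar updates only from such inner products and vector updates only from scalar-weighted vectors is one common way to keep a network's predictions consistent under rigid rotations of the complex.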

Abstract

Predicting the docking between proteins and ligands is a crucial and challenging task for drug discovery. However, traditional docking methods mainly rely on scoring functions, and deep learning-based docking approaches usually neglect the 3D spatial information of proteins and ligands, as well as the graph-level features of ligands, which limits their performance. To address these limitations, we propose an equivariant transformer neural network for protein-ligand docking pose prediction. Our approach involves the fusion of ligand graph-level features by feature processing, followed by the learning of ligand and protein representations using our proposed TAMformer module. Additionally, we employ an iterative optimization approach based on the predicted distance matrix to generate refined ligand poses. The experimental results on real datasets show that our model can achieve state-of-the-art performance.
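
The idea of generating a pose from a predicted distance matrix can be sketched as a small optimization problem. The code below is a minimal illustration under stated assumptions, not the authors' algorithm: it uses plain gradient descent on the squared distance error, the "predicted" matrix is simply computed from a synthetic ground-truth pose, and the names `refine_pose`, `steps`, and `lr` are hypothetical. It also ignores the ligand's internal geometry, which a practical pipeline would presumably need to preserve.

```python
import numpy as np

def pairwise_distances(ligand, protein):
    """Protein-ligand distance matrix of shape (n_ligand, n_protein)."""
    diff = ligand[:, None, :] - protein[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def distance_rmse(ligand, protein, target):
    """RMSE between the current and the target distance matrix."""
    err = pairwise_distances(ligand, protein) - target
    return float(np.sqrt(np.mean(err ** 2)))

def refine_pose(ligand, protein, target, steps=300, lr=0.5, eps=1e-8):
    """Gradient descent on ligand coordinates to match the target distances."""
    x = ligand.copy()
    for _ in range(steps):
        diff = x[:, None, :] - protein[None, :, :]
        dist = np.linalg.norm(diff, axis=-1)
        residual = dist - target
        # Gradient of the per-atom mean squared distance error w.r.t. x.
        grad = 2.0 * np.mean((residual / (dist + eps))[..., None] * diff, axis=1)
        x -= lr * grad
    return x

rng = np.random.default_rng(0)
directions = rng.normal(size=(30, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
protein = directions * rng.uniform(6.0, 12.0, size=(30, 1))  # synthetic pocket atoms
true_ligand = rng.uniform(-2.0, 2.0, size=(8, 3))            # synthetic bound pose

target = pairwise_distances(true_ligand, protein)  # stand-in for a model prediction
start = true_ligand + rng.normal(scale=0.5, size=(8, 3))

initial_rmse = distance_rmse(start, protein, target)
refined = refine_pose(start, protein, target)
final_rmse = distance_rmse(refined, protein, target)
print(initial_rmse, final_rmse)
```

Because the target here is exact, the refined coordinates return close to the synthetic pose; with a real, noisy prediction the optimizer would only find the pose that best agrees with the predicted distances.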

Paper Structure

This paper contains 27 sections, 29 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: The overall framework of our ETDock model. (a) Feature Processing module: fuses the atom-level and graph-level features of ligands and computes the interaction features between ligands and proteins via an outer product. (b) TAMformer module: consists of a triangle layer, an attention layer, and a message layer; the triangle layer captures the physical constraints between ligands and proteins. (c) Attention layer: promotes the learning of ligand and protein features from the interaction features. (d) Message layer: performs message passing on the scalar and equivariant vector features of the ligand, protein, and protein-ligand interaction features to exchange information among them. (e) Scale module: updates the scalar information of the ligand, protein, and protein-ligand interaction features through mutual interaction. (f) Vector module: obtains relative features of the vectors and computes scalar features through inner products of the vector features.
  • Figure 2: The workflow for generating the ligand binding pose.
  • Figure 3: The experimental results on the hyperparameter $\beta$.
  • Figure 4: The experimental results on the number of iterations during ligand pose generation. We use the average ligand RMSD to assess the impact of the number of iterations in the optimization algorithm.
  • Figure 5: Frequency histograms of ligand RMSD (left) and centroid distance (right) predicted by ETDock on the test set. The histograms show the distribution of the RMSD and the centroid distance of the predicted ligands.
  • ...and 2 more figures
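
The two quantities plotted in Figure 5, and threshold statistics such as the share of predictions below 2 Å, can be computed as in the sketch below. This is an assumed, simplified formulation: coordinates are compared directly with the same atom ordering, without alignment or symmetry correction, and the function names are illustrative rather than taken from the paper.

```python
import numpy as np

def ligand_rmsd(pred, true):
    """Root-mean-square deviation over matched ligand atoms, shape (n_atoms, 3)."""
    return float(np.sqrt(np.mean(np.sum((pred - true) ** 2, axis=-1))))

def centroid_distance(pred, true):
    """Distance between the mean positions of the two poses."""
    return float(np.linalg.norm(pred.mean(axis=0) - true.mean(axis=0)))

def fraction_below(values, threshold):
    """Share of values strictly below a threshold (e.g. 2 Å)."""
    return float(np.mean(np.asarray(values) < threshold))

# Toy example: the predicted pose is the true pose shifted by (3, 4, 0).
true = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
pred = true + np.array([3.0, 4.0, 0.0])

rmsd = ligand_rmsd(pred, true)            # rigid shift of length 5
cdist = centroid_distance(pred, true)     # centroid moves by the same shift
frac = fraction_below([1.0, 3.0, 6.0, 1.5], 2.0)  # two of four values are below 2
print(rmsd, cdist, frac)
```

For a pure translation the two metrics coincide; they differ when the predicted pose is rotated or distorted, which is why both are usually reported.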