
Fast and Accurate Blind Flexible Docking

Zizhuo Zhang, Lijun Wu, Kaiyuan Gao, Jiangchao Yao, Tao Qin, Bo Han

TL;DR

The paper tackles blind flexible docking by introducing FABFlex, a regression-based multi-task model that jointly predicts binding pocket residues and the holo structures of both ligand and pocket from apo inputs. It combines three specialized modules (pocket prediction, holo-ligand docking, and holo-pocket docking) through an iterative coordinate-update mechanism, enabling end-to-end inference without external pocket detectors. On the PDBBind v2020 benchmark, FABFlex achieves strong ligand accuracy (e.g., ligand RMSD $< 2\AA$ in 40.59% of cases) and competitive pocket accuracy, while running approximately $208\times$ faster than the latest diffusion-based flexible docking method. The result is a practical, efficient approach for realistic docking scenarios that generalizes to unseen proteins and could help accelerate drug discovery.
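To make the headline metric concrete, the following is a minimal sketch of how a ligand RMSD and the "fraction of cases below $2\AA$" success rate could be computed. The function names `ligand_rmsd` and `success_rate` and all coordinates are illustrative assumptions, not taken from the paper or its code. The sketch also assumes that predicted and reference atoms are already in matching order and ignores molecular symmetry, which a real evaluation would likely need to handle.

```python
import numpy as np

def ligand_rmsd(pred: np.ndarray, ref: np.ndarray) -> float:
    """RMSD (in Å) between two (n_atoms, 3) arrays with matching atom order."""
    if pred.shape != ref.shape:
        raise ValueError("coordinate arrays must have the same shape")
    # Mean of squared per-atom distances, then square root.
    return float(np.sqrt(np.mean(np.sum((pred - ref) ** 2, axis=1))))

def success_rate(rmsds, threshold: float = 2.0) -> float:
    """Percentage of complexes whose RMSD is strictly below the threshold."""
    rmsds = np.asarray(rmsds, dtype=float)
    return float(100.0 * np.mean(rmsds < threshold))

# Toy data (hypothetical): a pose shifted by 1 Å along x from the reference.
ref = np.zeros((4, 3))
pred = ref + np.array([1.0, 0.0, 0.0])
toy_rmsd = ligand_rmsd(pred, ref)

# Toy per-complex RMSDs (hypothetical values, not results from the paper).
toy_rate = success_rate([0.5, 1.9, 2.0, 3.7])
print(toy_rmsd, toy_rate)
```

In this toy setup a value of exactly $2\AA$ does not count as a success, mirroring the strict "$<$" in the reported threshold.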

Abstract

Molecular docking, which predicts the bound structures of small molecules (ligands) to their protein targets, plays a vital role in drug discovery. However, existing docking methods often face limitations: they either overlook crucial structural changes by assuming protein rigidity or suffer from low computational efficiency due to their reliance on generative models for structure sampling. To address these challenges, we propose FABFlex, a fast and accurate regression-based multi-task learning model designed for realistic blind flexible docking scenarios, where proteins exhibit flexibility and binding pocket sites are unknown (blind). Specifically, FABFlex's architecture comprises three specialized modules working in concert: (1) a pocket prediction module that identifies potential binding sites, addressing the challenges inherent in blind docking scenarios; (2) a ligand docking module that predicts the bound (holo) structures of ligands from their unbound (apo) states; and (3) a pocket docking module that forecasts the holo structures of protein pockets from their apo conformations. Notably, FABFlex incorporates an iterative update mechanism that serves as a conduit between the ligand and pocket docking modules, enabling continuous structural refinements. This approach integrates the three subtasks of blind flexible docking (pocket identification, ligand conformation prediction, and protein flexibility modeling) into a unified, coherent framework. Extensive experiments on public benchmark datasets demonstrate that FABFlex not only achieves superior effectiveness in predicting accurate binding modes but also exhibits a significant speed advantage ($208\times$) compared to existing state-of-the-art methods. Our code is released at https://github.com/tmlr-group/FABFlex.
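The control flow described in the abstract (pocket prediction first, then ligand and pocket docking exchanging predictions over several rounds) can be sketched as below. This is only a structural illustration under strong assumptions: the three functions are toy geometric stand-ins for what the paper describes as learned modules, and every name and parameter here (`predict_pocket`, `dock_ligand`, `dock_pocket`, `k`, `n_iters`, the step sizes) is hypothetical rather than taken from the FABFlex code.

```python
import numpy as np

def predict_pocket(protein_xyz: np.ndarray, k: int = 10) -> np.ndarray:
    """Toy stand-in for pocket prediction: flag the k residues nearest
    the protein centroid. A learned classifier would go here instead."""
    dist = np.linalg.norm(protein_xyz - protein_xyz.mean(axis=0), axis=1)
    mask = np.zeros(len(protein_xyz), dtype=bool)
    mask[np.argsort(dist)[:k]] = True
    return mask

def dock_ligand(ligand_xyz: np.ndarray, pocket_xyz: np.ndarray) -> np.ndarray:
    """Toy stand-in for ligand docking: move the ligand halfway toward
    the current pocket centroid."""
    shift = pocket_xyz.mean(axis=0) - ligand_xyz.mean(axis=0)
    return ligand_xyz + 0.5 * shift

def dock_pocket(pocket_xyz: np.ndarray, ligand_xyz: np.ndarray) -> np.ndarray:
    """Toy stand-in for pocket docking: nudge each pocket residue
    slightly toward the current ligand centroid."""
    return pocket_xyz + 0.1 * (ligand_xyz.mean(axis=0) - pocket_xyz)

def blind_flexible_dock(apo_protein, apo_ligand, n_iters: int = 4):
    """(apo protein, apo ligand) -> (pocket mask, holo pocket, holo ligand)."""
    mask = predict_pocket(apo_protein)
    pocket, ligand = apo_protein[mask].copy(), apo_ligand.copy()
    for _ in range(n_iters):
        # Each module consumes the other's latest coordinates, so the
        # two predictions are refined jointly from round to round.
        new_ligand = dock_ligand(ligand, pocket)
        new_pocket = dock_pocket(pocket, ligand)
        ligand, pocket = new_ligand, new_pocket
    return mask, pocket, ligand

# Hypothetical toy inputs: 50 residue coordinates and an 8-atom ligand.
rng = np.random.default_rng(0)
protein = rng.normal(size=(50, 3)) * 10.0
ligand = rng.normal(size=(8, 3)) + 30.0
mask, holo_pocket, holo_ligand = blind_flexible_dock(protein, ligand)
```

The point of the sketch is the data flow, not the update rules: no external pocket detector is called, and the ligand and pocket coordinates are passed back and forth inside a single function.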

Paper Structure

This paper contains 42 sections, 11 equations, 13 figures, 11 tables, 1 algorithm.

Figures (13)

  • Figure 1: Two motivating cases, PDB 6HHJ and PDB 6OIM, highlight the structural discrepancy between apo proteins (predicted by AlphaFold2) and holo proteins. In both cases, the existing rigid docking method FABind (Pei et al., 2024) yields incorrect docking results when apo proteins are used as direct substitutes for the original holo proteins.
  • Figure 2: Overview of the proposed FABFlex model, which consists of a pocket prediction module, a ligand docking module, and a pocket docking module. The pocket prediction module identifies pocket residues within the protein. Based on the predicted binding pocket sites, the ligand docking module and pocket docking module predict the holo structures of the ligand and pocket, respectively. An iterative update mechanism facilitates the exchange of predictions between the ligand and pocket docking modules, enabling further coordinate refinements. These modules work together within a unified end-to-end model for the blind flexible docking scenario, i.e., "(apo protein, apo ligand) $\rightarrow$ (holo pocket, holo ligand)".
  • Figure 3: Pocket performance comparison for blind flexible docking. The left panel shows the cumulative percentage of pocket RMSD over all test cases, while the right panel evaluates only those cases whose protein receptors were unseen during training.
  • Figure 4: Case study of PDB 6OIM illustrating the iterative update process.
  • Figure 5: Two case studies of PDB 6OIM (left) and PDB 6ORT (right).
  • ...and 8 more figures
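
A cumulative-percentage curve of the kind described for Figure 3 can be computed from a list of per-complex RMSD values, as in the minimal sketch below. The function name `cumulative_percentage`, the thresholds, and the RMSD values are illustrative assumptions and are not data from the paper.

```python
import numpy as np

def cumulative_percentage(rmsds, thresholds) -> np.ndarray:
    """For each threshold t, the percentage of cases with RMSD <= t."""
    rmsds = np.sort(np.asarray(rmsds, dtype=float))
    # side="right" counts the sorted entries that are <= each threshold.
    counts = np.searchsorted(rmsds, thresholds, side="right")
    return 100.0 * counts / rmsds.size

# Hypothetical pocket RMSDs (Å) for four complexes.
toy_rmsds = [0.4, 0.9, 1.5, 3.0]
curve = cumulative_percentage(toy_rmsds, [0.5, 1.0, 2.0, 4.0])
print(curve)
```

Plotting `curve` against the thresholds gives a monotonically non-decreasing line; a method whose curve rises faster places more cases under any given RMSD cutoff.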