FABind+: Enhancing Molecular Docking through Improved Pocket Prediction and Pose Generation

Kaiyuan Gao, Qizhi Pei, Gongbo Zhang, Jinhua Zhu, Kun He, Lijun Wu

TL;DR

FABind+ targets the pocket-prediction bottleneck in protein–ligand docking by introducing a dynamic pocket radius, permutation-aware docking, and a light sampling pathway that can be activated without retraining. It combines a regression backbone with clustering and dropout sampling plus a lightweight confidence model to produce diverse poses and select high-quality conformations, achieving superior or competitive accuracy with faster inference than leading sampling methods on the PDBBind v2020 benchmark. Key contributions include dynamic radius prediction, permutation-invariant loss, pocket-variant sampling via DBSCAN, dropout-based conformation generation, and an on-the-fly confidence scorer, all validated on blind docking tasks and unseen proteins. The approach has practical impact for rapid and reliable docking in drug discovery, offering both a fast regression pipeline and a scalable sampling mode without heavy generative training.

Abstract

Molecular docking is a pivotal process in drug discovery. While traditional techniques rely on extensive sampling and simulation governed by physical principles, these methods are often slow and costly. The advent of deep learning-based approaches has shown significant promise, offering increases in both accuracy and efficiency. Building upon the foundational work of FABind, a model designed with a focus on speed and accuracy, we present FABind+, an enhanced iteration that substantially boosts the performance of its predecessor. We identify pocket prediction as a critical bottleneck in molecular docking and propose a novel methodology that significantly refines pocket prediction, thereby streamlining the docking process. Furthermore, we introduce modifications to the docking module to enhance its pose generation capabilities. In an effort to bridge the gap with conventional sampling/generative methods, we incorporate a simple yet effective sampling technique coupled with a confidence model, requiring only minor adjustments to the regression framework of FABind. Experimental results and analysis reveal that FABind+ remarkably outperforms the original FABind, achieves competitive state-of-the-art performance, and delivers insightful modeling strategies. This demonstrates that FABind+ represents a substantial step forward in molecular docking and drug discovery. Our code is available at https://github.com/QizhiPei/FABind.

Paper Structure (27 sections, 5 equations, 13 figures, 6 tables)

Figures (13)

  • Figure 1: Pocket prediction is critical for docking. Left: a good case with correct pocket and docking pose prediction. Right: a bad case with incorrect pocket prediction.
  • Figure 2: Overall framework of FABind+. Left: Pipeline of FABind with the newly proposed dynamic pocket radius prediction and permutation loss module. The protein-ligand complex graph, with pink and blue nodes representing protein and ligand atoms respectively, is first processed by the pocket prediction module to identify the binding pocket. The identified pocket (highlighted in dark pink) is then utilized for docking in the pose prediction module. Right: The sampling version of FABind+, which contains a pocket clustering module, a conformation sampling module, and a confidence selection module.
  • Figure 3: Analysis of pocket prediction and predicted ligand RMSD. "Max Distance" is the distance between the predicted pocket center and the farthest ground truth ligand atom.
  • Figure 4: Confidence model training pipeline. We add lightweight MLP layers on top of the fixed FABind+ model to form the confidence model, which is trained with a ranking loss and a classification loss.
  • Figure 5: Scaling curve with increasing sample size. "Perfect Selection" refers to choosing the samples with the lowest RMSD at each given sample size.
  • ...and 8 more figures
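
The captions for Figures 3 and 5 define two quantities in words: "Max Distance" and "Perfect Selection". The sketch below is one possible reading of those definitions, not the authors' code. The function names `max_distance` and `perfect_selection_rmsd` are hypothetical, coordinates are assumed to be plain 3D tuples, and "at each given sample size" is assumed to mean the first n samples drawn.

```python
import math
from typing import Sequence

Point = Sequence[float]


def max_distance(pocket_center: Point, ligand_atoms: Sequence[Point]) -> float:
    """Distance from the predicted pocket center to the farthest
    ground-truth ligand atom (the Figure 3 caption's "Max Distance")."""
    return max(math.dist(pocket_center, atom) for atom in ligand_atoms)


def perfect_selection_rmsd(sample_rmsds: Sequence[float], sample_size: int) -> float:
    """Oracle selection: the lowest RMSD among the first `sample_size`
    samples (assumed reading of the Figure 5 caption)."""
    if not 1 <= sample_size <= len(sample_rmsds):
        raise ValueError("sample_size must be between 1 and len(sample_rmsds)")
    return min(sample_rmsds[:sample_size])


# Toy data, purely illustrative (not results from the paper).
center = (0.0, 0.0, 0.0)
ligand = [(1.0, 0.0, 0.0), (0.0, 3.0, 4.0), (2.0, 2.0, 1.0)]
farthest = max_distance(center, ligand)

rmsds = [4.2, 1.8, 3.1, 0.9, 2.5]
oracle_curve = [perfect_selection_rmsd(rmsds, n) for n in range(1, len(rmsds) + 1)]
```

Under this reading, the oracle curve can only stay flat or decrease as the sample size grows, which is why it serves as an upper bound on what any confidence-based selection could achieve for the same samples.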