Apo2Mol: 3D Molecule Generation via Dynamic Pocket-Aware Diffusion Models

Xinzhe Zheng, Shiyu Jiang, Gustavo Seabra, Chenglong Li, Yanjun Li

TL;DR

Apo2Mol is a diffusion-based framework that jointly generates 3D ligands and holo-pocket conformations from apo protein states, directly addressing pocket flexibility in structure-based drug design (SBDD). It leverages a large, experimentally resolved apo–holo dataset and an SE(3)-equivariant hierarchical graph to model detailed ligand–pocket interactions and residue-level pocket dynamics without relying on molecular dynamics (MD) simulations. The method demonstrates state-of-the-art binding affinity and drug-likeness metrics in apo-to-holo generation and maintains competitive performance when baselines are evaluated on holo pockets, while ablations confirm the importance of the complex graph and quaternion-based transformations. These results advance practical SBDD by enabling dynamic, data-driven generation of ligand–pocket complexes from apo structures, with implications for targets lacking bound-state templates. However, a remaining gap in reproducing some fine-grained pocket conformations suggests future work on broader pretraining and refinement strategies to further close the holo-pocket distribution gap.

Abstract

Deep generative models are rapidly advancing structure-based drug design, offering substantial promise for generating small molecule ligands that bind to specific protein targets. However, most current approaches assume a rigid protein binding pocket, neglecting the intrinsic flexibility of proteins and the conformational rearrangements induced by ligand binding, limiting their applicability in practical drug discovery. Here, we propose Apo2Mol, a diffusion-based generative framework for 3D molecule design that explicitly accounts for conformational flexibility in protein binding pockets. To support this, we curate a dataset of over 24,000 experimentally resolved apo-holo structure pairs from the Protein Data Bank, enabling the characterization of protein structure changes associated with ligand binding. Apo2Mol employs a full-atom hierarchical graph-based diffusion model that simultaneously generates 3D ligand molecules and their corresponding holo pocket conformations from input apo states. Empirical studies demonstrate that Apo2Mol can achieve state-of-the-art performance in generating high-affinity ligands and accurately capture realistic protein pocket conformational changes.
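
To make the joint corruption of ligand and pocket concrete, the following is a minimal sketch of the forward process as the Figure 2 caption describes it: Gaussian noise is injected into the ligand while the pocket is linearly interpolated from its holo conformation toward the apo state. Everything beyond that description is an assumption made for illustration: the function names `interpolate_pocket` and `noise_ligand`, the linear weight `t / num_steps`, the DDPM-style `alpha_bar` scaling, and the choice to interpolate raw atom coordinates. The ablations mention quaternion-based transformations, so the paper's own parameterization likely differs; ligand atom types are also omitted here.

```python
import math
import random


def interpolate_pocket(holo_coords, apo_coords, t, num_steps):
    """Move pocket atoms linearly from holo (t = 0) to apo (t = num_steps).

    Assumption: plain coordinate-wise interpolation over matched atoms.
    """
    if len(holo_coords) != len(apo_coords):
        raise ValueError("apo and holo pockets must have matched atoms")
    w = t / num_steps  # 0.0 keeps the holo pocket, 1.0 gives the apo pocket
    return [
        tuple((1.0 - w) * h + w * a for h, a in zip(holo_atom, apo_atom))
        for holo_atom, apo_atom in zip(holo_coords, apo_coords)
    ]


def noise_ligand(ligand_coords, alpha_bar, rng):
    """Corrupt ligand coordinates with Gaussian noise.

    Assumption: the standard DDPM form
    x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps.
    """
    signal = math.sqrt(alpha_bar)
    sigma = math.sqrt(1.0 - alpha_bar)
    return [
        tuple(signal * x + sigma * rng.gauss(0.0, 1.0) for x in atom)
        for atom in ligand_coords
    ]


# Toy two-atom pocket and one-atom ligand (hypothetical coordinates).
holo = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
apo = [(1.0, 0.0, 0.0), (2.0, 2.0, 0.0)]
halfway_pocket = interpolate_pocket(holo, apo, t=5, num_steps=10)
noisy_ligand = noise_ligand([(0.5, 0.5, 0.5)], alpha_bar=0.5, rng=random.Random(0))
```

Under these assumptions the corrupted endpoint pairs the apo pocket with a noised ligand, so the reverse (generative) process can start from an apo input and denoise toward a holo pocket and ligand.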

Paper Structure

This paper contains 42 sections, 14 equations, 10 figures, 6 tables, 2 algorithms.

Figures (10)

  • Figure 1: Conformational changes between the apo (unbound) and holo (ligand-bound) states, illustrating ligand-induced structural dynamics.
  • Figure 2: Schematic overview of Apo2Mol. The diffusion process gradually corrupts an experimental holo pocket–ligand pair by injecting noise into the ligand and linearly interpolating the pocket from its holo conformation toward the apo state. The generative process learns to denoise the corrupted inputs and recover the joint distribution of the holo pocket conformations and their corresponding ligand types and poses. Structural deviations between the apo and holo conformations are illustrated in red.
  • Figure 3: Illustration of the Apo2Mol framework, which jointly models ligand generation and protein pocket refinement through hierarchical graph message passing.
  • Figure 4: Comparison of carbon-carbon pair distance distributions for ground-truth molecules in the test set (gray) and model-generated molecules (color). The Jensen–Shannon divergence (JSD) between the two distributions is reported.
  • Figure 5: RMSD distribution between apo and holo, and between apo and generated pockets. JSD denotes the Jensen–Shannon divergence between the two distributions.
  • ...and 5 more figures
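
Figures 4 and 5 both summarize agreement between two distributions with the Jensen–Shannon divergence. The sketch below shows one common way to compute it from two samples (for example, distances or RMSD values) by binning them on shared edges. The bin edges, the base-2 logarithm (which bounds the result to [0, 1]), and the helper names are assumptions for illustration; the paper's exact binning and logarithm base are not stated here. Note that some libraries, such as SciPy's `jensenshannon`, return the square root of this quantity (the Jensen–Shannon distance) instead.

```python
import math


def histogram(values, bin_edges):
    """Normalized histogram on shared bins; out-of-range values are dropped."""
    counts = [0] * (len(bin_edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            in_bin = bin_edges[i] <= v < bin_edges[i + 1]
            # The last bin is closed on the right so the top edge is counted.
            if in_bin or (is_last and v == bin_edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    if total == 0:
        raise ValueError("no values fall inside the bin range")
    return [c / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i = 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """JSD(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), with m = (p + q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical RMSD samples in angstroms, binned on shared edges.
edges = [0.0, 1.0, 2.0, 3.0]
reference = histogram([0.4, 0.8, 1.2, 1.5, 2.5], edges)
generated = histogram([0.3, 0.6, 0.9, 1.1, 1.8], edges)
jsd = jensen_shannon_divergence(reference, generated)
```

A value of 0 means the two binned distributions are identical; with base-2 logarithms, 1 means they share no bins.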