Integrating Protein Dynamics into Structure-Based Drug Design via Full-Atom Stochastic Flows

Xiangxin Zhou, Yi Xiao, Haowei Lin, Xinheng He, Jiaqi Guan, Yang Wang, Qiang Liu, Feng Zhou, Liang Wang, Jianzhu Ma

TL;DR

This paper incorporates protein pocket dynamics into structure-based drug design by introducing DynamicFlow, a full-atom, SE(3)-equivariant flow framework trained on MD-derived apo-holo pairs to jointly transform apo pockets into holo conformations and generate binding ligands. The method uses continuous flow matching for geometric coordinates and discrete flow matching for ligand bond types, comes in a deterministic (ODE) and a stochastic (SDE) variant, the latter intended to improve robustness, and is built on a multiscale architecture that combines atom-level EGNNs with residue-level Transformers. Key contributions include a curated MISATO-based dataset, a full-atom flow model that captures backbone translations, side-chain torsions, and ligand topology, and demonstrations that generated holo-like pockets improve the performance of traditional SBDD methods while yielding promising ligands with favorable pharmacokinetic properties. The approach supports practical drug discovery by providing physically informed holo-pocket inputs and end-to-end generative capabilities that account for protein dynamics and induced-fit effects.

Abstract

The dynamic nature of proteins, influenced by ligand interactions, is essential for comprehending protein function and progressing drug discovery. Traditional structure-based drug design (SBDD) approaches typically target binding sites with rigid structures, limiting their practical application in drug development. While molecular dynamics simulation can theoretically capture all the biologically relevant conformations, the transition rate is dictated by the intrinsic energy barrier between them, making the sampling process computationally expensive. To overcome the aforementioned challenges, we propose to use generative modeling for SBDD considering conformational changes of protein pockets. We curate a dataset of apo and multiple holo states of protein-ligand complexes, simulated by molecular dynamics, and propose a full-atom flow model (and a stochastic version), named DynamicFlow, that learns to transform apo pockets and noisy ligands into holo pockets and corresponding 3D ligand molecules. Our method uncovers promising ligand molecules and corresponding holo conformations of pockets. Additionally, the resultant holo-like states provide superior inputs for traditional SBDD approaches, playing a significant role in practical drug discovery.
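
The idea of learning a flow that carries an apo pocket to a holo pocket can be illustrated with a minimal flow-matching sketch. It is not the authors' implementation: it assumes plain Cartesian coordinates and a straight-line interpolant, whereas the paper describes residue frames, side-chain torsions, and discrete ligand variables. The names `interpolate`, `target_velocity`, `euler_integrate`, and `oracle` are hypothetical, and `oracle` stands in for a trained network by returning the exact regression target.

```python
import random

def interpolate(x0, x1, t):
    """Point on the straight path x_t = (1 - t) * x0 + t * x1 for each atom."""
    return [[(1.0 - t) * a + t * b for a, b in zip(p0, p1)]
            for p0, p1 in zip(x0, x1)]

def target_velocity(x0, x1):
    """Velocity of the straight path, x1 - x0: the regression target in flow matching."""
    return [[b - a for a, b in zip(p0, p1)] for p0, p1 in zip(x0, x1)]

def euler_integrate(x0, velocity_fn, n_steps=20):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with explicit Euler steps."""
    x = [list(p) for p in x0]
    dt = 1.0 / n_steps
    for k in range(n_steps):
        v = velocity_fn(x, k * dt)
        x = [[c + dt * vc for c, vc in zip(p, vp)] for p, vp in zip(x, v)]
    return x

# Toy data (assumption): five "atoms" with made-up apo and holo coordinates.
random.seed(0)
apo = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(5)]
holo = [[c + random.gauss(0.0, 0.5) for c in p] for p in apo]

# Stand-in for a trained velocity network: returns the exact target velocity.
def oracle(x, t):
    return target_velocity(apo, holo)

generated = euler_integrate(apo, oracle, n_steps=20)
max_error = max(abs(g - h) for pg, ph in zip(generated, holo)
                for g, h in zip(pg, ph))
```

In training, a network would be fitted to `target_velocity` at randomly sampled times `t` and interpolated states `interpolate(apo, holo, t)`. A stochastic variant would add a noise term to each integration step.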

Paper Structure

This paper contains 26 sections, 28 equations, 16 figures, 9 tables.

Figures (16)

  • Figure 1: Comparison of Abl kinase domain conformations. In the top panel, the transition between the apo active and apo inactive conformations is shown. In the bottom panel, the active conformation with Dasatinib bound (DFG-in state) is compared to the inactive conformation with Imatinib bound (DFG-out state). The transformations between these states highlight the structural shifts critical for ligand binding.
  • Figure 2: Overview of DynamicFlow. (a) Our dataset consists of apo and multiple holo states of protein-ligand complexes derived from molecular dynamics simulation. (b) Our flow models, DynamicFlow-ODE and DynamicFlow-SDE, model the generative process of ligand molecules jointly with the protein dynamics from apo to holo. The protein pocket is represented as both (i) residue frames and side-chain torsions and (ii) full atoms. The ligand molecule is represented as atom types, bond types, and atom positions.
  • Figure 3: Illustration of our multiscale full-atom model architecture: (a) atom-level SE(3)-equivariant graph neural network; (b) residue-level Transformers.
  • Figure 4: Distribution of the difference in the number of protein-ligand non-covalent interactions between complexes of TargetDiff-designed ligands with apo pockets or our generated pockets and the ground-truth holo-ligand complexes.
  • Figure 5: Cover ratio and minimum RMSD against holo states as a function of the number of samples.
  • ...and 11 more figures