F$^3$low: Frame-to-Frame Coarse-grained Molecular Dynamics with SE(3) Guided Flow Matching

Shaoning Li, Yusong Wang, Mingyu Li, Jian Zhang, Bin Shao, Nanning Zheng, Jian Tang

TL;DR

The paper tackles the challenge of efficiently exploring protein conformational space in molecular dynamics by combining coarse-grained MD with generative modeling on the $SE(3)^N$ manifold. It introduces F$^3$low, a frame-to-frame, diffusion-like model with SE(3) guided flow matching that generates each backbone frame conditioned on the previous one, using geodesic interpolation between frames and an SE(3) conditional flow matching objective. Across Chignolin, Trpcage, and Homeodomain, CG-F$^3$low explores the free energy surface more broadly and preserves unstructured states better than CG-MLFF, with backbone RMSDs comparable to reference MD. This SE(3)-aware generative sampling enables efficient exploration of conformational landscapes and lays the groundwork for future all-atom extensions and the inclusion of side-chain torsions.
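To make "geodesic interpolation between frames" concrete, here is a minimal sketch under stated assumptions: each residue frame is represented as a unit quaternion plus a translation, the rotation follows the SO(3) geodesic (spherical linear interpolation), and the translation is interpolated linearly. The function names and the quaternion representation are illustrative choices, not taken from the paper, whose exact parametrization may differ.

```python
import math

def slerp(q0, q1, t):
    """Point at time t on the SO(3) geodesic between unit quaternions (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:
        # q and -q encode the same rotation; flip to follow the shorter arc.
        q1 = tuple(-c for c in q1)
        dot = -dot
    dot = min(1.0, dot)
    theta = math.acos(dot)  # angle between the two quaternions
    if theta < 1e-8:
        # Nearly identical rotations: avoid dividing by sin(0).
        return tuple(q0)
    s0 = math.sin((1.0 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return tuple(s0 * a + s1 * b for a, b in zip(q0, q1))

def interpolate_frame(frame0, frame1, t):
    """Interpolate one rigid frame (quaternion, translation) for t in [0, 1]."""
    (q0, x0), (q1, x1) = frame0, frame1
    q_t = slerp(q0, q1, t)
    x_t = tuple((1.0 - t) * a + t * b for a, b in zip(x0, x1))
    return q_t, x_t

def interpolate_backbone(frames0, frames1, t):
    """Apply the per-residue interpolation to all N frames (a point on SE(3)^N)."""
    return [interpolate_frame(f0, f1, t) for f0, f1 in zip(frames0, frames1)]

# Hypothetical example: identity frame to a 90-degree rotation about z.
identity = ((1.0, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0))
rotated = ((math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4)), (2.0, 4.0, 6.0))
halfway = interpolate_frame(identity, rotated, 0.5)
```

In this example, `halfway` is a 45-degree rotation about z with translation (1, 2, 3), the midpoint of the path between the two frames.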

Abstract

Molecular dynamics (MD) is a crucial technique for simulating biological systems, enabling the exploration of their dynamic nature and fostering an understanding of their functions and properties. To address exploration inefficiency, emerging enhanced sampling approaches such as coarse-graining (CG) and generative models have been employed. In this work, we propose a Frame-to-Frame generative model with guided Flow-matching (F$^3$low) for enhanced sampling, which (a) extends the domain of CG modeling to the SE(3) Riemannian manifold; (b) recasts CGMD simulation as autoregressive sampling guided by the previous frame via flow-matching models; and (c) targets the protein backbone, offering improved insights into secondary structure formation and intricate folding pathways. Compared to previous methods, F$^3$low allows for broader exploration of conformational space. The ability to rapidly generate diverse conformations via a force-free generative paradigm on SE(3) paves the way toward efficient enhanced sampling methods.
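The autoregressive view in point (b) can be sketched as a rollout loop: each new frame is produced by integrating a velocity field conditioned on the previous frame, and the result becomes the condition for the next step. This is a simplified illustration, not the paper's implementation. It works on flat coordinate lists rather than SE(3) frames, assumes the prior sample is the previous frame plus Gaussian noise, and uses a hypothetical `toy_velocity` in place of a trained network.

```python
import random

def sample_next_frame(prev_frame, velocity_fn, n_steps=10, noise_scale=0.1, rng=random):
    """Generate one frame by Euler-integrating a velocity field from t=0 to t=1."""
    # Assumption: the prior sample is the previous frame perturbed by Gaussian noise.
    x = [c + rng.gauss(0.0, noise_scale) for c in prev_frame]
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt
        v = velocity_fn(x, t, prev_frame)  # velocity conditioned on the previous frame
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

def rollout(first_frame, velocity_fn, n_frames, **kwargs):
    """Autoregressive trajectory: every generated frame conditions the next one."""
    trajectory = [list(first_frame)]
    for _ in range(n_frames):
        trajectory.append(sample_next_frame(trajectory[-1], velocity_fn, **kwargs))
    return trajectory

def toy_velocity(x, t, prev_frame):
    """Hypothetical stand-in for a trained model.

    Uses the straight-line conditional velocity (target - x) / (1 - t),
    with an arbitrary target of prev_frame shifted by 1.0 per coordinate.
    """
    target = [p + 1.0 for p in prev_frame]
    return [(g - xi) / (1.0 - t) for g, xi in zip(target, x)]

trajectory = rollout([0.0, 0.0], toy_velocity, n_frames=3)
```

No forces are evaluated anywhere in the loop. The only inputs are the previous frame and the velocity field, which is what distinguishes this force-free paradigm from a force-field-driven CGMD integrator.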

Paper Structure (15 sections, 7 equations, 4 figures, 4 tables, 2 algorithms)

Figures (4)

  • Figure 1: Overview of the enhanced sampling methods: the traditional CGMD pipeline (panel a) and generative models (e.g., F$^3$low, panel b). Traditional CGMD relies on empirical forces applied to each CG bead to calculate the next frame. In contrast, generative models directly sample the next frame from a prior distribution, bypassing the need for explicit force calculations.
  • Figure 2: Comparison of the free energy surface across reference, CG-MLFF and CG-F$^3$low simulations. The crystal structure (gray) and the lowest-RMSD structure from the CG-F$^3$low simulation (colored) are shown on the left. The corresponding Min. RMSD can be found in Table \ref{table:min_rmsd}.
  • Figure 3: Individual trajectory visualization for Chignolin, Trpcage and Homeodomain. In each visual representation, the simulation trajectory traverses the free energy landscape, depicted in a gradient from purple to yellow.
  • Figure 4: Simulation analysis of Homeodomain. a. The free energy surface of Homeodomain mapped onto the first two principal components derived from tICA for the reference all-atom MD simulations (left), the CG-MLFF simulations (center) and the CG-F$^3$low simulations (right). Distinct macrostates on the landscape are highlighted by circles in different colors: cyan for the initial state, orange for the intermediate state, and pink for the native state. b. The propensity of the three secondary structural elements of the Homeodomain across the macrostates, measured as the percentage of conformations within an RMSD threshold of 2 Å for each helix. c. The representative conformations from the macrostates identified in the CG-F$^3$low simulations correspond to the free energy minima indicated by the same color coding. Transparent structures reveal additional diverse conformations from the same state. Arrows represent the main pathways transitioning from the initial unstructured state to the native folded state.
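
The helix propensity in Figure 4b reduces to simple arithmetic: for each helix, count the conformations in a macrostate whose helix RMSD falls under the 2 Å threshold and express that count as a percentage. The sketch below assumes the per-helix RMSD values have already been computed and treats the threshold as a strict inequality. The function name and the toy numbers are hypothetical and are not results from the paper.

```python
def helix_propensity(rmsds_by_helix, threshold=2.0):
    """Percentage of conformations whose per-helix RMSD (in Å) is below threshold."""
    return {
        helix: 100.0 * sum(r < threshold for r in rmsds) / len(rmsds)
        for helix, rmsds in rmsds_by_helix.items()
    }

# Toy RMSD values (Å) for one macrostate; illustrative only.
toy_macrostate = {
    "helix1": [0.8, 1.5, 2.4, 3.1],
    "helix2": [1.9, 1.2, 1.1, 0.7],
}
propensities = helix_propensity(toy_macrostate)
```

With these toy values, two of the four conformations pass the threshold for `helix1` (50%) and all four pass for `helix2` (100%).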