Stochastic Optimal Control for Collective Variable Free Sampling of Molecular Transition Paths

Lars Holdijk, Yuanqi Du, Ferry Hooft, Priyank Jaini, Bernd Ensing, Max Welling

TL;DR

The paper tackles sampling molecular transition paths between metastable states under high-energy barriers where conventional MD struggles. It proposes PIPS, a CV-free approach that casts the problem as a Schrödinger Bridge and solves it via stochastic optimal control, specifically using Path Integral Control and the PICE framework to learn a bias that guides trajectories toward target states. The method is adapted for molecular dynamics with bias potential or force representations, smoothed loss functions, and integration with OpenMM, and it is demonstrated on Alanine Dipeptide, Polyproline, and Chignolin, showing CV-free success and competitive alignment with CV-based baselines. This CV-agnostic approach enables scalable transition-path sampling without relying on expert CV selection, potentially accelerating exploration of conformational changes in larger biomolecular systems.

Abstract

We consider the problem of sampling transition paths between two given metastable states of a molecular system, e.g. a folded and unfolded protein or products and reactants of a chemical reaction. Due to the existence of high energy barriers separating the states, these transition paths are unlikely to be sampled with standard Molecular Dynamics (MD) simulation. Traditional methods to augment MD with a bias potential to increase the probability of the transition rely on a dimensionality reduction step based on Collective Variables (CVs). Unfortunately, selecting appropriate CVs requires chemical intuition, and traditional methods are therefore not always applicable to larger systems. Additionally, when incorrect CVs are used, the bias potential might not be minimal and might bias the system along dimensions irrelevant to the transition. Showing a formal relation between the problem of sampling molecular transition paths, the Schrödinger bridge problem, and stochastic optimal control with neural network policies, we propose a machine learning method for sampling said transitions. Unlike previous non-machine-learning approaches, our method, named PIPS, does not depend on CVs. We show that our method successfully generates low energy transitions for Alanine Dipeptide as well as the larger Polyproline and Chignolin proteins.
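To make the role of a bias concrete, the following is a minimal toy sketch and not the PIPS method itself. It assumes a one-dimensional double-well potential, overdamped Langevin dynamics integrated with Euler-Maruyama, and a hand-written linear bias force that pulls toward the target well. The names `double_well_force`, `simulate`, and `bias_strength`, as well as all numeric settings, are illustrative assumptions. In PIPS the bias is instead learned by a neural network policy and acts on the full molecular configuration rather than on a chosen coordinate.

```python
import numpy as np


def double_well_force(x):
    # Toy potential U(x) = (x^2 - 1)^2 with minima at x = -1 and x = +1.
    # The force is -dU/dx.
    return -4.0 * x * (x**2 - 1.0)


def simulate(bias_strength, n_paths=200, n_steps=500, dt=1e-3, kT=0.1, seed=0):
    """Integrate biased overdamped Langevin dynamics with Euler-Maruyama.

    Returns the fraction of paths that end within 0.3 of the target x = +1.
    """
    rng = np.random.default_rng(seed)
    x = np.full(n_paths, -1.0)  # every path starts in the left well
    for _ in range(n_steps):
        # Hypothetical hand-written bias force pulling toward the target;
        # a learned policy would take its place in a real method.
        bias = bias_strength * (1.0 - x)
        noise = rng.standard_normal(n_paths)
        x = x + (double_well_force(x) + bias) * dt + np.sqrt(2.0 * kT * dt) * noise
    return float(np.mean(np.abs(x - 1.0) < 0.3))


# Compare the success rate without and with the bias force.
unbiased_fraction = simulate(bias_strength=0.0)
biased_fraction = simulate(bias_strength=10.0)
print("unbiased:", unbiased_fraction, "biased:", biased_fraction)
```

With a barrier of roughly ten times the thermal energy, the unbiased paths in this sketch rarely leave the starting well within the simulated time, while the biased paths typically reach the target. The toy bias uses direct knowledge of the target coordinate, which is exactly the kind of hand-chosen reduction a CV-free method aims to avoid.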

Paper Structure (35 sections, 3 theorems, 21 equations, 5 figures, 1 table, 1 algorithm)


Key Result

Theorem 3.1

Let $b$ be the set of functions such that $\pi_0 = \pi_G(\boldsymbol{x}_0) \cdot \mathbf{1}_{R}(\boldsymbol{r}_0)$ and $\pi_\tau = \pi_G(\boldsymbol{x}_\tau) \cdot \mathbf{1}_{R}(\boldsymbol{r}_\tau)$, we have that a solution to the SB problem with reference distribution $\pi^*$ is also a solution t

Figures (5)

  • Figure 1: Free-energy surface of Alanine Dipeptide as a function of CV dihedral angles $\phi$ and $\psi$ highlighting the high energy barrier separating the two metastable states. White stars indicate saddle points in the high energy barrier where the transition is likely to occur.
  • Figure 2: Visualization of a trajectory sampled with PIPS. Left: The sampled trajectory projected on the free energy landscape of AD as a function of two CVs. Right: Conformations along the sampled trajectory: A) starting conformation showing the CV dihedral angles, B-D) intermediate conformations with C being the highest energy point on the trajectory, and E) final conformation, which closely aligns with the target conformation. Bottom: Potential energy during transition.
  • Figure 3: Visualization of the Polyproline transformation from PP-II to PP-I. From top to bottom: 5 stages of the transition; the $\psi$, $\phi$, and $\omega$ candidate CVs; and potential energy. For the candidate CVs, multiple instances of the same dihedral angle can be found in a single molecule. Stars indicate target candidate CV states. Colored bonds represent the bonds involved in the $\omega$ CV.
  • Figure 4: Visualization of the Chignolin folding process. Top: 5 stages of the folding process. Middle: Pairwise distance with respect to the target conformation of the molecule. Bottom: Potential energy.
  • Figure 5: Visualization of a trajectory sampled with the proposed force prediction method. Left: The sampled trajectory projected on the free energy landscape of Alanine Dipeptide as a function of two CVs. Right: Conformations along the sampled trajectory: A) starting conformation showing the CV dihedral angles, B-D) intermediate conformations with D being the highest energy point on the trajectory, and E) final conformation, which closely aligns with the target conformation. Bottom: Potential energy during transition. Letters represent the same configurations in the transition.

Theorems & Definitions (9)

  • Definition 1: Transition Path (TP) distribution
  • Definition 2: BPTP problem
  • Definition 3: Schrödinger Bridge (SB) problem
  • Theorem 3.1: BPTP problem is a SB problem
  • proof
  • Theorem 3.2: SOC solves the BPTP problem
  • proof
  • Theorem A.1: SOC solves the BPTP problem
  • proof