Transition Path Sampling with Improved Off-Policy Training of Diffusion Path Samplers

Kiyoung Seong; Seonghyun Park; Seonghwan Kim; Woo Youn Kim; Sungsoo Ahn

TL;DR

This work tackles the challenge of sampling transition paths between meta-stable states without relying on collective variables (CVs). It introduces TPS-DPS, which amortizes transition path sampling (TPS) by minimizing the log-variance divergence between the path distribution induced by a diffusion path sampler and the target transition path distribution, using off-policy training with a replay buffer and simulated annealing to improve sample efficiency and diversity. A scale-based, SE(3)-equivariant bias-force parameterization and a relaxed, nonlinear indicator improve the training signal and scalability to larger systems. Empirical results on a synthetic double-well system, Alanine Dipeptide, and four fast-folding proteins show that TPS-DPS produces more realistic and diverse transition pathways than non-ML and ML baselines, highlighting its potential for CV-free TPS in complex biomolecular processes.

Abstract

Understanding transition pathways between two meta-stable states of a molecular system is crucial to advance drug discovery and material design. However, unbiased molecular dynamics (MD) simulations are computationally infeasible because of the high energy barriers that separate these states. Although machine learning techniques have recently been proposed to sample rare events, they are often limited to simple systems and rely on collective variables (CVs) derived from costly domain expertise. In this paper, we introduce a novel approach that trains diffusion path samplers (DPS) to address the transition path sampling (TPS) problem without requiring CVs. We reformulate the problem as amortized sampling from the transition path distribution by minimizing the log-variance divergence between the path distribution induced by DPS and the transition path distribution. Based on the log-variance divergence, we propose learnable control variates to reduce the variance of gradient estimators, and an off-policy training objective with replay buffers and simulated annealing to improve sample efficiency and diversity. We also propose a scale-based equivariant parameterization of the bias forces to ensure scalability to large systems. We extensively evaluate our approach, termed TPS-DPS, on a synthetic system, a small peptide, and challenging fast-folding proteins, demonstrating that it produces more realistic and diverse transition pathways than existing baselines.
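The log-variance idea in the abstract can be illustrated with a small numerical sketch. It is not the authors' implementation: it assumes the generic form of a log-variance objective, namely the variance over a batch of paths of the log-ratio between an unnormalized target path density and the sampler's path density. The function names, the scalar control variate `w`, and the random stand-in log-densities are all hypothetical.

```python
import numpy as np

def log_variance_loss(log_target, log_model):
    """Batch variance of the per-path log-ratio (generic log-variance form)."""
    log_ratio = np.asarray(log_target) - np.asarray(log_model)
    return float(np.mean((log_ratio - log_ratio.mean()) ** 2))

def control_variate_loss(log_target, log_model, w):
    """Squared error of the log-ratio around a scalar control variate w (hypothetical)."""
    log_ratio = np.asarray(log_target) - np.asarray(log_model)
    return float(np.mean((log_ratio - w) ** 2))

# Stand-in log-densities for a batch of 8 paths (illustrative values only).
rng = np.random.default_rng(0)
log_model = rng.normal(size=8)

# A target that differs from the model only by an unknown log-normalizer.
log_target_matched = log_model + 3.7
# A target that differs path by path.
log_target_mismatched = log_model + rng.normal(size=8)

loss_matched = log_variance_loss(log_target_matched, log_model)
loss_mismatched = log_variance_loss(log_target_mismatched, log_model)

# The batch mean of the log-ratio is the scalar that minimizes the squared error.
w_best = float(np.mean(log_target_mismatched - log_model))
loss_at_best_w = control_variate_loss(log_target_mismatched, log_model, w_best)
loss_at_other_w = control_variate_loss(log_target_mismatched, log_model, w_best + 1.0)
```

Two properties are visible here. The loss vanishes when the two densities agree up to a constant, so the target's normalizing constant is never needed. The paths in the batch may also come from any distribution, which is what makes replay buffers and other off-policy samples usable.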

Paper Structure (24 sections, 1 theorem, 32 equations, 8 figures, 5 tables, 1 algorithm)

Introduction
Related work
Transition path sampling with diffusion path samplers
Problem setup
Log-variance minimization
Parameterization for large systems
Experiment
Double-well system
Alanine Dipeptide
Fast-folding Proteins
Ablation study
Conclusion
Method details
Log variance formulation
Connection to existing loss functions on discrete-time domain
...and 9 more sections

Key Result

Proposition 1

Consider the molecular state $\bm{R}_{t}$ at the $t$-th time step and the next state $\bm{R}_{t+\Delta t}'= \bm{R}_{t}+\bm{b}(\bm{X}_{t})\Delta t/\bm{m}$, updated with step size $\Delta t$ and the bias force $\bm{b}(\bm{X}_{t})=\text{diag}(\bm{s}_{\theta}(\rho_{t}^{-1} \cdot \bm{X}_{t}))(\rho_{t}\cdot\ldots$ [...] where $\rho'_{t+\Delta t}=\operatorname{argmin}_{\rho\in SE(3)}\lVert\rho\cdot\bm{R}_{\mathcal{B}}-\ldots\rVert$ [...]
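The update rule in the proposition can be explored numerically under simplifying assumptions. The sketch below is not the authors' parameterization: it omits the SE(3) alignment $\rho_{t}$ entirely, replaces the network $\bm{s}_{\theta}$ with softplus-transformed random numbers, uses unit masses, and takes the direction to the target to be the plain difference between target and current positions. All names and values are hypothetical.

```python
import numpy as np

def softplus(x):
    """Numerically stable softplus, used here to keep scaling factors positive."""
    return np.logaddexp(0.0, x)

def biased_step(R, R_target, raw_scale, dt, mass):
    """One step R' = R + b * dt / m with b = scale * (R_target - R)."""
    scale = softplus(raw_scale)          # strictly positive scaling factors
    bias_force = scale * (R_target - R)  # elementwise form of diag(s) (R_B - R)
    return R + bias_force * dt / mass

rng = np.random.default_rng(0)
R = rng.normal(size=(5, 3))          # current positions of 5 atoms (illustrative)
R_target = rng.normal(size=(5, 3))   # target positions, alignment omitted
raw_scale = rng.normal(size=(5, 3))  # stand-in for a network output
mass = np.ones((5, 1))               # unit masses (assumption)
dt = 1e-2                            # small step size (assumption)

R_next = biased_step(R, R_target, raw_scale, dt, mass)
dist_before = float(np.linalg.norm(R_target - R))
dist_after = float(np.linalg.norm(R_target - R_next))
```

In this simplified setting each coordinate of the residual `R_target - R` is multiplied by `1 - scale * dt / mass`, which lies strictly between 0 and 1 for small `dt`, so the distance to the target shrinks regardless of the values the scaling factors take.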

Figures (8)

Figure 1: Problem setup. The sampled transition path (yellow dotted lines) from the state $\mathcal{A}$ to the state $\mathcal{B}$ on the free energy landscape of Alanine Dipeptide. We visualize the snapshots (white circles) of the transition path and the transition state (white star).
Figure 2: Visualization of the bias force fields of two different bias force parameterizations with initialized neural networks. (a) Directly predicting the bias force and (b) predicting the positive scaling factors of the direction to the target position (white circle).
Figure 3: Visualization of potential energy landscapes and distributions in the double-well potential. (a) Visualization of the learned bias potential $b_{\theta}$ of TPS-DPS (P). (b) Distributions of the potential energy and $y$ coordinate of transition states from 1024 transition paths sampled by each method.
Figure 4: Visualization of sampled paths on energy landscapes. For the double-well system, we aim to sample transition paths from the left meta-stable state to the right on the potential energy landscape (top). For Alanine Dipeptide, we aim to sample conformational changes from the $C5$ (upper left) to the $C7ax$ (lower right) on the Ramachandran plot (bottom). White circles and stars indicate meta-stable states and saddle points, respectively.
Figure 5: Qualitative evaluation of a transition path sampled by TPS-DPS. (a) Potential energy and the two backbone dihedral angle distances between the current and target states. (b) Potential energy and the two hydrogen bond distances between the current and target states. (c) Visualization of hydrogen bond formation in Chignolin. We highlight each hydrogen bond in green and yellow.
...and 3 more figures

Theorems & Definitions (2)

Proposition 1
proof
