Transition Path Sampling with Improved Off-Policy Training of Diffusion Path Samplers
Kiyoung Seong, Seonghyun Park, Seonghwan Kim, Woo Youn Kim, Sungsoo Ahn
TL;DR
This work tackles the challenge of sampling transition paths between meta-stable states without relying on collective variables (CVs). It introduces TPS-DPS, which amortizes transition-path sampling by minimizing the log-variance divergence between the path distribution of the diffusion path sampler and the target transition-path distribution, using off-policy training, a replay buffer, and simulated annealing to boost sample efficiency and diversity. A scale-based, SE(3)-equivariant bias-force parameterization and a relaxed, nonlinear indicator improve the training signal and scalability to larger systems. Empirical results on a synthetic double-well system, Alanine Dipeptide, and four fast-folding proteins show that TPS-DPS produces more realistic and diverse transition pathways than non-ML and ML baselines, highlighting its potential for CV-free TPS in complex biomolecular processes.
Abstract
Understanding transition pathways between two meta-stable states of a molecular system is crucial to advance drug discovery and material design. However, unbiased molecular dynamics (MD) simulations are computationally infeasible because of the high energy barriers that separate these states. Although machine learning techniques have recently been proposed to sample rare events, they are often limited to simple systems and rely on collective variables (CVs) derived from costly domain expertise. In this paper, we introduce a novel approach that trains diffusion path samplers (DPS) to address the transition path sampling (TPS) problem without requiring CVs. We reformulate the problem as amortized sampling from the transition path distribution by minimizing the log-variance divergence between the path distribution induced by DPS and the transition path distribution. Based on the log-variance divergence, we propose learnable control variates to reduce the variance of gradient estimators, and an off-policy training objective with replay buffers and simulated annealing to improve sample efficiency and diversity. We also propose a scale-based equivariant parameterization of the bias forces to ensure scalability to large systems. We extensively evaluate our approach, termed TPS-DPS, on a synthetic system, a small peptide, and challenging fast-folding proteins, demonstrating that it produces more realistic and diverse transition pathways than existing baselines.
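To make the training objective concrete, the following is a minimal sketch of a log-variance loss over a batch of paths. It is not the authors' implementation: the function name `log_variance_loss`, its arguments, and the sign convention of the log-ratio are assumptions made for illustration. The inputs are assumed to be per-path log-densities (the target one possibly unnormalized) evaluated on paths drawn from any distribution, for example a replay buffer. The control variate is a scalar that would be learnable in training; here it defaults to the batch mean, which is the value that minimizes the squared deviation for a fixed batch.

```python
from statistics import fmean


def log_variance_loss(log_target, log_sampler, control_variate=None):
    """Mean squared deviation of per-path log-ratios from a scalar control variate.

    log_target:      per-path log-density under the target (may be unnormalized).
    log_sampler:     per-path log-density under the sampler's path distribution.
    control_variate: scalar offset (hypothetical stand-in for a learnable
                     parameter); defaults to the batch mean of the log-ratios.
    """
    log_ratios = [t - s for t, s in zip(log_target, log_sampler)]
    if control_variate is None:
        control_variate = fmean(log_ratios)
    return fmean([(r - control_variate) ** 2 for r in log_ratios])


# Two paths with log-ratios 1 and 3: the batch mean is 2, so the loss is 1.
loss = log_variance_loss([1.0, 3.0], [0.0, 0.0])
```

With the batch-mean default, adding a constant to every entry of `log_target` leaves the loss unchanged, which is why the target's normalizing constant does not need to be known. The loss is zero exactly when the two log-densities differ by a constant on every path in the batch.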
