Computing Transition Pathways for the Study of Rare Events Using Deep Reinforcement Learning
Bo Lin, Yangzheng Zhong, Weiqing Ren
TL;DR
This work tackles the challenge of computing transition pathways for rare events in high-dimensional systems with rough energy landscapes by reframing the problem as a cost-minimization task over a path space and introducing an effective-force extension of the Freidlin-Wentzell functional. It then solves the resulting optimization with an actor-critic reinforcement learning method based on deep deterministic policy gradient, incorporating physical invariances via a transformation $\mathcal{T}$ and using a continuous, stochastic policy to explore transition regions. The authors demonstrate the approach on a 2D system, a 10D extended Mueller potential, and a Lennard-Jones cluster, showing convergence to globally optimal pathways, robustness to landscape roughness, and agreement with established reference pathways. The method offers a scalable, exploration-driven alternative to traditional path-finding methods and holds promise for enabling accurate transition-path analyses in complex molecular and materials systems.
Abstract
Understanding the transition events between metastable states in complex systems is an important subject in computational physics, chemistry and biology. The transition pathway plays a central role in characterizing the mechanism underlying the transition, for example, in the study of conformational changes of bio-molecules. However, computing the transition pathway is a challenging task for complex and high-dimensional systems. In this work, we formulate the path-finding task as a cost minimization problem over a particular path space. The cost function is adapted from the Freidlin-Wentzell action functional so that it is able to deal with rough potential landscapes. The path-finding problem is then solved using an actor-critic method based on the deep deterministic policy gradient (DDPG) algorithm. The method incorporates the potential force of the system in the policy for generating episodes and, for molecular systems, combines physical properties of the system with the learning process. The balance between exploitation and exploration inherent in reinforcement learning enables the method to efficiently sample the transition events and compute the globally optimal transition pathway. We illustrate the effectiveness of the proposed method using three benchmark systems, including an extended Mueller system and the Lennard-Jones system of seven particles.
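To make the cost-minimization view concrete, the following minimal sketch evaluates a discretized form of the standard Freidlin-Wentzell action, $S[\varphi] = \frac{1}{2}\int_0^T |\dot{\varphi} + \nabla V(\varphi)|^2 \, dt$, for a path given as a sequence of points. It is an illustration under stated assumptions, not the adapted cost function proposed in this work: the double-well potential, the $\frac{1}{2}$ normalization (which depends on the noise scaling convention), and the names `grad_double_well`, `fw_action` and `gradient_flow_path` are all hypothetical choices made for this example.

```python
import numpy as np


def grad_double_well(x):
    """Gradient of a hypothetical potential V(x, y) = (x^2 - 1)^2 + y^2.

    This toy potential is an assumption for illustration only; it has
    minima at (-1, 0) and (1, 0) and a saddle point at (0, 0).
    """
    x = np.asarray(x, dtype=float)
    gx = 4.0 * x[..., 0] * (x[..., 0] ** 2 - 1.0)
    gy = 2.0 * x[..., 1]
    return np.stack([gx, gy], axis=-1)


def fw_action(path, dt, grad_v):
    """Discretized standard Freidlin-Wentzell action of a path.

    Approximates 0.5 * integral |dx/dt + grad V(x)|^2 dt with a
    forward-difference velocity and a left-endpoint quadrature rule.
    """
    path = np.asarray(path, dtype=float)
    velocity = (path[1:] - path[:-1]) / dt   # forward difference
    drift = -grad_v(path[:-1])               # potential force at left endpoints
    residual = velocity - drift              # deviation from the gradient flow
    return 0.5 * dt * np.sum(residual ** 2)


def gradient_flow_path(x0, dt, n_steps, grad_v):
    """Forward-Euler relaxation x_{k+1} = x_k - dt * grad V(x_k)."""
    path = [np.asarray(x0, dtype=float)]
    for _ in range(n_steps):
        path.append(path[-1] - dt * grad_v(path[-1]))
    return np.array(path)


dt = 0.01

# A path that follows the potential force costs (numerically) nothing.
relaxed = gradient_flow_path([0.1, 0.5], dt, 200, grad_double_well)
relaxed_action = fw_action(relaxed, dt, grad_double_well)

# A straight line between the two minima must climb against the force,
# so its action is strictly positive.
xs = np.linspace(-1.0, 1.0, 101)
straight = np.stack([xs, np.zeros_like(xs)], axis=-1)
straight_action = fw_action(straight, dt, grad_double_well)
```

In this sketch, a path that follows the potential force has essentially zero action, while a path that moves against it pays a positive cost. Minimizing such a cost over paths connecting two metastable states is the kind of optimization that the abstract describes solving with an actor-critic method.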
