Reinforcement Learning for Molecular Dynamics Optimization: A Stochastic Pontryagin Maximum Principle Approach
Chandrajit Bajaj, Minh Nguyen, Conrad Li
TL;DR
A novel reinforcement learning framework designed to optimize molecular dynamics by focusing on the entire trajectory rather than just the final molecular configuration, which makes it suitable for applications in areas such as drug discovery and molecular design.
Abstract
In this paper, we present a novel reinforcement learning framework designed to optimize molecular dynamics by focusing on the entire trajectory rather than just the final molecular configuration. Leveraging a stochastic version of Pontryagin's Maximum Principle (PMP) and Soft Actor-Critic (SAC) algorithm, our framework effectively explores non-convex molecular energy landscapes, escaping local minima to stabilize in low-energy states. Our approach operates in continuous state and action spaces without relying on labeled data, making it applicable to a wide range of molecular systems. Through extensive experimentation on six distinct molecules, including Bradykinin and Oxytocin, we demonstrate competitive performance against other unsupervised physics-based methods, such as the Greedy and NEMO-based algorithms. Our method's adaptability and focus on dynamic trajectory optimization make it suitable for applications in areas such as drug discovery and molecular design.
