Spectral Map for Slow Collective Variables, Markovian Dynamics, and Transition State Ensembles
Jakub Rydzewski
TL;DR
This work advances spectral map, a data-driven method that learns slow collective variables (CVs) by maximizing a spectral gap of a Markov transition matrix, thereby yielding a memoryless diffusion description on a free-energy landscape. Applying the framework to FiP35 protein folding, the author extracts an essentially one-dimensional slow CV that captures the dominant folding/unfolding timescale and defines a transition state ensemble by kinetically partitioning the CV space. The learned CVs closely approach the Markovian limit for overdamped diffusion, coordinate-dependent diffusion coefficients only slightly affect the free-energy profile, and the slow CV highlights structurally meaningful regions and the key residues driving the slow dynamics. The results suggest that spectral map can yield physically interpretable reaction coordinates for complex molecular processes and offers a way to analyze feature importance and transitions, with future extensions to biased simulations via reweighting.
Abstract
Understanding the behavior of complex molecular systems is a fundamental problem in physical chemistry. To describe the long-time dynamics of such systems, which is responsible for their most informative characteristics, we can identify a few slow collective variables (CVs) while treating the remaining fast variables as thermal noise. This enables us to simplify the dynamics and treat it as diffusion in a free-energy landscape spanned by slow CVs, effectively rendering the dynamics Markovian. Our recent statistical learning technique, spectral map [Rydzewski, J. Phys. Chem. Lett. 2023, 14, 22, 5216-5220], explores this strategy to learn slow CVs by maximizing a spectral gap of a transition matrix. In this work, we introduce several advancements into our framework, using a high-dimensional reversible folding process of a protein as an example. We implement an algorithm for coarse-graining Markov transition matrices to partition the reduced space of slow CVs kinetically and use it to define a transition state ensemble. We show that slow CVs learned by spectral map closely approach the Markovian limit for an overdamped diffusion. We demonstrate that coordinate-dependent diffusion coefficients only slightly affect the constructed free-energy landscapes. Finally, we present how spectral map can be used to quantify the importance of features and compare slow CVs with structural descriptors commonly used in protein folding. Overall, we demonstrate that a single slow CV learned by spectral map can be used as a physical reaction coordinate to capture essential characteristics of protein folding.
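To make the central quantity concrete, the following minimal sketch computes a spectral gap of a Markov transition matrix built on samples of a candidate CV. It is an illustration under stated assumptions, not the implementation used in the paper: the isotropic Gaussian kernel, the `bandwidth` parameter, the function names `transition_matrix` and `spectral_gap`, and the synthetic one-dimensional data are all hypothetical choices made for this example.

```python
import numpy as np


def transition_matrix(z, bandwidth=1.0):
    """Row-stochastic Markov matrix from a Gaussian kernel on CV samples.

    Assumption: an isotropic Gaussian kernel with a fixed bandwidth; the
    kernel used in the paper may differ.
    """
    z = np.asarray(z, dtype=float)
    z = z.reshape(len(z), -1)  # shape (n_samples, n_cvs)
    sq_dists = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)
    kernel = np.exp(-sq_dists / (2.0 * bandwidth**2))
    # Normalize each row so that it sums to one.
    return kernel / kernel.sum(axis=1, keepdims=True)


def spectral_gap(M, n_states=2):
    """Gap between eigenvalues n_states and n_states + 1 (sorted descending)."""
    # M is similar to a symmetric matrix, so its eigenvalues are real up to
    # round-off; the real part is taken to discard numerical noise.
    eigvals = np.sort(np.linalg.eigvals(M).real)[::-1]
    return eigvals[n_states - 1] - eigvals[n_states]


rng = np.random.default_rng(0)

# Hypothetical CV values with two well-separated metastable states.
separated = np.concatenate([rng.normal(-3.0, 0.3, 100), rng.normal(3.0, 0.3, 100)])
# Hypothetical CV values with no clear separation of states.
mixed = rng.normal(0.0, 3.0, 200)

gap_separated = spectral_gap(transition_matrix(separated))
gap_mixed = spectral_gap(transition_matrix(mixed))
print(f"gap (separated states): {gap_separated:.3f}")
print(f"gap (mixed states):     {gap_mixed:.3f}")
```

In this toy setting, a CV that separates two metastable states gives a larger gap after the second eigenvalue than a CV that does not, which is the kind of timescale separation the maximization is meant to reward.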
