On finding optimal collective variables for complex systems by minimizing the deviation between effective and full dynamics

Wei Zhang; Christof Schütte

TL;DR

The paper addresses the problem of finding optimal low-dimensional collective variables (CVs) for complex, high-dimensional Markov dynamics by introducing an effective dynamics framework and an objective based on relative entropy. It proves that, for a fixed CV map, the transition density of the corresponding lower-dimensional process is the KL-minimizing surrogate for the full transition density, and it characterizes through variational principles how the choice of CV affects timescales and transition rates. The work provides explicit error bounds relating the spectrum of the full transfer operator to that of the effective operator, and it shows that when eigenfunctions (or committors) factor through the CV, the corresponding timescales and rates are preserved. It also links the framework to data-driven methods (e.g., VAMPnets, MSMs, normalizing flows) and explains how these approaches implicitly learn quantities of the effective dynamics, which guides the design of new CV-learning algorithms for molecular kinetics and other complex systems. Overall, the results establish a rigorous framework for CV selection and effective-model construction, with implications for long-timescale simulations and for theory-informed data-driven methods.

Abstract

This paper is concerned with collective variables, or reaction coordinates, that map a discrete-in-time Markov process $X_n$ in $\mathbb{R}^d$ to a (much) smaller dimension $k \ll d$. We define the effective dynamics under a given collective variable map $\xi$ as the best Markovian representation of $X_n$ under $\xi$. The novelty of the paper is that it gives strict criteria for selecting optimal collective variables via the properties of the effective dynamics. In particular, we show that the transition density of the effective dynamics of the optimal collective variable solves a relative entropy minimization problem from a certain family of densities to the transition density of $X_n$. We also show that many transfer operator-based data-driven numerical approaches essentially learn quantities of the effective dynamics. Furthermore, we obtain various error estimates for the effective dynamics in approximating dominant timescales / eigenvalues and transition rates of the original process $X_n$, and we show how optimal collective variables minimize these errors. Our results contribute to the development of theoretical tools for the understanding of complex dynamical systems, e.g. molecular kinetics, on large timescales. These results shed light on the relations among existing data-driven numerical approaches for identifying good collective variables, and they also motivate the development of new methods.
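
To make the notion of effective dynamics concrete, the following is a minimal sketch of a finite-state analog, not the paper's continuous-state construction. It assumes a row-stochastic matrix `P` in place of the transition density and a label array `xi` in place of the CV map; the effective chain is obtained by averaging one-step transitions over each level set, weighted by the stationary distribution. The function names and the toy 4-state chain are hypothetical and chosen for illustration only.

```python
import numpy as np


def stationary_distribution(P):
    """Stationary distribution of a row-stochastic, irreducible matrix P."""
    evals, evecs = np.linalg.eig(P.T)
    # Take the eigenvector of P^T for the eigenvalue with largest real part (= 1).
    mu = np.real(evecs[:, np.argmax(np.real(evals))])
    return mu / mu.sum()  # normalizing also fixes the sign


def effective_transition_matrix(P, xi, k):
    """Finite-state analog of the effective dynamics under a CV map.

    xi[x] in {0, ..., k-1} is the CV value (level-set label) of state x.
    Entry (z, w) is the probability of being in level set w after one step,
    starting from the stationary distribution conditioned on level set z.
    """
    mu = stationary_distribution(P)
    n = P.shape[0]
    M = np.zeros((n, k))
    M[np.arange(n), xi] = 1.0              # indicator matrix of the level sets
    mu_tilde = M.T @ mu                     # stationary weight of each level set
    P_eff = (M.T @ (mu[:, None] * P) @ M) / mu_tilde[:, None]
    return P_eff, mu_tilde


def second_eigenvalue(P):
    """Second-largest eigenvalue (by real part) of P."""
    return np.sort(np.real(np.linalg.eigvals(P)))[::-1][1]


# Toy reversible chain (assumption): two metastable pairs {0, 1} and {2, 3}.
P = np.array([
    [0.90, 0.10, 0.00, 0.00],
    [0.10, 0.89, 0.01, 0.00],
    [0.00, 0.01, 0.89, 0.10],
    [0.00, 0.00, 0.10, 0.90],
])
xi = np.array([0, 0, 1, 1])                 # CV: which pair the state belongs to

P_eff, mu_tilde = effective_transition_matrix(P, xi, k=2)

# Dominant implied timescales in units of the time step: t = -1 / ln(lambda_2).
lam_full = second_eigenvalue(P)
lam_eff = second_eigenvalue(P_eff)
t_full = -1.0 / np.log(lam_full)
t_eff = -1.0 / np.log(lam_eff)
```

For this toy chain the stationary distribution is uniform, so `P_eff` has off-diagonal entries 0.005 and diagonal entries 0.995. Because the chain is reversible, the lumped matrix is a Galerkin projection of `P` onto functions that are constant on level sets, so `lam_eff` cannot exceed `lam_full`; this is the finite-state counterpart of the variational statement that a CV can only shorten, never lengthen, the dominant timescale.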

Paper Structure (17 sections, 18 theorems, 110 equations, 3 figures)

Introduction
Main results
Algorithmic implications
Transfer operator approach
Definitions and basic properties
Spectrum and timescales
Transition rates
Effective dynamics for a given CV map
Definitions
Properties of effective dynamics
Error estimates for effective dynamics
Timescales
Transition rates
Comparison under transformations
Conclusion and Discussions
...and 2 more sections

Key Result

Lemma 1

For any $f,h \in \mathcal{H}$, we have [equation not recovered], where $\mathcal{I}$ and $\mathcal{T}^{rev}$ denote the identity operator and the reversible part of $\mathcal{T}$ in (tran-decomp-2), respectively. Let $X_0, X_1, X_2, \dots$ be an infinitely long trajectory of the process $X_n$. Then, we have almost surely [equation not recovered].

Figures (3)

Figure 1: Illustration of level sets of the CV map $\xi: \mathbb{R}^d\rightarrow \mathbb{R}^k$ for values $z,w\in \mathbb{R}^k$.
Figure 2: Top: illustration of transitions of the process $X_n$ from set $A$ to set $B$ in $\mathbb{R}^d$. Bottom: illustration of transitions of the system $Z_n$ from the corresponding set $\widetilde{A}$ to set $\widetilde{B}$ in $\mathbb{R}^k$ (see \ref{set-ab-and-eff} for the relations between the sets). The reactive segments are highlighted in red.
Figure 3: Commutative diagram of effective dynamics. The effective dynamics $Z_n'$ of $X_n$ associated to the CV map $\xi'$ is the effective dynamics of (the effective dynamics) $Z_n$ associated to the CV map $f$.
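
The commutative relation in Figure 3 can be checked numerically in a finite-state analog. The sketch below assumes a row-stochastic matrix in place of the transition density and label arrays in place of the CV maps; the helper `lump`, the random 6-state chain, and the maps `xi` and `f` are hypothetical and serve only as an illustration. It compares the effective chain for $\xi' = f \circ \xi$ computed directly from the full chain with the one obtained by first lumping along $\xi$ and then along $f$.

```python
import numpy as np


def lump(P, labels, k):
    """Effective (lumped) transition matrix of P along labels in {0, ..., k-1}.

    Transitions are averaged over each level set, weighted by the stationary
    distribution of P (assumed irreducible so that it is unique).
    """
    evals, evecs = np.linalg.eig(P.T)
    mu = np.real(evecs[:, np.argmax(np.real(evals))])
    mu = mu / mu.sum()
    M = np.zeros((P.shape[0], k))
    M[np.arange(P.shape[0]), labels] = 1.0   # indicator matrix of the level sets
    return (M.T @ (mu[:, None] * P) @ M) / (M.T @ mu)[:, None]


# Random 6-state chain with (almost surely) strictly positive entries (assumption).
rng = np.random.default_rng(0)
P = rng.random((6, 6))
P /= P.sum(axis=1, keepdims=True)

xi = np.array([0, 0, 1, 1, 2, 2])   # CV map xi: 6 states -> 3 values
f = np.array([0, 0, 1])             # map f: 3 values -> 2 values
xi_prime = f[xi]                    # composed CV map xi' = f o xi

P_Z = lump(P, xi, 3)                # effective chain Z_n under xi
P_direct = lump(P, xi_prime, 2)     # effective chain Z_n' under xi', from X_n
P_two_step = lump(P_Z, f, 2)        # effective chain of Z_n under f

commutes = np.allclose(P_direct, P_two_step)
```

The two routes agree because the stationary distribution of the lumped chain is the push-forward of the full stationary distribution, so the stationary weights used in the second lumping step are consistent with those of the first.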

Theorems & Definitions (44)

Lemma 1
proof
Remark 2.1: Alternative operators and terminologies
Example 2.1
Proposition 1
Proposition 2
proof
Remark 2.2: Positivity of eigenvalues
Theorem 2.2
Remark 2.3: Connections to VAMP
...and 34 more
