The fast committor machine: Interpretable prediction with kernels

D. Aristoff, M. Johnson, G. Simpson, R. J. Webber

TL;DR

The paper addresses estimating the forward committor $q^*(\boldsymbol{x}) = \boldsymbol{P}_{\boldsymbol{x}}(T_B<T_A)$ from trajectory data in metastable Markov systems. It introduces the fast committor machine (FCM), a kernel-based, interpretable estimator that uses an adaptive linear map $\boldsymbol{M}$ learned via the Recursive Feature Machine (RFM) and coefficients $\boldsymbol{\theta}$ learned via randomly pivoted Cholesky, achieving training cost linear in the number of data points $N$ and revealing low-dimensional active subspaces. Key contributions include (i) extending RFM to the committor with a new kernel form that enforces boundary conditions and emphasizes $A \to B$ transitions; (ii) demonstrating higher accuracy and faster training than a neural network with the same parameter budget on the triple-well potential and alanine dipeptide; (iii) showing that the learned $\boldsymbol{M}^{1/2}$ identifies a small active subspace (often two-dimensional) that dominates committor gradients. This approach enables scalable, interpretable committor estimation in high-dimensional molecular systems and motivates adaptive sampling strategies to further improve data efficiency.
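To make the definition $q^*(\boldsymbol{x}) = \boldsymbol{P}_{\boldsymbol{x}}(T_B<T_A)$ concrete, the following minimal sketch computes the forward committor of a small finite-state Markov chain. It is an illustration only, not the FCM: the function name `committor_chain`, the Gauss-Seidel solver, and the symmetric random walk are all assumptions chosen for the example. On a chain, the committor satisfies $q = Pq$ outside $A \cup B$, with $q = 0$ on $A$ and $q = 1$ on $B$.

```python
def committor_chain(P, A, B, sweeps=10000, tol=1e-12):
    """Forward committor q(i) = P_i(T_B < T_A) for a finite Markov chain.

    P is a row-stochastic transition matrix given as a list of lists.
    A and B are disjoint sets of state indices.
    Assumes every state outside A and B has P[i][i] < 1 and can reach A or B.
    """
    n = len(P)
    # Boundary conditions: q = 0 on A, q = 1 on B; interior starts at 0.
    q = [1.0 if i in B else 0.0 for i in range(n)]
    interior = [i for i in range(n) if i not in A and i not in B]
    for _ in range(sweeps):
        change = 0.0
        for i in interior:
            # Solve q(i) = sum_j P[i][j] q(j) for q(i), using current values.
            off_diag = sum(P[i][j] * q[j] for j in range(n) if j != i)
            new = off_diag / (1.0 - P[i][i])
            change = max(change, abs(new - q[i]))
            q[i] = new
        if change < tol:
            break
    return q


# Hypothetical example: symmetric random walk on {0, ..., 6},
# with A = {0} and B = {6}.
n = 6
P = [[0.0] * (n + 1) for _ in range(n + 1)]
for i in range(1, n):
    P[i][i - 1] = 0.5
    P[i][i + 1] = 0.5
P[0][0] = 1.0
P[n][n] = 1.0
q = committor_chain(P, A={0}, B={n})
print(q)
```

For this walk the committor is the classical gambler's-ruin probability $q(i) = i/n$, which gives a simple check on the solver. In high-dimensional molecular systems such a direct solve is unavailable, which is the setting the FCM targets.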

Abstract

In the study of stochastic systems, the committor function describes the probability that a system starting from an initial configuration $x$ will reach a set $B$ before a set $A$. This paper introduces an efficient and interpretable algorithm for approximating the committor, called the "fast committor machine" (FCM). The FCM uses simulated trajectory data to build a kernel-based model of the committor. The kernel function is constructed to emphasize low-dimensional subspaces that optimally describe the $A$ to $B$ transitions. The coefficients in the kernel model are determined using randomized linear algebra, leading to a runtime that scales linearly in the number of data points. In numerical experiments involving a triple-well potential and alanine dipeptide, the FCM yields higher accuracy and trains more quickly than a neural network with the same number of parameters. The FCM is also more interpretable than the neural net.
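The linear-in-$N$ runtime mentioned above comes from replacing the full $N \times N$ kernel matrix with a low-rank factor. The sketch below shows the generic randomly pivoted Cholesky idea in NumPy, under stated assumptions: the plain Laplacian-type kernel $\exp(-\lVert \boldsymbol{x} - \boldsymbol{x}' \rVert / \text{bandwidth})$, the function names, and the parameter values are illustrative choices, not the FCM kernel (which additionally enforces boundary conditions and uses the scaling matrix $\boldsymbol{M}$). Each step evaluates one kernel column, so the cost is $O(N r^2)$ for rank $r$.

```python
import numpy as np


def kernel_column(X, j, bandwidth):
    """Column j of the matrix K[i, j] = exp(-|x_i - x_j| / bandwidth)."""
    dists = np.linalg.norm(X - X[j], axis=1)
    return np.exp(-dists / bandwidth)


def rp_cholesky(X, rank, bandwidth=1.0, seed=0):
    """Return (F, pivots) with F @ F.T a rank-`rank` approximation of K."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    F = np.zeros((n, rank))
    diag = np.ones(n)  # residual diagonal; k(x, x) = 1 for this kernel
    pivots = []
    for i in range(rank):
        total = diag.sum()
        if total <= 1e-12 * n:  # residual is numerically zero: stop early
            F = F[:, :i]
            break
        # Sample a pivot with probability proportional to the residual diagonal.
        pivot = rng.choice(n, p=diag / total)
        # Residual column = kernel column minus what F already explains.
        col = kernel_column(X, pivot, bandwidth) - F[:, :i] @ F[pivot, :i]
        F[:, i] = col / np.sqrt(col[pivot])
        diag = np.clip(diag - F[:, i] ** 2, 0.0, None)
        diag[pivot] = 0.0  # a chosen pivot is never selected again
        pivots.append(int(pivot))
    return F, pivots


# Hypothetical usage on random 2-D points; the dense K is formed only to
# measure the error and would not be built in a large-scale run.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
F, pivots = rp_cholesky(X, rank=40)
K = np.exp(-np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2))
rel_err = np.trace(K - F @ F.T) / np.trace(K)
print(F.shape, rel_err)
```

The factor reproduces the selected pivot columns of the kernel matrix exactly, and the relative trace error shrinks as the rank grows. Sampling pivots in proportion to the residual diagonal is what distinguishes this scheme from greedy or uniform pivoting.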

Paper Structure (17 sections, 2 theorems, 39 equations, 7 figures, 1 table, 2 algorithms)

Table of Contents

Introduction
Relationship to past work
Outline for the paper
Assumptions and notation
Context
New FCM method for calculating the committor
Form of the committor approximation
Optimization of the scaling matrix $\bm{M}$
Optimization of the coefficients $\bm{\theta}_n$
Optimization of the hyperparameters
Numerical results
Neural network comparison
Triple-well potential energy
Alanine dipeptide
Conclusion
...and 2 more sections

Key Result

Theorem A.1

Define the least-squares loss function … where the positive semidefinite kernel matrix consists of four blocks with entries … Then the loss function $\bm{L}_{\gamma}(\bm{c}, \bm{d})$ has a minimizer that satisfies $\bm{c} + \bm{d} = \bm{0}$.

Figures (7)

Figure 1: (a) The potential function $V_0$. (b) The reference committor evaluated on the validation data points with states $A$ and $B$ and the committor one-half surface indicated in red.
Figure 2: Comparison of neural net (NN) and FCM performance for the triple-well system, with standard error bars computed from $10$ independent runs of the FCM. (a) Mean squared error computed using the reference committor. (b) Runtime in seconds.
Figure 3: Training performance for single instances of the triple-well experiment, with error computed using the reference committor from Fig. 1. (a) The FCM with different approximation ranks $r$. (b) The neural net with different learning rates $lr$.
Figure 4: Square root of scaling matrix for the triple-well system when $N = 10^6$ and $r = 1000$. (a) After 1 iteration. (b) After 5 iterations, corresponding to convergence.
Figure 5: (a) Free energy surface of alanine dipeptide in $\phi$ and $\psi$ coordinates, compared to reference committor half-surface (red). (b) The reference committor in $\phi$ and $\psi$ coordinates with states $A$ and $B$ and the committor one-half surface indicated in red.
...and 2 more figures
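The scaling-matrix plots in Figure 4 can be read as follows: directions with large entries in $\bm{M}^{1/2}$ are the ones along which the learned function varies. The sketch below illustrates this on a toy problem. It assumes the plain average gradient outer product used by Recursive Feature Machines in general; any normalization or modification specific to the FCM is not shown in this summary and is not reproduced here. The function names and the toy function are hypothetical.

```python
import numpy as np


def average_gradient_outer_product(gradients):
    """Mean over samples of g g^T, for gradients stacked as rows."""
    G = np.asarray(gradients)
    return G.T @ G / G.shape[0]


def psd_sqrt(M):
    """Symmetric square root of a positive semidefinite matrix."""
    eigvals, eigvecs = np.linalg.eigh(M)
    eigvals = np.clip(eigvals, 0.0, None)  # remove tiny negative round-off
    return (eigvecs * np.sqrt(eigvals)) @ eigvecs.T


# Toy function on R^10 that depends only on the first two coordinates:
# f(x) = sin(x_0) + x_1^2, so grad f = (cos(x_0), 2 x_1, 0, ..., 0).
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
grads = np.zeros_like(X)
grads[:, 0] = np.cos(X[:, 0])
grads[:, 1] = 2.0 * X[:, 1]

M = average_gradient_outer_product(grads)
M_sqrt = psd_sqrt(M)
eigvals = np.linalg.eigvalsh(M)  # ascending order
print(np.round(eigvals, 3))
```

Only two eigenvalues are nonzero here, so the active subspace is two-dimensional and is spanned by the first two coordinate directions. With a learned model the gradients would come from the fitted estimator rather than from a closed-form expression.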

Theorems & Definitions (4)

Theorem A.1
proof
Proposition B.1
proof