A Note on Spectral Map

Tuğçe Gökdemir, Jakub Rydzewski

TL;DR

The paper addresses rare-event sampling in MD by presenting spectral map (SM), an unsupervised deep-learning method that learns slow CVs by maximizing the timescale separation between slow and fast dynamics. This separation is expressed as the spectral gap $\sigma=\lambda_{m-1}-\lambda_m$ between eigenvalues of a Markov transition matrix. SM maps configurations to CVs via $\mathbf{z}=\boldsymbol{\xi}_w(\mathbf{x})$ and builds the Markov transition matrix $Q$ in CV space using an anisotropic diffusion kernel derived from a Gaussian kernel $g(\mathbf{z}_k,\mathbf{z}_l)=\exp(-\|\mathbf{z}_k-\mathbf{z}_l\|^2/\varepsilon)$ and the density estimate $\rho(\mathbf{z})=\sum_l g(\mathbf{z},\mathbf{z}_l)$. Maximizing $\sigma$ with respect to the network parameters $w$ yields neural-network-based CVs suitable for CV-based enhanced sampling, providing an automatic, data-driven route to slow variables. The approach connects to diffusion maps and related spectral methods, cites prior work by Rydzewski and colleagues, and provides a PLUMED-NEST implementation, while highlighting on-the-fly MD applications for discovering long-timescale processes.
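The kernel-to-spectral-gap pipeline described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the PLUMED-NEST implementation: the function name `spectral_gap`, its parameters `epsilon` and `m`, the square-root density normalization of the anisotropic kernel, and the row normalization used to obtain $Q$ are all assumptions made here. The eigenvalues are sorted in descending order so that $\lambda_0 = 1$. The neural network $\boldsymbol{\xi}_w$ is omitted, and the function takes CV samples $\mathbf{z}$ directly.

```python
import numpy as np


def spectral_gap(z, epsilon=1.0, m=2):
    """Return (sigma, eigenvalues) for CV samples z of shape (n_samples, n_cvs).

    Hypothetical helper: sigma = lambda_{m-1} - lambda_m, with eigenvalues
    sorted in descending order (lambda_0 = 1).
    """
    z = np.asarray(z, dtype=float)
    # Pairwise squared distances ||z_k - z_l||^2 in CV space.
    diff = z[:, None, :] - z[None, :, :]
    sq_dist = np.sum(diff**2, axis=-1)
    # Gaussian kernel g(z_k, z_l) and density estimate rho(z_k) = sum_l g.
    g = np.exp(-sq_dist / epsilon)
    rho = g.sum(axis=1)
    # Anisotropic diffusion kernel (assumed normalization: sqrt of densities).
    kappa = g / np.sqrt(np.outer(rho, rho))
    # Q = D^{-1} kappa is row-stochastic. It shares its eigenvalues with the
    # symmetric matrix S = D^{-1/2} kappa D^{-1/2}, which is diagonalized
    # here so that the spectrum is guaranteed to be real.
    d = kappa.sum(axis=1)
    s = kappa / np.sqrt(np.outer(d, d))
    eigvals = np.linalg.eigvalsh(s)[::-1]  # descending order
    sigma = eigvals[m - 1] - eigvals[m]
    return sigma, eigvals


if __name__ == "__main__":
    # Toy 1D "CVs": two well-separated groups of samples.
    z = np.concatenate([np.linspace(-5.2, -4.8, 20),
                        np.linspace(4.8, 5.2, 20)])[:, None]
    sigma, eigvals = spectral_gap(z, epsilon=1.0, m=2)
    print("spectral gap:", sigma)
    print("leading eigenvalues:", eigvals[:3])
```

In SM training, a quantity of this kind would serve as the objective to be maximized with respect to the network parameters $w$, which requires a differentiable implementation in an automatic-differentiation framework; the NumPy version only evaluates $\sigma$ for fixed CVs.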

Abstract

In molecular dynamics (MD) simulations, transitions between states are often rare events because the energy barriers separating the states exceed the thermal energy. Because of their infrequent occurrence and the huge number of degrees of freedom in molecular systems, understanding the physical properties that drive rare events is immensely difficult. A common approach to this problem is to propose a collective variable (CV) that describes the process in a simplified representation. However, choosing CVs is not easy, as it often relies on physical intuition. Machine learning (ML) techniques provide a promising approach for extracting optimal CVs from MD data. Here, we provide a note on a recent unsupervised ML method called spectral map, which constructs CVs by maximizing the timescale separation between slow and fast variables in the system.

Paper Structure

This paper contains 3 sections, 4 equations, 1 figure, 1 algorithm.

Figures (1)

  • Figure 1: Example of SM learning illustrated on the mini-protein chignolin dataset. (a) Input for SM consisting of conformations from the MD trajectory. (b) The high-dimensional data are reduced to a low-dimensional space by the NN shown. The spectral gap $\sigma$ is calculated from the dominant eigenvalues of the transition matrix. The NN is trained iteratively by maximizing the spectral gap via backpropagation. (c) The corresponding free-energy landscape of chignolin shows two distinct free-energy basins: the folded state (F) and the unfolded state (U). The figure is based on results presented in Ref. rydzewski2024learning.