Bayesian Optimization with Gaussian Processes to Accelerate Stationary Point Searches

Rohit Goswami

TL;DR

We present a unified Bayesian optimization view of minimization, single-point saddle searches, and double-ended saddle searches. All three run through the same six-step surrogate loop and differ only in the inner optimization target and the acquisition criterion.

Abstract

Efforts to accelerate the exploration of stationary points on potential energy surfaces by building local surrogates span decades. Done correctly, surrogates reduce the required evaluations by an order of magnitude while preserving the accuracy of the underlying theory. We present a unified Bayesian optimization view of minimization, single-point saddle searches, and double-ended saddle searches through a single six-step surrogate loop, with the three differing only in the inner optimization target and the acquisition criterion. The framework uses Gaussian process regression with derivative observations, inverse-distance kernels, and active learning. The Optimal Transport GP extensions (farthest point sampling with the Earth mover's distance, MAP regularization via a variance barrier and oscillation detection, and an adaptive trust radius) build on the same basic methodology and improve accuracy and efficiency. We also demonstrate that random Fourier features decouple hyperparameter training from prediction, enabling favorable scaling for high-dimensional systems. Accompanying pedagogical Rust code demonstrates that all applications use the exact same Bayesian optimization loop, bridging the gap between theoretical formulation and practical execution.

Paper Structure (50 sections, 60 equations, 22 figures, 6 tables, 6 algorithms)

Introduction
The Potential Energy Surface and Stationary Point Searches
The PES and Its Stationary Points
Local Minimization
Minimum Mode Following, the Dimer Method
Rotation
Translation
Double-Ended Path Methods
The Nudged Elastic Band
The Climbing Image and Its Connection to Minimum Mode Following
Gaussian Process Regression
What the Surrogate Must Provide
Regression with Derivative Observations
Covariance Functions for Molecular Systems
The Inverse-Distance Squared Exponential Kernel
...and 35 more sections

Figures (22)

Figure 1: Geometry of the dimer and Householder reflection. The dimer pair ($\mathbf{R}_1, \mathbf{R}_2$) straddles the midpoint $\mathbf{R}_0$ with axis $\hat{\mathbf{N}}$. The true force $\mathbf{F}$ (blue) is reflected about the hyperplane perpendicular to $\hat{\mathbf{N}}$, producing the modified force $\mathbf{F}^{\dagger}$ (coral) that climbs along the minimum mode while relaxing perpendicular to it.
Figure 2: GP conditioning in three panels. (Left) Before any data, the prior (Eq. \ref{eq:gp_prior}) admits a wide family of smooth functions. (Center) Oracle evaluations supply energies and forces at selected configurations. (Right) Conditioning on the data collapses the posterior near training points while preserving wide uncertainty elsewhere; the posterior mean serves as the surrogate surface $V_{\text{GP}}$.
Figure 3: GP surrogate fidelity as a function of training set size on the Muller-Brown surface. Each panel shows the GP posterior mean contours after training on $M = 5, 15, 30, 50$ Latin hypercube-sampled configurations (black dots). With 5 points the surrogate captures only crude basin structure; by 30 points the contours closely match the true PES (Figure \ref{fig:mb_neb}) in the sampled region.
Figure 4: GP predictive variance on the Muller-Brown surface after 20 training evaluations clustered near minimum A and saddle S1 (black dots). The variance is near zero close to training data and grows with distance, reaching a maximum (coral diamond) in the unexplored region. This variance landscape is the basis for active learning: the next electronic structure evaluation is placed where the GP is least certain.
Figure 5: Block structure of the full covariance matrix $K_{\text{full}}$. The base kernel in feature space generates four Cartesian-space blocks through differentiation via the feature Jacobian $J$. Darker shading indicates higher computational cost.
...and 17 more figures
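
The Householder reflection described in the caption of Figure 1 can be illustrated with a short sketch. It assumes the modified force is the plain reflection $\mathbf{F}^{\dagger} = \mathbf{F} - 2(\mathbf{F}\cdot\hat{\mathbf{N}})\hat{\mathbf{N}}$, which inverts the force component along the dimer axis and leaves the perpendicular components unchanged. The function names `dot` and `householder_force` are hypothetical and are not taken from the accompanying Rust code.

```rust
/// Dot product of two slices (pairs are truncated to the shorter length).
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Reflect `force` about the hyperplane perpendicular to `axis`:
///     F_dagger = F - 2 (F . N_hat) N_hat
/// `axis` need not be normalized; it is normalized internally.
/// Returns `None` if the lengths differ or `axis` is (near-)zero.
/// Hypothetical helper for illustration only.
fn householder_force(force: &[f64], axis: &[f64]) -> Option<Vec<f64>> {
    if force.len() != axis.len() {
        return None;
    }
    let norm = dot(axis, axis).sqrt();
    if norm < 1e-12 {
        return None;
    }
    // Unit vector along the dimer axis.
    let n_hat: Vec<f64> = axis.iter().map(|a| a / norm).collect();
    // Component of the force along the axis.
    let proj = dot(force, &n_hat);
    // Flip the parallel component, keep the perpendicular one.
    Some(
        force
            .iter()
            .zip(&n_hat)
            .map(|(f, n)| f - 2.0 * proj * n)
            .collect(),
    )
}
```

For a force `[1.0, 2.0]` and an axis along the first coordinate, the sketch flips the first component and keeps the second, so following the reflected force moves uphill along the axis and downhill perpendicular to it. The reflection also preserves the magnitude of the force.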
