Convergence of variational Monte Carlo simulation and scale-invariant pre-training

Nilin Abrahamsen; Zhiyan Ding; Gil Goldshlager; Lin Lin

TL;DR

The paper proves convergence bounds for variational Monte Carlo (VMC) applied to neural-network wave functions in electronic structure, covering both energy minimization and supervised pre-training. The analysis exploits the scale invariance of the Rayleigh quotient and introduces a directionally unbiased gradient estimator to bound SGD-like updates with MCMC sampling. For pre-training, the authors propose a scale-invariant loss whose theoretical guarantees mirror nonconvex SGD rates. Numerical experiments on small, strongly correlated systems show faster pre-training with this loss and illustrate the empirical convergence of VMC. The authors also point toward extensions to alternative optimization schemes and manifold-based formulations.

Abstract

We provide theoretical convergence bounds for the variational Monte Carlo (VMC) method as applied to optimize neural network wave functions for the electronic structure problem. We study both the energy minimization phase and the supervised pre-training phase that is commonly used prior to energy minimization. For the energy minimization phase, the standard algorithm is scale-invariant by design, and we provide a proof of convergence for this algorithm without modifications. The pre-training stage typically does not feature such scale-invariance. We propose using a scale-invariant loss for the pre-training phase and demonstrate empirically that it leads to faster pre-training.
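The scale invariance of the energy objective can be illustrated on a finite-dimensional toy problem. The sketch below is not the paper's implementation: it assumes a hypothetical 2x2 real symmetric matrix `H` standing in for the Hamiltonian and a hypothetical trial vector `psi`, and checks that the Rayleigh quotient $\langle\psi|H|\psi\rangle/\langle\psi|\psi\rangle$ is unchanged when `psi` is multiplied by a nonzero constant.

```python
def rayleigh_quotient(H, psi):
    """Return <psi|H|psi> / <psi|psi> for a real symmetric matrix H (list of rows)."""
    n = len(psi)
    # Matrix-vector product H @ psi, written out with plain lists.
    H_psi = [sum(H[i][j] * psi[j] for j in range(n)) for i in range(n)]
    numerator = sum(p * hp for p, hp in zip(psi, H_psi))
    denominator = sum(p * p for p in psi)
    return numerator / denominator


# Hypothetical toy "Hamiltonian" and trial vector, chosen only for illustration.
H = [[1.0, 0.5],
     [0.5, -1.0]]
psi = [0.3, 0.8]

energy = rayleigh_quotient(H, psi)
# Rescaling psi by any nonzero constant (here -7.5) leaves the energy unchanged.
scaled_energy = rayleigh_quotient(H, [-7.5 * p for p in psi])
```

Because the objective depends only on the direction of `psi`, an optimizer has no reason to prefer one normalization over another, which is the property the abstract refers to as scale invariance.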

Paper Structure (21 sections, 11 theorems, 70 equations, 4 figures)

Introduction
Related works
SGD on Riemannian manifolds
Concurrent theoretical analysis
Variational Monte Carlo
Relation to the policy gradient method
Supervised Pre-training
Directionally unbiased gradient estimator
Theoretical results
Convergence result for VMC
Scale-invariant supervised pre-training
Numerical results
Pre-training with scale-invariant loss
Empirical convergence of VMC algorithm
Conclusion
...and 6 more sections

Key Result

Theorem 2.1

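The gradient of the expected reward is $\nabla_\theta \mathbb{E}_{\tau\sim\pi_\theta}[R(\tau)] = \mathbb{E}_{\tau\sim\pi_\theta}\big[R(\tau)\sum_{t}\nabla_\theta\log\pi_\theta(a_t\mid s_t)\big]$, where $\pi_\theta$ is the policy, $R(\tau)$ is the reward of trajectory $\tau$, and $s_t,a_t$ are the states and actions in the trajectory.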
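The policy gradient identity of Theorem 2.1 can be checked numerically on a toy problem. The following is a minimal sketch under stated assumptions, not code from the paper: trajectories have a single step, the policy is a Bernoulli distribution with success probability `sigmoid(theta)`, and the rewards in `REWARDS` are hypothetical. The score-function estimate (reward times the gradient of the log-probability of the sampled action) is compared with the gradient computed in closed form.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical rewards R(a) for the two actions a = 0 and a = 1.
REWARDS = (1.0, 3.0)


def exact_gradient(theta):
    """Closed-form d/dtheta of E[R(a)] = p * R(1) + (1 - p) * R(0), with p = sigmoid(theta)."""
    p = sigmoid(theta)
    return p * (1.0 - p) * (REWARDS[1] - REWARDS[0])


def score_function_estimate(theta, n_samples, rng):
    """Monte Carlo average of R(a) * d/dtheta log pi_theta(a) over sampled actions."""
    p = sigmoid(theta)
    total = 0.0
    for _ in range(n_samples):
        a = 1 if rng.random() < p else 0
        # For a Bernoulli(sigmoid(theta)) policy, d/dtheta log pi_theta(a) = a - p.
        score = a - p
        total += REWARDS[a] * score
    return total / n_samples


theta = 0.3
exact = exact_gradient(theta)
estimate = score_function_estimate(theta, 200_000, random.Random(0))
```

With enough samples, `estimate` should approach `exact`; the gap shrinks roughly as the inverse square root of the sample count.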

Figures (4)

Figure 1: Convergence of supervised pre-training using the scale-invariant training loss (orange) vs. the training loss of vonglehn2023selfattention (blue). For both optimizers the plotted quantity is the sine of the angle between the target state and the trained state, where the angle is defined in $\mathscr L^2(\mathbb R^{3n},\rho=|\varphi|^2)$ with respect to the measure induced by the target state density.
Figure 2: Atomic configuration for the square H$_4$ model.
Figure 3: Convergence of VMC run on the H$_4$ square. The running minimum is taken to smooth out the data and match the form of \ref{cor:VMC}.
Figure 4: Lipschitz constant for VMC run on the H$_4$ square with 1000 walkers. The constant is numerically approximated using the formula $|G(\theta_{m+1})- G(\theta_m)|/|\theta_{m+1} - \theta_m|$.
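The two diagnostics named in the captions of Figures 1 and 4 can be sketched on discrete toy data. This is an illustrative sketch, not the paper's code: `sine_of_angle` assumes the weighted inner product is approximated by a finite sum with hypothetical weights `w`, and `lipschitz_estimate` assumes the norms in the Figure 4 formula are Euclidean. The function names and the toy gradient map `toy_G` are hypothetical.

```python
import math


def weighted_inner(f, g, w):
    """Discrete stand-in for a weighted L^2 inner product: sum_i w_i * f_i * g_i."""
    return sum(wi * fi * gi for fi, gi, wi in zip(f, g, w))


def sine_of_angle(f, g, w):
    """Sine of the angle between f and g under the weighted inner product."""
    cos_sq = weighted_inner(f, g, w) ** 2 / (
        weighted_inner(f, f, w) * weighted_inner(g, g, w)
    )
    # Clamp to guard against tiny negative values from floating-point rounding.
    return math.sqrt(max(0.0, 1.0 - cos_sq))


def lipschitz_estimate(G, theta_a, theta_b):
    """Finite-difference ratio |G(theta_b) - G(theta_a)| / |theta_b - theta_a| (Euclidean norms)."""
    return math.dist(G(theta_a), G(theta_b)) / math.dist(theta_a, theta_b)


# Hypothetical toy data.
weights = [0.5, 0.5]
target = [1.0, 0.0]
trained = [1.0, 1.0]

sine = sine_of_angle(target, trained, weights)
# The sine of the angle ignores the overall scale of either vector.
sine_rescaled = sine_of_angle(target, [10.0 * x for x in trained], weights)


def toy_G(theta):
    """Hypothetical linear gradient map, used only to exercise lipschitz_estimate."""
    return [2.0 * theta[0], 0.5 * theta[1]]


ratio = lipschitz_estimate(toy_G, [0.0, 0.0], [1.0, 0.0])
```

The sine of the angle is a natural scale-invariant error measure, since rescaling the trained state does not change it. The finite-difference ratio gives only a local lower bound on a Lipschitz constant along the path actually taken by the parameters.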

Theorems & Definitions (21)

Theorem 2.1: Policy gradient theorem
Lemma 4.1
Theorem 4.3
Corollary 4.4
Lemma 4.5
Remark 4.6
Remark 4.7
Theorem 4.9
Corollary 4.10
Proof of \ref{eqn:gradient_L_VMC}
...and 11 more