Removing nodal and support-mismatch pathologies in Variational Monte Carlo via blurred sampling

Zhou-Quan Wan; Roeland Wiersema; Shiwei Zhang

Abstract

Variational Monte Carlo (VMC) is a powerful and fast-growing method for optimizing and evolving parameterized many-body wave functions, especially with modern neural-network quantum states. In practice, however, the stochastic estimators that form the backbone of the method can become unstable or biased due to the presence of nodes, a ubiquitous feature of quantum wave functions. In the continuum, this results in heavy-tailed estimators with potentially divergent variances, while in discrete Hilbert spaces the sampling distribution can miss parts of the support needed to form unbiased estimators. These statistical pathologies lead to unreliable optimization trajectories in stochastic reconfiguration or incorrect variational dynamics in time-dependent Variational Monte Carlo (t-VMC), and severely limit the power of the numerical simulations. We introduce blurred sampling to address these difficulties. The method has a number of rigorous properties that make it well behaved, effective, and efficient. Additionally, it is a post-processing approach: it can be used without modifying the underlying sampler and incurs only minimal overhead. We demonstrate its effectiveness on several representative examples where standard sampling approaches are known to fail, and apply it to large-scale problems in spin dynamics. This work establishes a broadly applicable framework for robust VMC and t-VMC calculations.

Paper Structure (29 sections, 65 equations, 7 figures, 2 tables)

Introduction
Background and Preliminaries
Variational Monte Carlo
Statistical Pathologies
Requirements for a Remedy
Blurred sampling method
Construction
Statistical Guarantees and Computational Scaling
Generalizations
Results
Illustration with Previously Identified Difficult Examples
Solution of the Parity Mixing Problem
Unbiased Spin Relaxation Dynamics
Discussion and concluding remarks
Detailed derivation and properties of blurred sampling
...and 14 more sections

Figures (7)

Figure 1: Statistical pathologies and their resolution via blurred sampling. (a) Nodal hypersurfaces lead to divergences in ratio-type estimators due to vanishing probability density $p({\boldsymbol{x}})$. (b) Blurred sampling locally perturbs the configurations, assigning finite probability to the original nodal set, regularizing the divergence. (c) The supports of ${\psi_{\boldsymbol{\theta}}}$ and $\hat{H} {\psi_{\boldsymbol{\theta}}}$ may not coincide (shaded region), resulting in a bias due to support mismatch, see Eq. (\ref{eq:bias}). (d) Blurred sampling exploits the connectivity of $\hat{H}$, allowing configurations in $\text{supp}({\psi_{\boldsymbol{\theta}}})$ to access the mismatched region of $\text{supp}(\hat{H} {\psi_{\boldsymbol{\theta}}})$, eliminating the bias.
Figure 2: Pedagogical example in the continuum. We consider two non-interacting spinless fermions on a ring, with the Hamiltonian $\hat{H} = -\frac{1}{2}(\partial^2_1+\partial^2_2)$, and variational ansatz ${\psi_{\boldsymbol{\theta}}}(x_1, x_2)=\cos(\theta) \sin(x_1-x_2) + \sin(\theta) \sin(2x_1-2x_2)$, $x_{1,2}\in[0,2\pi]$. (a) Distribution of the gradient estimator based on 1000 samples at $\theta=\pi/4$. Standard sampling exhibits a heavy-tailed distribution with exponent $\alpha = 1.5$, while blurred sampling yields a finite-variance estimator. (b) Estimated variance versus number of runs; with standard sampling the estimate does not converge.
Figure 3: Pedagogical example in discrete space. We consider a single-spin system with Hamiltonian $\hat{H} = X$ and ${\psi_{\boldsymbol{\theta}}}=(\cos(\theta), \sin(\theta))$. (a) Gradient estimator at different $\theta$ using 1000 samples. As $\theta \to 0$, standard sampling exhibits large fluctuations and develops a systematic bias, whereas blurred sampling remains stable. (b) Energy optimization via stochastic reconfiguration (SR), initialized at $\theta=\pi/3$. Standard sampling can become trapped near $\theta\to 0$, while blurred sampling follows the exact imaginary-time trajectory.
Figure 4: t-VMC results for two known benchmark problems. (a) We use the variational state $\ket{{\psi_{\boldsymbol{\theta}}}} =\alpha\ket{0} + \beta\ket{1}$, with $(\alpha,\beta)=(1,1)$ at $t=0$, and evolve it under $U(t)=\exp\{-itY\}$ with t-VMC. As shown in Ref. [Filippo2023tVMC], the standard Monte Carlo estimator collapses due to finite-support bias as the variational state gets close to $\ket{0}$, causing the dynamics to diverge. With blurred sampling, the correct dynamics is recovered. (b) A complex RBM with a single hidden unit fails to provide the correct dynamics under t-VMC for the $2\times2$ Heisenberg Hamiltonian [vrcan2025instability], as the spin correlation diverges from the exact result when the standard sampler is used. The blurred sampling method produces the correct dynamics. In both cases, blurred sampling uses the Hamiltonian-induced kernel with $q=0.5$.
Figure 5: Accurate simulations of spin parity mixing with blurred sampling in t-VMC. (a) Time evolution of the parity operator for the quench of the $8$-spin state $\ket{\psi^\text{even}_{\boldsymbol{\theta}}}$ under the TFIM Hamiltonian at $J=h=1$. The variational state is an RBM with $32$ hidden units. Different lines indicate simulations with different values of the blur strength $q$. The inset shows that starting the t-VMC simulation with the standard estimator at $t_0=0$ leads to the immediate freezing of the dynamics due to the bias of the force. The dotted line shows the expected behavior of the parity for the standard $t=0$ dynamics. If we start the simulation at $t_0=0.05$, the standard estimator performs well for the initial ramp of the oscillation, but diverges for longer times. (b) The effective sample size $\mathrm{ESS} = \mathbb{E}[\omega({\boldsymbol{x}}')]^2/\mathbb{E}[\omega({\boldsymbol{x}}')^2]$. The dashed lines indicate the value $1-q$. At initialization, the $\mathrm{ESS}$ is exactly $1-q$. For all $q$, the $\mathrm{ESS}$ peaks whenever the parity goes through $0$, indicating that the state is balanced between odd and even parity sectors. (c) Time evolution of the parity operator using a Gaussian-state ansatz, with $n=64$. We show the slow ($h=1/8$) and fast ($h=1$) quench on the left and right, respectively. The dotted line indicates the expected freezing of the t-VMC dynamics with the standard estimator. Blurred sampling uses the Hamiltonian-induced kernel with $q=0.5$.
...and 2 more figures
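
The single-spin example of Figure 3 is small enough to work out exactly, which makes the support-mismatch bias easy to see. The sketch below is an illustration under stated assumptions, not the paper's construction: it uses $\hat{H} = X$ and ${\psi_{\boldsymbol{\theta}}}=(\cos\theta, \sin\theta)$ from the caption, the textbook ratio-type gradient estimator for real wave functions, and, as a stand-in for blurring, a hypothetical mixture proposal that flips the spin with probability $q$ combined with plain importance sampling. All function names are invented for this example. The last helper evaluates the effective sample size formula quoted in the Figure 5 caption.

```python
import math

def psi(theta):
    """Amplitudes of the single-spin ansatz (cos θ, sin θ)."""
    return (math.cos(theta), math.sin(theta))

def h_psi(theta):
    """Action of H = X on the ansatz: the two amplitudes are swapped."""
    a, b = psi(theta)
    return (b, a)

def dpsi(theta):
    """Derivative of the amplitudes with respect to θ."""
    return (-math.sin(theta), math.cos(theta))

def exact_gradient(theta):
    """E(θ) = <ψ|X|ψ> = sin 2θ, so dE/dθ = 2 cos 2θ."""
    return 2.0 * math.cos(2.0 * theta)

def standard_expectation(theta):
    """Expectation of the ratio-type gradient estimator under p = |ψ|².

    Configurations with p(x) = 0 are never drawn, so they are skipped.
    """
    amp, hamp, damp = psi(theta), h_psi(theta), dpsi(theta)
    support = [x for x in (0, 1) if amp[x] != 0.0]
    p = {x: amp[x] ** 2 for x in support}            # already normalized
    o = {x: damp[x] / amp[x] for x in support}       # d log ψ / dθ
    e_loc = {x: hamp[x] / amp[x] for x in support}   # local energy
    o_mean = sum(p[x] * o[x] for x in support)
    return 2.0 * sum(p[x] * (o[x] - o_mean) * e_loc[x] for x in support)

def prob_miss(theta, n_samples):
    """Probability that n_samples draws from |ψ|² never visit x = 1."""
    return math.cos(theta) ** (2 * n_samples)

def mixture(theta, q):
    """ASSUMED proposal: keep x with prob. 1-q, flip the spin with prob. q."""
    p = [a * a for a in psi(theta)]
    return p, [(1 - q) * p[x] + q * p[1 - x] for x in (0, 1)]

def mixture_expectation(theta, q):
    """Expectation of an importance-sampled gradient under the mixture.

    Every term is (integrand / p̃) averaged over p̃, so zeros of ψ never
    appear in a denominator.
    """
    amp, hamp, damp = psi(theta), h_psi(theta), dpsi(theta)
    _, pb = mixture(theta, q)
    force = sum(pb[x] * (damp[x] * hamp[x] / pb[x]) for x in (0, 1))
    overlap = sum(pb[x] * (damp[x] * amp[x] / pb[x]) for x in (0, 1))
    energy = sum(pb[x] * (amp[x] * hamp[x] / pb[x]) for x in (0, 1))
    return 2.0 * (force - overlap * energy)

def ess(theta, q):
    """ESS = E[ω]² / E[ω²] with ω = p / p̃ and expectations under p̃."""
    p, pb = mixture(theta, q)
    w = [p[x] / pb[x] for x in (0, 1)]
    m1 = sum(pb[x] * w[x] for x in (0, 1))
    m2 = sum(pb[x] * w[x] ** 2 for x in (0, 1))
    return m1 ** 2 / m2
```

Away from the node the two expectations agree with $2\cos 2\theta$. At $\theta=0$ the standard estimator returns zero because the configuration carrying the whole gradient is never sampled, while the mixture proposal recovers the exact value of 2. Under this assumed proposal the effective sample size at $\theta=0$ equals $1-q$, in line with the value quoted for the initial state in the Figure 5 caption. For small but nonzero $\theta$, `prob_miss` shows why finite-sample runs look biased: the missing configuration is visited so rarely that most runs never see it.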
