Addressing the Infinite Variance Problem in Fermionic Monte Carlo Simulations: Retrospective Error Remediation and the Exact Bridge Link Method

Zhou-Quan Wan, Shiwei Zhang

TL;DR

The paper addresses the infinite variance problem in fermionic determinantal quantum Monte Carlo (DQMC) caused by zeros in the sampling weight, which leads to heavy-tailed distributions and unreliable error bars even when the sign problem is absent. It introduces two complementary solutions: retrospective tail-aware error remediation to produce robust confidence intervals, and the exact bridge link method, a minimal-overhead modification that eliminates infinite variance without bias. Analysis shows local observables in Hubbard-like models exhibit heavy tails with a characteristic exponent near $\alpha\approx 1.5$, and the exact bridge link method removes the variance divergence while preserving sign-free sampling. The authors demonstrate the approach on the attractive SU(4) Hubbard model, computing charge-4e correlations with dramatically reduced statistical noise and uncovering distinct tail behavior that enables access to previously inaccessible observables, thereby improving reliability for benchmarking and fundamental studies in fermionic simulations.

Abstract

We revisit the infinite variance problem in fermionic Monte Carlo simulations, which is widely encountered in areas ranging from condensed matter to nuclear and high-energy physics. The different algorithms, which we broadly refer to as determinantal quantum Monte Carlo (DQMC), are applied in many situations and differ in details, but they share a foundation in field theory, and often involve fermion determinants whose symmetry properties make the algorithm sign-problem-free. We show that the infinite variance problem arises as the observables computed in DQMC tend to form heavy-tailed distributions. To remedy this issue retrospectively, we introduce a tail-aware error estimation method to correct the otherwise unreliable estimates of confidence intervals. Furthermore, we demonstrate how to perform DQMC calculations that eliminate the infinite variance problem for a broad class of observables. Our approach is an exact bridge link method, which involves a simple and efficient modification to the standard DQMC algorithm. The method introduces no systematic bias and is straightforward to implement with minimal computational overhead. Our results establish a practical and robust solution to the infinite variance problem, with broad implications for improving the reliability of a variety of fundamental fermion simulations.

Paper Structure

This paper contains 12 sections, 2 theorems, 34 equations, 8 figures.

Key Result

Theorem 1

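Let $X_1, X_2,\dots$ be i.i.d. copies of $X$, where $X$ has characteristic function $\Phi_X(u)$ and satisfies the tail conditions $\lim_{x\to\infty} x^{\alpha}\,P(X<-x)=c^-$ and $\lim_{x\to\infty} x^{\alpha}\,P(X>x)=c^+$, where $c^-\geq 0$, $c^+\geq 0$ and $0<c^-+c^+<\infty$. Then the normalized sum $a_N\sum_{i=1}^{N}X_i-b_N$ converges in distribution to $Z$ for suitable normalizing constants $a_N>0$ and $b_N$, with $Z\sim\mathbf{S}(\mathrm{min}(2,\alpha),\beta,1,0)$ and $\beta=(c^+-c^-)/(c^++c^-)$.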
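For a tail exponent $1<\alpha<2$, the generalized central limit theorem implies that the spread of a sample mean shrinks as $N^{1/\alpha-1}$ rather than the familiar $N^{-1/2}$. The following minimal sketch checks this numerically on synthetic data. It is not DQMC data: the Pareto toy distribution, the function names, the bin sizes, and the use of the interquartile range as a variance-free measure of spread are all assumptions made for illustration.

```python
import numpy as np

def pareto_samples(rng, alpha, shape):
    """Draw X with P(X > x) = x**(-alpha) for x >= 1 (inverse-transform sampling)."""
    u = 1.0 - rng.random(shape)  # uniform on (0, 1], so u is never zero
    return u ** (-1.0 / alpha)

def iqr_of_sample_means(rng, alpha, n, replicas=1000):
    """Interquartile range of the mean of n samples, estimated over many replicas."""
    means = pareto_samples(rng, alpha, (replicas, n)).mean(axis=1)
    q25, q75 = np.percentile(means, [25, 75])
    return q75 - q25

def spread_ratio(alpha=1.5, n_small=64, n_large=4096, seed=0):
    """Ratio of the spread of the sample mean at n_large to that at n_small."""
    rng = np.random.default_rng(seed)
    small = iqr_of_sample_means(rng, alpha, n_small)
    large = iqr_of_sample_means(rng, alpha, n_large)
    return large / small

alpha, n_small, n_large = 1.5, 64, 4096
ratio = spread_ratio(alpha, n_small, n_large)
gclt_prediction = (n_large / n_small) ** (1.0 / alpha - 1.0)  # 64**(-1/3) = 0.25
clt_prediction = (n_large / n_small) ** -0.5                  # 64**(-1/2) = 0.125

print(f"measured spread ratio: {ratio:.3f}")
print(f"heavy-tail prediction: {gclt_prediction:.3f}")
print(f"Gaussian prediction:   {clt_prediction:.3f}")
```

In this toy setting the measured ratio should sit near the heavy-tail prediction and well above the Gaussian one. This is the same qualitative behavior as bin averages whose spread decreases more slowly than $1/\sqrt{N_{\text{bin}}}$.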

Figures (8)

  • Figure 1: (a) Energy estimates from 50 independent DQMC runs. Error bars indicate twice the SEM, corresponding to a conventional $\sim$95.4% confidence interval. Orange points denote cases where the true energy value (black horizontal line, obtained using the exact bridge link method described in Sec. \ref{sec:eliminating_ivp}) lies outside the estimated confidence interval. (b) Variance of the energy estimator versus sample size for the first 5 independent runs, with each color indicating one run.
  • Figure 2: Characteristics of the probability distribution from DQMC data. (a) Probability density of raw energy measurements taken at the middle time slice of each sweep, from the dataset in Fig. 1(a). The purple curve shows a normal distribution $\mathcal{N}(\bar{E}, \sigma )$ for comparison, with $\sigma = \sqrt{\frac{\pi}{2}}\cdot\overline{|E-\bar{E}|}$, chosen to match the mean absolute deviation of the data since the variance diverges in this case. The dashed orange line is the fitted power-law model from panel (b). (b) Tail distribution of the energy data, which exhibits a power-law decay. The dashed orange line displays the result from the fit using Eq. (\ref{eq:power-fit-tail}). (c) The probability density of bin-averaged energy data for different bin sizes $N_{\text{bin}}$. The spread of $\bar{E}_{N_\text{bin}}$ decreases as $N_{\text{bin}}$ is increased, but not as rapidly as $1/\sqrt{N_{\text{bin}}}$, and the distribution does not approach a Gaussian. (d) Data collapse of all $\bar{E}_{N_\text{bin}}$ data onto a stable distribution $\mathbf{S}(X;\alpha, 1, \gamma, 0)$. The result of the fit using this form, shown by the blue dashed line, is consistent with that from the fit in panel (b).
  • Figure 3: Error remediation incorporating tail information. (a) Tail-aware error estimation for the same dataset shown in Fig. 1(a). The confidence interval is constructed by rescaling the original SEM estimate, $\text{SE}_N$. The interval is given by $(\bar{E} - \xi_\downarrow\, \text{SE}_N, \bar{E} + \xi_\uparrow\, \text{SE}_N)$, with $\xi_\uparrow = 1.22$ and $\xi_\downarrow = 3.90$, which are obtained following the method outlined in the text. Out of the 50 data points, 2 (indicated in orange) are outside the corrected interval, consistent with the target confidence level. (b) The scaling factors $\xi_\uparrow$ (dashed lines) and $\xi_\downarrow$ (solid lines) as functions of the characteristic exponent $\alpha$, corresponding to the $2\sigma$ confidence level. The symbols show the final $\xi_\uparrow$ and $\xi_\downarrow$ values used in panel (a).
  • Figure 4: Nodal structure of the spin-resolved weight and local energy distribution. Panels (a) and (b) show ${\omega}_{\uparrow}(\boldsymbol{x})$ as colormaps in two planar cuts in the space of auxiliary-field configurations. A configuration is first identified by detecting a sign change during the simulation. The last update causing the sign change occurred with the particular field $x_i$, which is shown as the horizontal axis in both panels ($\Delta x_i$ represents the displacement from the node position). In (a), the plane is $\Delta x_i$-$\Delta x_j$, where site $j$ is a near-neighbor of $i$. In (b), the plane is $\Delta x_i$-$\Delta x_k$, where $k$ is the furthest site from $i$. Each inset shows ${\omega}_{\uparrow}(\boldsymbol{x})$ plotted along the black dashed line indicated in the main plot. Panel (c) shows the distribution of local energies for four interaction strengths. Dashed lines represent power-law fits of the form $y=\frac{\alpha c}{(x_0-x)^{\alpha+1}}$. The resulting characteristic exponents are shown in the figure and the tail coefficients are $0.7\pm0.4,4.1\pm1.5,15.4\pm3.5,(1.8\pm0.4)\times 10^2$ for $U=-2,-3,-4,-8$, respectively. The system is a $6\times 6\times 2$ honeycomb lattice Hubbard model at half filling. The site $i$ is $(5,5,A)$, while $j=(5,4,B)$ and $k=(2,2,A)$ ($A$ and $B$ are sublattice indices). Each histogram in (c) is from $10^7$ data points. Each ${\omega}_{\uparrow}(\boldsymbol{x})$ is normalized by its maximum value within the plotted region in (a) and (b).
  • Figure 5: Illustrative results from the exact bridge link method. (a) DQMC results for the same system as in Fig. 1(a). Note that the vertical axis scale has been shrunk by more than $15\times$ from Fig. 1(a). Identical simulation parameters are used, except that the bridge link method is employed with bridge operator parameter $\theta = 1$ (Eq. (\ref{bridge_op})). Error bars are estimated using the jackknife method. (b) Scatter plot of $|E_{\text{loc}}|$ versus $\Lambda_{\text{loc}}$. While both quantities can diverge in magnitude, their ratio converges to a constant, indicated by the dashed line. (c),(d) Probability densities of the ratio $E_{\text{loc}}/\Lambda_{\text{loc}}$ and of $\Lambda_{\text{loc}}^{-1}$, respectively.
  • ...and 3 more figures
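
The asymmetric interval quoted in the Figure 3 caption, $(\bar{E} - \xi_\downarrow\, \text{SE}_N, \bar{E} + \xi_\uparrow\, \text{SE}_N)$, can be sketched in a few lines. This is only an illustration of the rescaling step. The function name and the toy input are hypothetical, the samples are assumed to be independent (real DQMC data would need binning first), and the factors $\xi_\downarrow = 3.90$ and $\xi_\uparrow = 1.22$ are simply copied from the caption rather than derived from the tail exponent as the paper does.

```python
import numpy as np

def tail_aware_interval(samples, xi_down, xi_up):
    """Rescale the conventional SEM asymmetrically around the sample mean.

    Returns (mean - xi_down * SE_N, mean + xi_up * SE_N), where SE_N is the
    usual standard error of the mean, assuming independent samples.
    """
    x = np.asarray(samples, dtype=float)
    mean = x.mean()
    sem = x.std(ddof=1) / np.sqrt(x.size)
    return mean - xi_down * sem, mean + xi_up * sem

# Toy data, not from the paper; the xi values are those quoted for Figure 3(a).
toy_energies = [1.0, 2.0, 3.0, 4.0, 5.0]
low, high = tail_aware_interval(toy_energies, xi_down=3.90, xi_up=1.22)
print(f"interval: ({low:.4f}, {high:.4f})")
```

The interval is wider on the lower side because $\xi_\downarrow > \xi_\uparrow$, which reflects a heavy tail toward low energies. A symmetric $\pm 2\,\text{SE}_N$ interval would understate that side.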

Theorems & Definitions (4)

  • Definition 1
  • Definition 2
  • Theorem 1: Generalized Central Limit Theorem
  • Corollary 1.1