Accelerating ground-state auxiliary-field quantum Monte Carlo simulations by delayed update and block force-bias update

Hao Du, Yuan-Yao He

TL;DR

This work targets the dominant cost in ground-state AFQMC simulations, the local update of auxiliary fields, by introducing two acceleration strategies that preserve the existing scaling while reducing the computational prefactor. The delayed update replaces many vector outer products with a single matrix-matrix multiplication, controlled by a delay rank $n_d$, while the block force-bias update partitions a time slice into blocks of size $n_b$ so that the acceptance ratio stays reasonable at higher fillings. Through PQMC tests on the 2D Hubbard model and the spin-orbit coupled (SOC) Hubbard model, the authors demonstrate speedups of up to about 8x for systems with roughly 1600 lattice sites and provide guidance on tuning $n_d$ and $n_b$ across parameter regimes. They further discuss applications to ground-state CPQMC and present an application diagram showing where each update scheme is most efficient for general correlated fermion systems.

Abstract

Ground-state auxiliary-field quantum Monte Carlo (AFQMC) methods have become key numerical tools for studying quantum phases and phase transitions in interacting many-fermion systems. Despite their broad applicability, the efficiency of these algorithms is often limited by the bottleneck associated with the local update of the field configuration. In this work, we propose two novel update schemes, the delayed update and the block force-bias update, both of which can generally and efficiently accelerate ground-state AFQMC simulations. The delayed update, with a predetermined delay rank, is an improved version of the local update that accelerates the process by replacing multiple vector-vector outer products with a single matrix-matrix multiplication. The block force-bias update is a block variant of the conventional force-bias update, which is highly efficient for dilute systems but suffers from a low acceptance ratio in lattice models. Our modified scheme maintains the high efficiency while offering flexible tuning of the acceptance ratio, controlled by the block size, for any desired fermion filling. We apply these two update schemes to both the standard and spin-orbit coupled two-dimensional Hubbard models, demonstrating their speedup over the local update as a function of the delay rank and block size. We also explore their efficiencies across varying system sizes and model parameters. Our results identify a speedup of $\sim 8$ for systems with $\sim 1600$ lattice sites. Furthermore, we investigate the broader applications of these update schemes to general correlated fermion systems and summarize them in an application diagram.
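The idea behind the delayed update, trading many rank-1 outer products for one matrix-matrix product, can be illustrated with a small NumPy sketch. The rank-1 form used here (a coefficient times a column of `G` times a shifted row of `G`) is a generic stand-in, not the paper's exact expression, and the Metropolis accept/reject step is omitted. The names `local_updates`, `delayed_updates`, `sites`, and `coeffs` are hypothetical.

```python
import numpy as np


def local_updates(G, sites, coeffs):
    """Apply the rank-1 updates one at a time (one outer product each)."""
    G = G.copy()
    for i, c in zip(sites, coeffs):
        col = G[:, i].copy()
        row = G[i, :].copy()
        row[i] -= 1.0
        G += c * np.outer(col, row)  # O(n^2) memory-bound update
    return G


def delayed_updates(G, sites, coeffs, n_d):
    """Same updates, but accumulate up to n_d rank-1 factors and apply
    them to G with a single matrix-matrix product per flush."""
    G = G.copy()
    n = G.shape[0]
    U = np.empty((n, n_d))   # accumulated left vectors
    Vt = np.empty((n_d, n))  # accumulated right vectors
    k = 0                    # number of pending rank-1 factors
    for i, c in zip(sites, coeffs):
        # Rebuild only the column and row of the *current* G that this
        # update needs, using the pending factors (O(n * k) work).
        col = G[:, i] + U[:, :k] @ Vt[:k, i]
        row = G[i, :] + U[i, :k] @ Vt[:k, :]
        row[i] -= 1.0
        U[:, k] = c * col
        Vt[k, :] = row
        k += 1
        if k == n_d:         # delay rank reached: flush with one GEMM
            G += U @ Vt
            k = 0
    if k > 0:                # flush any leftover factors
        G += U[:, :k] @ Vt[:k, :]
    return G


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    G0 = rng.standard_normal((64, 64))
    sites = rng.integers(0, 64, size=40)
    coeffs = rng.uniform(-0.01, 0.01, size=40)
    diff = local_updates(G0, sites, coeffs) - delayed_updates(G0, sites, coeffs, 16)
    print("max |local - delayed| =", np.abs(diff).max())
```

Both routines produce the same matrix up to rounding. The delayed version reads and writes the full matrix only once per `n_d` updates, which is where a prefactor gain on cache-based hardware would come from.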

Paper Structure

This paper contains 25 sections, 67 equations, 10 figures, 6 tables.

Figures (10)

  • Figure 1: The average time for update per sweep (in seconds) using the delayed update as a function of $n_d/L$ (with $n_d$ as the predetermined delay rank), in PQMC simulations of the 2D Hubbard model (\ref{eq:2DHubbard}) with (a) $U/t=-4,n=1.0$, and (b) $U/t=-6,n=0.625$. The time data are rescaled by factors $\times 5,\times 1/3,\times 1/25,\times 1/100$ for $L=8,16,24,32$ in (a), and by $\times 10,\times 1/8,\times 1/35$ for $L=8,24,32$ in (b). The leftmost data points (solid symbols) at $n_d/L=0$ are results from the conventional local update scheme. The insets in both panels illustrate corresponding speedups achieved by the delayed update compared to the local update. The timing data are summarized in Tables \ref{Table:A1} and \ref{Table:A2} in Appendix \ref{sec:AppendixE}.
  • Figure 2: Comparison of the average time for update per sweep (in seconds) between the local and delayed updates, as a function of $L$ in PQMC simulations of the 2D Hubbard model (\ref{eq:2DHubbard}) with (a) $U/t=-1,n=1.0$, and (b) $U/t=-1,n=0.5$ (with simulation parameters $2\Theta t=40$ and $\Delta\tau t=0.10$ for both cases). The values of $n_d/L=1.0$ and $n_d/L=0.5$ are used in (a) and (b), respectively. Blue dashed lines plot the theoretical computational complexity ($\propto L^6$), while green dashed lines show algebraic fits ($\propto L^{\alpha}$) to the consumed time of delayed update, yielding fitted exponents $\alpha$ slightly below $6$. The insets in both panels illustrate corresponding speedups achieved by the delayed update compared to the local update.
  • Figure 3: Comparisons of (a) the average time for update per sweep (in seconds), and (b) the acceptance ratio, between the local update and full force-bias update in PQMC simulations of the 2D interacting Fermi gas in the crossover regime [$\log(k_Fa)=+0.50$]. Results are shown for two cases, $N_e=10$ and $N_e=58$. Panels (a) and (b) are log-log and semilog plots, respectively. The dashed lines in (a) represent algebraic fits ($\propto L^{\alpha}$) to the consumed time, yielding fitted exponents $\alpha$ all close to 2, as expected. The inset of (a) illustrates the corresponding speedups achieved by the full force-bias update compared to the local update.
  • Figure 4: Comparisons of the average time for update per sweep (in seconds) and the acceptance ratio between the local update and full force-bias update in PQMC simulations of the 2D Hubbard model (\ref{eq:2DHubbard}). Panels (a) and (b) show results for the model with $U/t=-4,n=1.0$ (with parameters $2\Theta t$ and $\Delta\tau t$ included), while (c) and (d) are for $U/t=-6,n=0.625$. In (a) and (c), blue dashed lines plot the theoretical scaling ($\propto L^{6}$), and green dashed lines show the algebraic fits ($\propto L^{\alpha}$) to the consumed time of full force-bias update, yielding fitted exponents $\alpha$ slightly below $6$. Insets in (a) and (c) illustrate the corresponding speedups achieved by the full force-bias update compared to the local update.
  • Figure 5: The log-log plot of $(1-R)$ (with $R$ denoting the acceptance ratio) versus $\Delta\tau t$ for the full force-bias update in PQMC simulations of the 2D Hubbard model (\ref{eq:2DHubbard}) with $U/t=-4,n=1.0$. Results are shown for $L=4,8,16$. The dashed lines represent algebraic fits as $(1-R)\propto(\Delta\tau t)^{\eta}$, with fitted exponents $\eta$ close to 1, suggesting a linear dependence.
  • ...and 5 more figures
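
The algebraic fits quoted in the captions above, such as time $\propto L^{\alpha}$ or $(1-R)\propto(\Delta\tau t)^{\eta}$, amount to a straight-line least-squares fit in log-log space. A minimal sketch of such a fit follows. The helper `fit_power_law` is hypothetical, and the input numbers are synthetic values generated from an exact power law, not timings from the paper.

```python
import math


def fit_power_law(x, y):
    """Least-squares fit of y = a * x**alpha in log-log space.

    Returns (a, alpha). All x and y values must be positive.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (w - mean_y) for u, w in zip(lx, ly))
    alpha = sxy / sxx                       # slope = exponent
    a = math.exp(mean_y - alpha * mean_x)   # intercept = log prefactor
    return a, alpha


# Synthetic data following t = 2e-9 * L**6 exactly (made-up prefactor).
sizes = [8, 16, 24, 32]
times = [2e-9 * L**6 for L in sizes]
a, alpha = fit_power_law(sizes, times)
print(f"prefactor = {a:.3e}, exponent = {alpha:.3f}")
```

With measured timings, a fitted exponent slightly below the theoretical value, as the captions report, shows up as a slope a little under 6.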