Leveraging recurrence in neural network wavefunctions for large-scale simulations of Heisenberg antiferromagnets on the square lattice

M. Schuyler Moss, Roeland Wiersema, Mohamed Hibat-Allah, Juan Carrasquilla, Roger G. Melko

TL;DR

This work develops and benchmarks a two-dimensional recurrent neural network variational wavefunction for the square-lattice Heisenberg antiferromagnet, enabling ground-state studies on lattices with more than 1000 spins via iterative retraining. By reusing trained weights across increasing system sizes and enforcing symmetries, the authors perform finite-size scaling to extract thermodynamic-limit observables such as the ground-state energy and sublattice magnetization, achieving close agreement with quantum Monte Carlo and PEPS benchmarks. The study also compares energies and correlation observables, showing that stronger variational energies do not always guarantee improved correlations and highlighting the need to assess both energies and correlations in variational studies. The results demonstrate that RNN-based variational states can capture key ground-state properties in the thermodynamic limit and suggest directions for improving expressiveness and exploring other lattices with scalable train-time strategies.

Abstract

Machine-learning-based variational Monte Carlo simulations are a promising approach for targeting quantum many-body ground states, especially in two dimensions and in cases where the ground state is known to have a non-trivial sign structure. While many state-of-the-art variational energies have been reached with these methods for finite-size systems, little work has been done to use these results to extract information about the target state in the thermodynamic limit. In this work, we employ recurrent neural networks (RNNs) as variational ansätze, and leverage their recurrent nature to simulate the ground states of progressively larger systems through iterative retraining. This transfer learning technique allows us to simulate spin-$\frac{1}{2}$ systems on lattices with more than 1,000 spins without beginning optimization from scratch for each system size, thus reducing the demand for computational resources. In this study, we focus on the square-lattice antiferromagnetic Heisenberg model, where it is possible to carefully benchmark our results. We show that we are able to systematically improve the accuracy of our simulation results by increasing the training time, and obtain results for finite-size lattices that are in good agreement with literature values. Furthermore, we use these results to extract accurate estimates of the ground-state properties in the thermodynamic limit. This work demonstrates that RNN wavefunctions are able to extract accurate information about quantum many-body systems in the thermodynamic limit.

Paper Structure

This paper contains 22 sections, 20 equations, 14 figures, 6 tables.

Figures (14)

  • Figure 1: A 2D RNN wavefunction defined for a square lattice with $L=4$. The autoregressive sequence is defined by the red arrows. Sampling and inference are performed along this path. The information in the network, stored in the hidden vectors, is passed in two directions, along the black arrows. The black dotted arrows show how pseudo-periodic boundary connections can be built into the RNN wavefunction. Both the two-dimensional information passing and the pseudo-periodic boundary connections are implemented in a causal way such that the autoregressive sequence is not violated.
  • Figure 2: The number of training steps used in the optimization for each system size as determined by our parameterized training schedule defined in \ref{eq:schedule}. We consider five different scales and two rates. The colors and markers are used to indicate these two parameters, respectively. Throughout this work, when results from all of our simulations are presented, this legend will be used. Note that when $s=1$ and $r=0.475$, our training schedule is similar to that which has been used in other iterative retraining studies \cite{roth_iterative_2020, hibat-allah_supplementing_2024}.
  • Figure 3: (a) The final variational energies obtained from all optimized RNN wavefunctions for each system size plotted according to the known scaling form of the energy, $1/L^3$ \cite{neuberger_finite-size_1989, hasenfratz_finite_1993}. All of the final variational energies were estimated with $10\times10^3$ samples. Reference energies for each $L$ \cite{anders_unpublished} and the thermodynamic limit (TL) value of the ground-state energy per spin \cite{sandvik_finite-size_1997} from QMC are shown for reference. The inset shows a zoomed view of the region close to zero plotted on a log scale for easier viewing. (b) The variances of the above final variational energies as a function of the number of spins in the system $N$.
  • Figure 4: Energy estimates obtained from extrapolating the variational energies to their zero-variance value compared to reference energies obtained from QMC simulations \cite{anders_unpublished}. The thermodynamic limit (TL) value of the ground-state energy per spin from QMC \cite{sandvik_finite-size_1997} is shown for reference. The inset shows a zoomed view of the region close to zero plotted on a log scale for easier viewing.
  • Figure 5: Estimates of the squared sublattice magnetization scaled as a function of $1/L$. Estimates of $M^2$ and $M_C^2$ are shown to demonstrate that they should extrapolate to the same value in the thermodynamic limit. The inset shows a zoomed view of the $y$ intercept. Closed markers represent the values of $M^2$ and $M_C^2$ estimated using the correlations defined by \ref{eq:real_space_corr}, while open markers represent the values of $M^2$ and $M_C^2$ that are estimated using the correlations defined by \ref{eq:real_space_corr_z}. Finite-size estimates and the thermodynamic limit (TL) value of the squared sublattice magnetization from QMC \cite{sandvik_loop_2010} are shown for reference.
  • ...and 9 more figures
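
The Figure 3 caption states that the finite-size energies are plotted against the known scaling form $1/L^3$. As a rough illustration of how such data can be extrapolated to the thermodynamic limit, the sketch below fits $E(L) = e_\infty + c/L^3$ by ordinary least squares. This is a minimal sketch under stated assumptions, not the authors' fitting procedure: the function name `extrapolate_energy`, the two-parameter form without higher-order corrections, and all numerical values are hypothetical placeholders chosen only to exercise the fit. None of the numbers are taken from the paper.

```python
import numpy as np


def extrapolate_energy(sizes, energies):
    """Least-squares fit of E(L) = e_inf + c / L**3; returns (e_inf, c).

    Hypothetical helper: assumes the leading 1/L^3 correction alone
    describes the data (no higher-order terms, no error bars).
    """
    sizes = np.asarray(sizes, dtype=float)
    energies = np.asarray(energies, dtype=float)
    # Design matrix: a constant column (intercept) and a 1/L^3 column (slope).
    design = np.column_stack([np.ones_like(sizes), sizes ** -3])
    (e_inf, c), *_ = np.linalg.lstsq(design, energies, rcond=None)
    return float(e_inf), float(c)


# Synthetic placeholder data generated from the assumed scaling form.
# These values are illustrative only and are NOT results from the paper.
SIZES = [6, 8, 10, 12, 16]
TRUE_E_INF, TRUE_C = -0.67, -2.0
ENERGIES = [TRUE_E_INF + TRUE_C / L**3 for L in SIZES]

e_inf, c = extrapolate_energy(SIZES, ENERGIES)
print(f"fitted e_inf = {e_inf:.6f}, fitted c = {c:.4f}")
```

With real variational energies one would typically weight each point by its statistical error and check how sensitive the intercept is to dropping the smallest lattices. The same pattern, with a $1/L$ column in place of $1/L^3$, would apply to the sublattice-magnetization data described for Figure 5.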