Supplementing Recurrent Neural Network Wave Functions with Symmetry and Annealing to Improve Accuracy

Mohamed Hibat-Allah, Roger G. Melko, Juan Carrasquilla

TL;DR

The paper tackles obtaining accurate ground states for 2D Heisenberg models, including frustrated lattices with sign problems, by developing a complex 2D RNN wave function with autoregressive sampling. It introduces symmetry-enforced variational Monte Carlo optimization and a variational annealing scheme that augments the energy with a pseudo-entropy term to form a variational free energy $F_{\bm{\theta}}(n)$, cooled according to a schedule $T(n)$ to navigate rugged optimization landscapes. Empirical results show improvements on both square and triangular lattices: symmetry constraints yield tighter energies on square lattices, while annealing enables accurate energies on large triangular lattices (up to $16\times16$), surpassing DMRG for sizes $\ge 14\times14$ with far fewer parameters. The approach offers a flexible, scalable variational framework for studying frustrated quantum many-body systems and for exploring larger 2D spin systems.
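To make the annealing idea concrete, here is a minimal sketch of a free energy of the form $F = \langle E \rangle - T\,S$ with a pseudo-temperature that is cooled to zero. The linear schedule, the function names, and the toy two-state distribution are assumptions made for illustration; the summary above only states that a cooling schedule $T(n)$ is used. In a real calculation both the energy and the entropy would be estimated from samples of the autoregressive wave function rather than from an enumerated distribution.

```python
import math

def linear_temperature(step, n_annealing, t0=1.0):
    """Pseudo-temperature T(n). ASSUMPTION: linear decay from t0 to 0."""
    return t0 * max(0.0, 1.0 - step / n_annealing)

def shannon_entropy(probs):
    """Shannon entropy S = -sum_i p_i log p_i of a normalized distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def free_energy(energies, probs, temperature):
    """Toy free energy F = <E> - T * S over an explicitly enumerated distribution."""
    mean_energy = sum(p * e for p, e in zip(probs, energies))
    return mean_energy - temperature * shannon_entropy(probs)

# Hypothetical two-state example: the entropy term rewards spread-out
# distributions early on and vanishes as the temperature reaches zero.
energies = [-1.0, 1.0]
probs = [0.5, 0.5]
for step in (0, 5, 10):
    t = linear_temperature(step, n_annealing=10)
    print(step, t, free_energy(energies, probs, t))
```

Setting `t0=0.0` removes the entropy term entirely, which corresponds to the plain 'VMC' baseline ($T_0 = 0$) that the figure captions contrast with 'VMC with annealing' ($T_0 = 1$).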

Abstract

Recurrent neural networks (RNNs) are a class of neural networks that emerged from the paradigm of artificial intelligence and have enabled many interesting advances in the field of natural language processing. Interestingly, these architectures were shown to be powerful ansätze for approximating the ground state of quantum systems. Here, we build on the results of [Phys. Rev. Research 2, 023358 (2020)] and construct a more powerful RNN wave function ansatz in two dimensions. We use symmetry and annealing to obtain accurate estimates of ground state energies of the two-dimensional (2D) Heisenberg model on the square lattice and on the triangular lattice. We show that our method is superior to Density Matrix Renormalisation Group (DMRG) for system sizes larger than or equal to $14 \times 14$ on the triangular lattice.

Paper Structure (10 sections, 7 equations, 4 figures, 1 table)

Figures (4)

  • Figure 1: (a) A graphical illustration of a 2D RNN. Each RNN cell receives two hidden states $\bm{h}_{i,j-1}$ and $\bm{h}_{i-1,j}$, as well as two input vectors $\bm{\sigma}_{i,j-1}$ and $\bm{\sigma}_{i-1,j}$ (not shown) as illustrated by the black arrows. The green arrows correspond to the zigzag path we use for 2D autoregressive sampling. The initial memory state $\bm{h}_0$ of the RNN and the initial inputs $\bm{\sigma}_0$ (not shown) are null vectors. (b) A flowchart describing our annealing implementation.
  • Figure 2: (a) A plot of the relative error $\epsilon$ after applying different symmetries of the Heisenberg model on the square lattice with size $6 \times 6$. The relative error is computed with respect to the DMRG energy [roth2020iterative]. $C_4$ is the point group of four rotations. (b) A comparison of the energy per site obtained with our 2DRNN ansatz and PEPS [Liu_2017], PixelCNN [Sharir_2020] and QMC [Liu_2017] on the Heisenberg model on the square lattice with size $10 \times 10$.
  • Figure 3: (a) Scaling of the relative error $\epsilon$ against the number of annealing steps $N_{\rm annealing}$ for the triangular Heisenberg model with size $4\times4$. 'VMC' corresponds to an initial pseudo-temperature $T_0 = 0$, whereas for 'VMC with annealing' we start with $T_0 = 1$. (b) A plot of the energy difference per site between the 2DRNN and DMRG. Negative values show that our ansatz is superior to DMRG for system sizes larger than or equal to $14 \times 14$.
  • Figure 4: (a) A plot of the energy variance per spin $\sigma^2$ against the number of gradient descent steps, for both a 2D non-gated RNN and a 2D gated RNN. Here we choose the Heisenberg model on a square lattice with size $6\times6$ as a test bed. (b) Scaling of the relative error against the number of annealing steps $N_{\rm annealing}$ for the square Heisenberg model with size $4\times4$. 'VMC' corresponds to an initial pseudo-temperature $T_0 = 0$, whereas for 'VMC with annealing' we start with $T_0 = 1$.
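
The zigzag path used for 2D autoregressive sampling in Figure 1(a) can be illustrated with a short sketch that lists lattice sites in the order they would be visited. The starting corner, the direction of the first row, and the function name `zigzag_path` are assumptions made for illustration; the caption only states that sampling follows a zigzag path.

```python
def zigzag_path(n_rows, n_cols):
    """Return lattice sites (i, j) in zigzag (boustrophedon) order.

    ASSUMPTION: even rows run left to right and odd rows run right to left,
    starting from the site (0, 0).
    """
    path = []
    for i in range(n_rows):
        if i % 2 == 0:
            cols = range(n_cols)                # left to right
        else:
            cols = range(n_cols - 1, -1, -1)    # right to left
        path.extend((i, j) for j in cols)
    return path

# Visiting order on a hypothetical 2 x 3 lattice.
print(zigzag_path(2, 3))
```

With this ordering, consecutive sites on the path are always nearest neighbours on the lattice, so each newly sampled spin sits next to the one sampled just before it.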