Hyperbolic recurrent neural network as the first type of non-Euclidean neural quantum state ansatz

H. L. Dao

TL;DR

This work introduces the first non-Euclidean neural quantum state by employing a hyperbolic GRU within Variational Monte Carlo to approximate ground states of quantum many-body systems. It systematically benchmarks hyperbolic NQS against Euclidean NQS and DMRG across 1D TFIM, 2D TFIM, and 1D Heisenberg J1J2/J1J2J3 models, highlighting notable gains in systems with hierarchical interaction structures. The results suggest that hyperbolic geometry can enhance expressive power for NQS in settings with layered interactions, while also acknowledging increased training complexity. The study opens avenues for extending non-Euclidean NQS to other hyperbolic models and higher dimensions, potentially broadening the scope of quantum many-body variational ansatz design.

Abstract

In this work, we introduce the first type of non-Euclidean neural quantum state (NQS) ansatz, in the form of the hyperbolic GRU (a variant of recurrent neural networks (RNNs)), to be used in the Variational Monte Carlo method of approximating the ground state energy of quantum many-body systems. In particular, we examine the performance of NQS ansatzes constructed from conventional (Euclidean) RNN/GRU and from hyperbolic GRU in the prototypical settings of the one- and two-dimensional transverse field Ising models (TFIM) and the one-dimensional Heisenberg $J_1J_2$ and $J_1J_2J_3$ systems. In all of the experiments performed in this work, hyperbolic GRU yields performance comparable to or better than that of Euclidean RNNs, which have been extensively studied in these settings in the literature; our work is therefore a proof of concept for the viability of hyperbolic GRU as the first type of non-Euclidean NQS ansatz for quantum many-body systems. Furthermore, in settings where the Hamiltonian displays a clear hierarchical interaction structure, such as the 1D Heisenberg $J_1J_2$ and $J_1J_2J_3$ systems with first-, second- and even third-nearest-neighbor interactions, our results show that hyperbolic GRU definitively outperforms its Euclidean version in almost all instances. These results are reminiscent of established ones from natural language processing, where hyperbolic GRU almost always outperforms Euclidean RNNs when the training data exhibit a tree-like or hierarchical structure. This leads us to hypothesize that a hyperbolic GRU NQS ansatz would likely outperform Euclidean RNN/GRU NQS ansatzes in quantum spin systems that involve different degrees of nearest-neighbor interactions. Finally, with this work, we hope to initiate future studies of other types of non-Euclidean NQS beyond hyperbolic GRU.

Paper Structure

This paper contains 33 sections, 28 equations, 17 figures, 7 tables.

Figures (17)

  • Figure 1: Schematic of the process of calculating the RNN wavefunction $\Psi(\vec{{\sigma}})= \sqrt{P(\vec{{\sigma}})}|\vec{{\sigma}}\rangle$ from the probability $P(\vec{{\sigma}})$ of the sample $\vec{{\sigma}}$. Here $P(\vec{{\sigma}}) = P({\sigma}_1)P({\sigma}_2|{\sigma}_1)\ldots P({\sigma}_N|{\sigma}_{N-1})$. For a compact representation, $N=4$ in the schematic. The RNN in the diagram can be either Euclidean RNN/GRU or Hyperbolic GRU. Figure adapted from rnn_20.
  • Figure 2: The process of generating the length-$N$ samples $\vec{\sigma} = \left({\sigma}_1, \ldots, {\sigma}_{N}\right)$ using an RNN-based neural network. Each entry ${\sigma}_i$ of $\vec{{\sigma}}$ is sampled from the output $y_i$ (generated by the RNN-Dense network using as input the previous entry ${\sigma}_{i-1}$ and the previous RNN hidden state $h_{i-1}$) and then one-hot encoded to be used as input, together with the previous hidden state $h_{i-1}$, for the generation of the next entry ${\sigma}_{i+1}$. For $i=0$, the input $\vec{{\sigma}}_0$ and first hidden state $h_0$ are initialized to zero. Figure adapted from rnn_20.
  • Figure 3: Schematic of the calculation of the RNN wavefunction $\Psi(\vec{{\sigma}}) = \sum_{\vec{{\sigma}}}\exp(i\phi(\vec{{\sigma}})) \sqrt{P(\vec{{\sigma}})}|\vec{{\sigma}}\rangle$ where the amplitude $P(\vec{{\sigma}}) = P({\sigma}_1)P({\sigma}_2|{\sigma}_1)\ldots P({\sigma}_N|{\sigma}_{N-1})$ and the phase $\phi(\vec{{\sigma}}) = \sum_{i=1}^N \phi(\vec{{\sigma}}_i)$. For ease of illustration, $N$ is chosen to be 3. (S) and (SS) denote the dense layers with the Softmax and Softsign activation functions, respectively. Figure adapted from rnn_20.
  • Figure 4: VMC experiments for 1D TFIM with $N = 20, 40, 80$ and $100$ spins (from top to bottom row, left to right): Comparisons of the performances of the variants eRNN-50-s50, eGRU-50-s50, hypGRU-50-s50 (listed in Table \ref{tab:1dtfim_setting}), abbreviated as eRNN, eGRU and hGRU on the horizontal axis of each subfigure. In each subfigure, the mean energy for each NQS ansatz is shown as a dot with an error bar (representing the standard error).
  • Figure 5: Comparisons of the performances of 1D Euclidean GRU and 1D hyperbolic GRU ansatzes listed in Table \ref{tab:2d_tfim_setting} for the 2D TFIM with different ($N_x, N_y$) lattices, clockwise from top: $(N_x, N_y) = (5,5), (7,7), (9,9), (8,8)$. Each dot with error bar represents the mean $E$ value of an NQS ansatz. For all cases, 2D Euclidean RNN is the best ansatz and is not included in this comparison, which strictly measures the performances of the 1D Euclidean versus 1D hyperbolic GRU. The second best ansatz is always 1D hyperbolic GRU for all $(N_x, N_y)$.
  • ...and 12 more figures
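
The sampling loop described in the captions of Figures 1 and 2 can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: it assumes a plain Euclidean tanh RNN cell with random, untrained weights rather than a GRU or hyperbolic GRU, and the class name `TinyRNNSampler`, its parameters, and the layer sizes are hypothetical choices made for illustration. Each spin is drawn from a softmax output, one-hot encoded, and fed back as the next input; the hidden state carries the history, so each conditional depends on all earlier spins. The positive wavefunction amplitude is then $\sqrt{P(\vec{\sigma})}$.

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax over a 1D vector.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


class TinyRNNSampler:
    """Hypothetical vanilla (Euclidean) tanh RNN cell plus a softmax dense layer."""

    def __init__(self, hidden=8, local_dim=2, seed=0):
        rng = np.random.default_rng(seed)
        self.hidden, self.local_dim = hidden, local_dim
        # Random, untrained weights: an assumption for illustration only.
        self.W = rng.normal(scale=0.3, size=(hidden, hidden))
        self.U = rng.normal(scale=0.3, size=(hidden, local_dim))
        self.b = np.zeros(hidden)
        self.V = rng.normal(scale=0.3, size=(local_dim, hidden))
        self.c = np.zeros(local_dim)

    def step(self, x_onehot, h):
        # One recurrent update followed by the softmax dense layer.
        h_new = np.tanh(self.W @ h + self.U @ x_onehot + self.b)
        y = softmax(self.V @ h_new + self.c)  # conditional probabilities
        return y, h_new

    def sample(self, n_spins, rng):
        # Initial input and hidden state are zero, as in Figure 2.
        h = np.zeros(self.hidden)
        x = np.zeros(self.local_dim)
        sigma, log_p = [], 0.0
        for _ in range(n_spins):
            y, h = self.step(x, h)
            s = rng.choice(self.local_dim, p=y)  # draw sigma_i from y_i
            log_p += np.log(y[s])
            sigma.append(int(s))
            x = np.eye(self.local_dim)[s]  # one-hot encode for the next step
        return np.array(sigma), log_p

    def log_prob(self, sigma):
        # log P(sigma) as a sum of log conditionals for a given configuration.
        h = np.zeros(self.hidden)
        x = np.zeros(self.local_dim)
        log_p = 0.0
        for s in sigma:
            y, h = self.step(x, h)
            log_p += np.log(y[s])
            x = np.eye(self.local_dim)[s]
        return log_p


rng = np.random.default_rng(1)
model = TinyRNNSampler()
sigma, log_p = model.sample(n_spins=4, rng=rng)
amplitude = np.exp(0.5 * log_p)  # Psi(sigma) = sqrt(P(sigma)), positive case
```

Because every conditional is normalized by the softmax, the product of conditionals is a normalized distribution over all $2^N$ configurations, which is what allows exact sampling without a Markov chain. Swapping the `step` method for a GRU or hyperbolic GRU update would leave the sampling loop unchanged.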