Hierarchical autoregressive neural networks in three-dimensional statistical system

Piotr Białas, Vaibhav Chahar, Piotr Korcyl, Tomasz Stebel, Mateusz Winiarski, Dawid Zapolski

TL;DR

This work extends Hierarchical Autoregressive Networks (HAN) to the three-dimensional Ising model to enable efficient neural sampling and direct probability evaluation, based on the autoregressive factorization $p(\boldsymbol{s}) = p(s^1)\prod_{i=2}^{N} p(s^i \mid s^1, \ldots, s^{i-1})$, where $N$ is the number of spins.

Abstract

Autoregressive Neural Networks (ANN) have been recently proposed as a mechanism to improve the efficiency of Monte Carlo algorithms for several spin systems. The idea relies on the fact that the total probability of a configuration can be factorized into conditional probabilities of each spin, which in turn can be approximated by a neural network. Once trained, the ANNs can be used to sample configurations from the approximated probability distribution and to explicitly evaluate this probability for a given configuration. It has also been observed that such conditional probabilities give access to information-theoretic observables such as mutual information or entanglement entropy. In this paper, we describe the hierarchical autoregressive network (HAN) algorithm in three spatial dimensions and study its performance using the example of the Ising model. We compare HAN with three other autoregressive architectures and the classical Wolff cluster algorithm. Finally, we provide estimates of thermodynamic observables for the three-dimensional Ising model, such as entropy and free energy, in a range of temperatures across the phase transition.
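The factorization described in the abstract can be illustrated with a minimal sketch. It is not the HAN architecture or any network from the paper: the conditional probability of each spin is modeled here by a hypothetical one-layer logistic function of the earlier spins (the names `conditional_p_up`, `weights`, and `bias` are assumptions made for illustration). The sketch shows the two operations the abstract mentions: drawing a configuration spin by spin, and evaluating the probability of a given configuration.

```python
import numpy as np

def conditional_p_up(spins, i, weights, bias):
    """Probability that spin i is +1 given spins 0..i-1.

    Hypothetical stand-in for a neural network: a logistic function
    that only looks at spins with index smaller than i.
    """
    field = bias[i] + weights[i, :i] @ spins[:i]
    return 1.0 / (1.0 + np.exp(-field))

def sample(n_spins, weights, bias, rng):
    """Draw one configuration spin by spin and accumulate its log-probability."""
    spins = np.zeros(n_spins)
    log_q = 0.0
    for i in range(n_spins):
        p_up = conditional_p_up(spins, i, weights, bias)
        s = 1.0 if rng.random() < p_up else -1.0
        spins[i] = s
        log_q += np.log(p_up if s > 0 else 1.0 - p_up)
    return spins, log_q

def log_prob(spins, weights, bias):
    """Evaluate log q(s) for a given configuration as a sum of conditional terms."""
    log_q = 0.0
    for i in range(len(spins)):
        p_up = conditional_p_up(spins, i, weights, bias)
        log_q += np.log(p_up if spins[i] > 0 else 1.0 - p_up)
    return log_q

rng = np.random.default_rng(0)
n = 8  # toy system size, chosen only for illustration
weights = np.tril(rng.normal(scale=0.5, size=(n, n)), k=-1)  # strictly lower triangular
bias = rng.normal(scale=0.1, size=n)

config, log_q = sample(n, weights, bias, rng)
print(config, log_q, log_prob(config, weights, bias))
```

Because each conditional depends only on earlier spins, the product of conditionals is normalized by construction, so the probability of any configuration is available without computing a partition function.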

Paper Structure

This paper contains 14 sections, 14 equations, 8 figures, 1 table.

Figures (8)

  • Figure 1: Left: a hierarchical partitioning in 3D for $L=4$. The full system of $4\times4\times4$ spins is divided into 3 subsets (denoted by colors). Center and right: for better visibility of the 3D system, four slices of the $4\times4\times4$ cube are also shown. Spins generated at a given step of the hierarchy are denoted by colors (see the text for details): red (first step, first network), green (second step, second network), blue (third step, heat-bath algorithm).
  • Figure 2: A 3D kernel of PixelCNN. The kernel is divided into stacks. Elements multiplied by 0 are denoted with white color.
  • Figure 3: Left: schematic view of one gated PixelCNN layer. Right: architecture of the gated PixelCNN network. See text for the description.
  • Figure 4: Mean time (in seconds) to generate 10 batches using the VAN (blue circles), HAN (orange circles), PixelCNN (red circles) and Gated PixelCNN (green circles) algorithms as a function of the linear system size $L$. Asymptotic approximations $\sim L^9$ (blue dashed line), $\sim L^6$ (orange dashed line) and $\sim L^6$ (green dashed line), for VAN, HAN and Gated PixelCNN, respectively, are superimposed on the plot. All measurements were performed on an Nvidia GeForce 4090 GPU.
  • Figure 5: Moving average of $ESS$ for different models as a function of epoch (left panel) and time (right panel) for $\beta = 0.22$, $L=8$. Curves with the same colors show different runs.
  • ...and 3 more figures
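
The caption of Figure 5 refers to $ESS$ without defining it. A minimal sketch of one commonly used normalized effective sample size for importance weights is given below. It is an assumption that the paper uses this exact definition, $ESS = (\sum_k w_k)^2 / (N \sum_k w_k^2)$, and the function name `effective_sample_size` is hypothetical. The log-weights would typically be $\log w = -\beta E(\boldsymbol{s}) - \log q(\boldsymbol{s})$ for configurations drawn from the network.

```python
import numpy as np

def effective_sample_size(log_w):
    """Normalized effective sample size in (0, 1] from unnormalized log-weights.

    Assumed definition: ESS = (sum w)^2 / (N * sum w^2).
    Subtracting the maximum log-weight avoids overflow and leaves the ratio unchanged.
    """
    log_w = np.asarray(log_w, dtype=float)
    w = np.exp(log_w - log_w.max())
    return w.sum() ** 2 / (len(w) * (w ** 2).sum())

# Equal weights: the sampler matches the target perfectly.
ess_uniform = effective_sample_size(np.zeros(100))

# One configuration dominates: only 1 of 4 samples contributes.
ess_dominated = effective_sample_size([0.0, -1000.0, -1000.0, -1000.0])
```

With this definition a value close to 1 indicates that the approximated distribution is close to the target, while a value close to $1/N$ indicates that a single sample dominates the estimate.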