Variational Neural Annealing

Mohamed Hibat-Allah, Estelle M. Inack, Roeland Wiersema, Roger G. Melko, Juan Carrasquilla

TL;DR

The paper tackles NP-hard optimization problems by reframing them as ground-state searches of Ising Hamiltonians. It proposes variational neural annealing, which combines autoregressive RNNs with the variational principle to anneal either a classical distribution $p_\lambda$ (variational classical annealing, VCA) or a quantum wavefunction $|\Psi_\lambda\rangle$ (variational quantum annealing, VQA) while progressively lowering fluctuations. Across random 1D chains, the Edwards-Anderson model, and fully connected spin glasses (the Sherrington-Kirkpatrick model, SK, and the Wishart planted ensemble, WPE), VCA generally outperforms simulated annealing (SA), simulated quantum annealing (SQA), and even VQA, especially for long annealing schedules. It is aided by exact autoregressive sampling, which enables efficient entropy and gradient estimation. The results suggest that the choice of architecture shapes the optimization path, with potential extensions that use reinforcement learning to tune annealing schedules and problem-specific models for scalable optimization on rough landscapes.

Abstract

Many important challenges in science and technology can be cast as optimization problems. When viewed in a statistical physics framework, these can be tackled by simulated annealing, where a gradual cooling procedure helps search for groundstate solutions of a target Hamiltonian. While powerful, simulated annealing is known to have prohibitively slow sampling dynamics when the optimization landscape is rough or glassy. Here we show that by generalizing the target distribution with a parameterized model, an analogous annealing framework based on the variational principle can be used to search for groundstate solutions. Modern autoregressive models such as recurrent neural networks provide ideal parameterizations since they can be exactly sampled without slow dynamics even when the model encodes a rough landscape. We implement this procedure in the classical and quantum settings on several prototypical spin glass Hamiltonians, and find that it significantly outperforms traditional simulated annealing in the asymptotic limit, illustrating the potential power of this yet unexplored route to optimization.
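To make the claim about exact sampling concrete, here is a minimal sketch of an autoregressive distribution over $N$ spins, $p(s) = \prod_i p(s_i \mid s_{<i})$. It is not the paper's RNN: the conditionals come from a hypothetical linear model (weights `W`, biases `b`) chosen only to keep the example short. The point is that each configuration is drawn spin by spin with a known probability, so no Markov chain is needed and the entropy $S = -\langle \log p \rangle$ can be estimated directly from the samples.

```python
import itertools
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def conditional_up(W, b, prefix):
    """P(s_i = +1 | s_1..s_{i-1}), with i = len(prefix). Toy linear model (assumption)."""
    i = len(prefix)
    logit = b[i] + sum(W[i][j] * prefix[j] for j in range(i))
    return sigmoid(logit)

def log_prob(W, b, spins):
    """Exact log p(s) as a sum of conditional log-probabilities."""
    total = 0.0
    for i, s in enumerate(spins):
        p_up = conditional_up(W, b, spins[:i])
        total += math.log(p_up if s == 1 else 1.0 - p_up)
    return total

def sample(W, b, rng):
    """Draw one configuration spin by spin; every draw is independent."""
    spins = []
    for _ in range(len(b)):
        p_up = conditional_up(W, b, spins)
        spins.append(1 if rng.random() < p_up else -1)
    return spins

def entropy_estimate(W, b, n_samples, rng):
    """Monte Carlo estimate of S = -E[log p(s)] from exact samples."""
    total = sum(log_prob(W, b, sample(W, b, rng)) for _ in range(n_samples))
    return -total / n_samples

rng = random.Random(0)
N = 4
b = [rng.uniform(-1, 1) for _ in range(N)]
W = [[rng.uniform(-1, 1) for _ in range(N)] for _ in range(N)]

# The conditionals are normalized, so the joint sums to one over all 2^N states.
total_prob = sum(math.exp(log_prob(W, b, list(c)))
                 for c in itertools.product([-1, 1], repeat=N))
S = entropy_estimate(W, b, 2000, rng)
```

Only the entries `W[i][j]` with `j < i` are ever read, which is what makes the model autoregressive. An RNN plays the same role in the paper, with a hidden state summarizing the previously sampled spins.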

Paper Structure

This paper contains 16 sections, 72 equations, 9 figures, 1 table.

Figures (9)

  • Figure 1: Schematic illustration of the space of probability distributions visited during simulated annealing. An arbitrarily slow SA visits a series of Boltzmann distributions starting at high temperature (e.g. $T=\infty$) and ending in the $T=0$ Boltzmann distribution (continuous yellow line), where a perfect solution to an optimization problem is reached. These solutions are found either at an edge or at a corner (for non-degenerate problems) of the standard probabilistic simplex (colored triangle plane). A practical, finite-time SA trajectory (red dotted line), as well as a variational classical annealing trajectory (green dashed line), deviates from the trajectory of exact Boltzmann distributions.
  • Figure 2: Variational neural annealing protocols. (a) The variational classical annealing (VCA) algorithm steps. A warm-up step brings the initialized variational state (green dot) close to the minimum of the free energy (cyan dot) at a given value of the order parameter $M$. This step is followed by an annealing and a training step that brings the variational state back to the new free energy minimum. Repeating the last two steps until $T(t=1)=0$ (red dots) produces approximate solutions to $H_{\rm target}$ if the protocol is conducted slowly enough. This schematic illustration corresponds to annealing through a continuous phase transition with an order parameter $M$. (b) Variational quantum annealing (VQA). VQA includes a warm-up step, followed by an annealing and a training step, which brings the variational energy (green dot) closer to the new ground state energy (cyan dot). We loop over the previous two steps until reaching the target ground state of $\hat{H}_{\rm target}$ (red dot) if annealing is performed slowly enough.
  • Figure 3: Variational neural annealing on a random Ising chain. Here we represent the residual energy per site $\epsilon_{\rm res}/N$ vs the number of annealing steps $N_{\rm annealing}$ for both VQA and VCA. The system sizes are $N = 32,64,128$. We use random positive couplings $J_{i,i+1} \in [0,1)$ (see text for more details). The error bars represent the one s.d. statistical uncertainty calculated over different disorder realizations [Norris1940].
  • Figure 4: Benchmarking the two-dimensional Edwards-Anderson spin glass. (a) A comparison between VCA, VQA, RVQA, and CQO on a $10 \times 10$ lattice by plotting the residual energy per site vs $N_{\text{annealing}}$. For CQO, we report the residual energy per site vs the number of optimization steps $N_{\rm steps}$. (b) Comparison between SA, SQA with $P=20$ Trotter slices, and VCA using a 2D tensorized RNN ansatz on a $40 \times 40$ lattice. The annealing speed is the same for SA, SQA and VCA.
  • Figure 5: Benchmarking SA, SQA ($P=100$ Trotter slices) and VCA on the Sherrington-Kirkpatrick (SK) model and the Wishart planted ensemble (WPE). Panels (a), (b), and (c) display the residual energy per site as a function of $N_{\rm annealing}$. (a) The SK model with $N = 100$ spins. (b) WPE with $N=32$ spins and $\alpha = 0.5$. (c) WPE with $N=32$ spins and $\alpha = 0.25$. Panels (d), (e), and (f) display the residual energy histogram for each of the different techniques and models in panels (a), (b), and (c), respectively. The histograms use $25000$ data points for each method. Note that we choose a minimum threshold of $10^{-10}$ for $\epsilon_{\rm res}/N$, which is within our numerical accuracy.
  • ...and 4 more figures
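
The VCA loop described in the Figure 2(a) caption (warm-up, then alternating annealing and training steps until $T(t=1)=0$) and the residual energy per site plotted in Figure 3 can be sketched on a small random Ising chain. This is an illustrative toy under several assumptions, not the authors' implementation: a linear autoregressive model stands in for the RNN, the schedule is taken to be linear, $T(t) = T_0(1-t)$, the chain Hamiltonian is assumed to be $H = -\sum_i J_{i,i+1} s_i s_{i+1}$, and all hyperparameters (`batch`, `lr`, `n_warmup`, `n_train`) are hypothetical. Training minimizes the variational free energy $F = \langle H \rangle + T \langle \log p \rangle$ with a score-function gradient estimate.

```python
import math
import random

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def energy(J, spins):
    """Open Ising chain, H = -sum_i J[i] s_i s_{i+1} (sign convention assumed)."""
    return -sum(J[i] * spins[i] * spins[i + 1] for i in range(len(J)))

def temperature(t, T0=1.0):
    """Linear schedule from T0 at t=0 to 0 at t=1 (an assumed choice)."""
    return T0 * (1.0 - t)

def sample_with_grad(W, b, rng):
    """Exact sample s, log p(s), and d log p / d logit_i for every site i."""
    spins, log_p, dlogit = [], 0.0, []
    for i in range(len(b)):
        p_up = sigmoid(b[i] + sum(W[i][j] * spins[j] for j in range(i)))
        up = rng.random() < p_up
        spins.append(1 if up else -1)
        log_p += math.log(p_up if up else 1.0 - p_up)
        dlogit.append((1.0 if up else 0.0) - p_up)
    return spins, log_p, dlogit

def train_step(W, b, J, T, rng, batch=50, lr=0.1):
    """One gradient step on F = <H> + T <log p>; returns the batch estimate of F."""
    draws = [sample_with_grad(W, b, rng) for _ in range(batch)]
    local = [energy(J, s) + T * lp for s, lp, _ in draws]
    baseline = sum(local) / batch  # subtracting the mean reduces variance
    for (s, _, d), f in zip(draws, local):
        w = lr * (f - baseline) / batch
        for i in range(len(b)):
            b[i] -= w * d[i]
            for j in range(i):
                W[i][j] -= w * d[i] * s[j]
    return baseline

def vca(J, n_warmup=50, n_annealing=100, n_train=5, seed=0):
    """Warm up at T(0), then alternate annealing and training until T(1) = 0."""
    rng = random.Random(seed)
    n = len(J) + 1
    b = [0.0] * n
    W = [[0.0] * n for _ in range(n)]
    for _ in range(n_warmup):
        train_step(W, b, J, temperature(0.0), rng)
    for k in range(1, n_annealing + 1):
        T = temperature(k / n_annealing)   # annealing step
        for _ in range(n_train):           # training step at the new temperature
            train_step(W, b, J, T, rng)
    samples = [sample_with_grad(W, b, rng)[0] for _ in range(100)]
    return min(energy(J, s) for s in samples)

rng = random.Random(1)
J = [rng.random() for _ in range(7)]   # random positive couplings in [0, 1)
ground = -sum(J)                       # positive couplings: all spins aligned
res_per_site = (vca(J) - ground) / (len(J) + 1)
```

Because the couplings are positive, the exact ground-state energy is known in closed form, which makes the residual energy easy to evaluate here. For the glassy models in Figures 4 and 5 the reference energy has to come from elsewhere.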