Generalized Probabilistic Approximate Optimization Algorithm

Abdelrahman S. Abdelrahman, Shuvro Chowdhury, Flaviano Morone, Kerem Y. Camsari

TL;DR

This work introduces PAOA, a generalized variational Monte Carlo framework that learns non-equilibrium annealing strategies for Ising-like problems on probabilistic hardware. By coupling a classical outer optimization with a p-bit inner sampler, PAOA unifies global, local, and fully parameterized annealing schedules within a Markov-flow formulation, recovering simulated annealing as a limit and enabling on-chip FPGA annealing. Empirical results demonstrate competitive performance against QAOA on the SK model, hardware acceleration with substantial speedups, and the discovery of heterogeneous annealing schedules in heavy-tailed SK variants. The approach offers a scalable, hardware-compatible classical alternative for large-scale optimization and provides a tool for discovering novel algorithmic heuristics in disordered systems.

Abstract

We introduce a generalized Probabilistic Approximate Optimization Algorithm (PAOA), a classical variational Monte Carlo framework that extends and formalizes prior work by Weitz et al. \cite{Combes_2023}, enabling parameterized and fast sampling on present-day Ising machines and probabilistic computers. PAOA operates by iteratively modifying the couplings of a network of binary stochastic units, guided by cost evaluations from independent samples. We establish a direct correspondence between derivative-free updates and the gradient of the full Markov flow over the exponentially large state space, showing that PAOA admits a principled variational formulation. Simulated annealing emerges as a limiting case under constrained parameterizations, and we implement this regime on an FPGA-based probabilistic computer with on-chip annealing to solve large 3D spin-glass problems. Benchmarking PAOA against QAOA on the canonical 26-spin Sherrington-Kirkpatrick model with matched parameters reveals superior performance for PAOA. We show that PAOA naturally extends simulated annealing by optimizing multiple temperature profiles, leading to improved performance over SA on heavy-tailed problems such as SK-Lévy.
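The loop described in the abstract (an inner sampler of binary stochastic units, scored by the average cost over independent runs, wrapped in a derivative-free outer update) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the most constrained parameterization (one global inverse temperature per layer), one sequential sweep per layer, the standard p-bit update $m_i = \mathrm{sgn}(\tanh(\beta I_i) - r)$ with $r$ uniform in $[-1, 1]$, and a plain random local search as a stand-in for the outer optimizer. The function names `energy`, `run_layers`, `mean_energy`, and `optimize_schedule` are hypothetical, and `J` is assumed symmetric with a zero diagonal.

```python
import math
import random


def energy(J, h, m):
    """Ising energy E = -sum_{i<j} J_ij m_i m_j - sum_i h_i m_i."""
    n = len(m)
    pair = sum(J[i][j] * m[i] * m[j] for i in range(n) for j in range(i + 1, n))
    return -pair - sum(h[i] * m[i] for i in range(n))


def run_layers(J, h, betas, rng):
    """One independent run: uniform random start, then one p-bit sweep per layer."""
    n = len(h)
    m = [rng.choice((-1, 1)) for _ in range(n)]
    for beta in betas:  # layer k uses the global inverse temperature betas[k]
        for i in range(n):  # sequential p-bit updates
            field = h[i] + sum(J[i][j] * m[j] for j in range(n))
            # p-bit rule: m_i = sgn(tanh(beta * I_i) - r), r uniform in [-1, 1]
            m[i] = 1 if math.tanh(beta * field) > rng.uniform(-1.0, 1.0) else -1
    return m


def mean_energy(J, h, betas, n_runs, seed):
    """Average final energy over n_runs independent runs (the sampled cost)."""
    rng = random.Random(seed)
    total = sum(energy(J, h, run_layers(J, h, betas, rng)) for _ in range(n_runs))
    return total / n_runs


def optimize_schedule(J, h, betas0, n_runs=200, n_iters=60, step=0.3, seed=0):
    """Derivative-free outer loop. ASSUMPTION: random local search, used here
    only as a simple stand-in for whatever optimizer is actually employed."""
    rng = random.Random(seed)
    best = list(betas0)
    best_cost = mean_energy(J, h, best, n_runs, seed=1)
    for _ in range(n_iters):
        trial = [max(0.0, b + rng.gauss(0.0, step)) for b in best]
        # Reusing seed=1 gives common random numbers across candidate schedules.
        cost = mean_energy(J, h, trial, n_runs, seed=1)
        if cost < best_cost:
            best, best_cost = trial, cost
    return best, best_cost
```

Starting from a flat schedule such as `[1.0, 1.0, 1.0]`, the returned schedule is never worse than the starting one on the sampled cost, because every candidate is scored with the same random stream. Assigning separate schedules to different groups of nodes would correspond to the less constrained parameterizations the abstract mentions.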

Paper Structure

This paper contains 17 sections, 32 equations, 13 figures, 4 tables, 2 algorithms.

Figures (13)

  • Figure 1: Overview of PAOA. A hybrid classical--probabilistic architecture iteratively updates a weight matrix $J$ and bias vector $h$ using feedback from a probabilistic computer. The p-computer samples from a distribution defined by $(J,h)$, approximating the exact Markov flow. The resulting samples $\hat{\boldsymbol{\rho}}_p$ are used to evaluate a cost function, which the classical optimizer minimizes by adjusting the variational parameters. Here, $\boldsymbol{\rho}$ denotes the exact probability distribution over spin configurations, $\hat{\boldsymbol{\rho}}$ the sampled approximation, $p$ the layer number, $M$ the total number of spins represented in the ansatz (including possible hidden variables), and $\{m\}$ a specific spin configuration (state). The cost function is typically the energy of a spin glass mapped to an optimization problem (e.g., Eq. \ref{eq3}) but can also be a likelihood function if PAOA is learning from data (e.g., Eq. \ref{cost_function}). Importantly, the ansatz size $M$ can exceed the original problem size $N$, since hidden variables may be introduced to increase the representational capacity. In this work, we use $M=N$ in all experiments unless noted. At convergence ($p=k$), the distribution concentrates around low-energy solutions.
  • Figure 2: Majority gate benchmark: comparing analytical and sampling-based PAOA. (a) Fully connected four-node network used to implement the majority gate $Y = \text{MAJ}(A,B,C)$, where $Y = A \lor B$ if $C = 1$ and $Y = A \land B$ if $C = 0$. The table lists the eight valid input-output combinations, labeled by their decimal encoding. (b) Training loss during optimization, comparing exact gradients (blue) to gradient-free optimization using COBYLA (orange) with $10^7$ MCMC samples. Both methods converge to the same minimum. (c) Time evolution of the exact distribution over two layers using analytical Markov matrices. The initial uniform distribution ($p = 0$) is transformed into a peaked distribution ($p = 2$) concentrated on the correct truth table entries. (d) Corresponding evolution using MCMC samples and COBYLA. The approximated distributions closely match the exact dynamics.
  • Figure 3: Discovering simulated annealing with PAOA on a 3D spin-glass problem. (a) The hybrid architecture combines an FPGA-based p-computer for MCMC sampling, where $\{m\}$ represents the sampled state, with a classical CPU that optimizes the global annealing schedule ($\beta^{(k)}$) using the average energy ($\langle E \rangle$) computed across independent experiments/runs. (b) Energy histograms on a single $L^3 = 6^3$ instance before ($\beta = 2$, red) and after (best $\beta$ schedule, blue) optimization. The optimized schedule shifts the distribution (shaded area) toward the putative ground state (energy $= -360$), increasing how often it is found across $N_E = 10^5$ independent runs. (c) Optimized schedules for $p$-layer architectures where $p \in \{5, 10, 15\}$. Starting from a flat initial schedule ($\beta = 2$, green), PAOA consistently discovers cooling schedules (best shown in red) that resemble SA. Faint curves show all 100 optimization runs. The insets display the sorted success probabilities, demonstrating that deeper architectures improve the average success probability ($\langle \mathrm{p}_s\rangle$) and reduce run-to-run variability.
  • Figure 4: PAOA vs QAOA on the Sherrington-Kirkpatrick model. (a) PAOA results (left) using a two-schedule ansatz ($\beta_1$ and $\beta_2$) with $2p$ parameters, compared against QAOA (right) with $2p$ parameters ($\gamma$ and $\beta$). For each depth $p$, the PAOA schedules are optimized on a separate training set; the average schedule is then applied to 30 random test instances of size $N = 26$ without retraining. QAOA results use optimal parameters from prior work \cite{farhi2022qaoa, QAOA_max_cut_farhi}. Red crosses denote averages across the 30 instances, blue dots show individual instance energies, and the solid green line indicates the average ground-state energy per spin. (b) Approximation ratios of PAOA (red squares) and QAOA (blue circles), averaged across the 30 instances. Error bars indicate the 95% confidence intervals computed from $10^4$ bootstrap samples with replacement.
  • Figure 5: Learning a variational principle using a PAOA double-schedule ansatz. (a) Heavy-tailed SK: per-node coupling strengths for $50$ instances of $N = 50$, sorted in descending order; the heavy-tailed distribution separates heavy nodes (assigned the $\beta_2$ schedule) from light nodes (assigned the $\beta_1$ schedule). (b) PAOA training with two schedules ($2p$ parameters) assigned to heavy and light nodes based on their coupling strengths, initialized from an optimized single annealing schedule (cyan). Curves are averaged across 50 instances; red/blue denote heavy/light nodes. (c) The extrapolated double schedule suggested by (b): the heavy-node schedule is extended to more layers, and the light-node schedule is scaled up following PAOA's guidance. (d) Average success probability over 500 instances comparing single-schedule SA (blue) and double-schedule PAOA (red) for $N = 50$. Error bars are 95% confidence intervals from $10^5$ bootstrap samples with replacement. (e) Per-instance comparison of the success probability of PAOA ($2p$) and SA ($p$) at $p=5\times 10^5$.
  • ...and 8 more figures
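
The majority-gate identity quoted in the Figure 2 caption, $Y = A \lor B$ if $C = 1$ and $Y = A \land B$ if $C = 0$, can be checked by enumerating the eight input combinations. The short sketch below does this and builds a target distribution that is uniform over the valid rows. The bit ordering used for the decimal labels ($A$ as most significant bit, then $B$, $C$, $Y$) is an assumption made for illustration, since the caption does not state the ordering, and the names `maj`, `valid_states`, and `target` are hypothetical.

```python
from itertools import product


def maj(a, b, c):
    """Majority of three bits: 1 if at least two inputs are 1."""
    return int(a + b + c >= 2)


valid_states = []
for a, b, c in product((0, 1), repeat=3):
    y = maj(a, b, c)
    # Identity from the caption: OR when C = 1, AND when C = 0.
    assert y == ((a | b) if c else (a & b))
    # ASSUMED encoding: bits ordered (A, B, C, Y), A most significant.
    valid_states.append((a << 3) | (b << 2) | (c << 1) | y)

# Target distribution over all 16 four-bit states: uniform on the 8 valid rows.
target = [1.0 / len(valid_states) if s in valid_states else 0.0 for s in range(16)]
```

Under the assumed ordering, the valid rows carry the labels 0, 2, 4, 7, 8, 11, 13, and 15. A different bit ordering would permute the labels without changing the truth table itself.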