Multi-Mode Quantum Annealing for Variational Autoencoders with General Boltzmann Priors

Gilhan Kim, Daniel K. Park

Abstract

Variational autoencoders (VAEs) learn compact latent representations of complex data, but their generative capacity is fundamentally constrained by the choice of prior distribution over the latent space. Energy-based priors offer a principled way to move beyond factorized assumptions and capture structured interactions among latent variables, yet training such priors at scale requires accurate and efficient sampling from intractable distributions. Here we present Boltzmann-machine-prior VAEs (BM-VAEs) trained using quantum-annealing-based sampling in three distinct operational modes within a single generative system. During training, diabatic quantum annealing (DQA) provides unbiased Boltzmann samples for gradient estimation of the energy-based prior; for unconditional generation, slower quantum annealing (QA) concentrates samples near low-energy minima; and for conditional generation, bias fields are added to direct sampling toward attribute-specific regions of the energy landscape (c-QA). Using up to 2000 qubits on a D-Wave Advantage2 processor, we demonstrate stable and efficient training across multiple datasets, with faster convergence and lower reconstruction loss than a Gaussian-prior VAE. The learned Boltzmann prior enables unconditional generation by sampling directly from the energy-based latent distribution, a capability that plain autoencoders lack, and conditional generation through latent biasing that leverages the learned pairwise interactions.
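In code, the prior-training step described above reduces to the standard Boltzmann-machine moment-matching gradient, with annealer output standing in for MCMC in the negative phase. A minimal sketch, assuming an Ising energy $E(z) = -\sum_i h_i z_i - \sum_{i<j} J_{ij} z_i z_j$ and sample arrays with entries in $\{-1, +1\}$; names and shapes are illustrative, not from the paper's code:

```python
import numpy as np

def boltzmann_prior_grads(z_post, z_anneal):
    """Ascent direction for the prior parameters (h, J).

    z_post   : (N, K) array of posterior samples from the encoder
               (positive phase).
    z_anneal : (M, K) array of quantum-annealing (DQA) samples drawn
               from the current prior (negative phase).
    Entries of both arrays are in {-1, +1}.
    """
    m_pos = z_post.mean(axis=0)                    # <z_i> under the posterior
    c_pos = z_post.T @ z_post / len(z_post)        # <z_i z_j> under the posterior
    m_neg = z_anneal.mean(axis=0)                  # <z_i> under the prior
    c_neg = z_anneal.T @ z_anneal / len(z_anneal)  # <z_i z_j> under the prior
    grad_h = m_pos - m_neg                         # field update
    grad_J = np.triu(c_pos - c_neg, k=1)           # coupling update (i < j)
    return grad_h, grad_J
```

Gradient quality then hinges on how faithfully the negative-phase samples follow the Boltzmann distribution, which is what motivates using DQA rather than the slower annealing modes during training.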


Figures (6)

  • Figure 1: Schematic illustration of a variational autoencoder with a Boltzmann prior. The encoder maps an input $x$ to a logit vector $\mu$, whose components determine the Bernoulli probabilities in the approximate posterior $q_\phi(z|x)$ over binary latent variables $z \in \{\pm 1\}^K$. A latent sample $z$ drawn from this posterior is passed to the decoder to produce the reconstruction $\tilde{x}$. The Boltzmann prior $p_\psi(z)$ is trained to match the aggregated posterior $\bar{q}(z)=\mathbb{E}_x[q_\phi(z|x)]$, enabling generation by sampling from the learned energy-based latent distribution. (A minimal forward-pass sketch follows this list.)
  • Figure 2: Three quantum annealing modes applied to the same learned energy landscape. Blue (DQA): diabatic quantum annealing yields samples that approximately follow a Boltzmann distribution over the landscape and are used for gradient estimation during training. Red (QA): slower quantum annealing localizes samples near low-energy minima for unconditional generation. Green (c-QA): conditional quantum annealing with external bias fields steers sampling toward a specific low-energy region associated with a desired attribute. See Figs. 4 and 5 for generated samples. (A sampler-level sketch of the three modes follows this list.)
  • Figure 3: Training curves of BM-VAE and Gaussian-prior VAE (G-VAE) on MNIST (left), Fashion-MNIST (center), and CelebA (right). The vertical axis shows the binary cross-entropy reconstruction loss. Solid lines indicate the mean over 10 independent runs and shaded regions indicate one standard deviation, where run-to-run variability arises from random parameter initialization and the stochastic nature of quantum annealing samples. Both models use the same encoder-decoder architecture and latent dimensionality for each dataset, and differ only in the choice of latent prior.
  • Figure 4: Unconditional samples from the learned Boltzmann prior on CelebA ($128\times128$, $K=2000$ latent variables). Samples are generated on the D-Wave Advantage2 processor using QA (Mode 2), which localizes sampling near low-energy minima of the learned energy landscape. No additional denoising or post-processing is applied.
  • Figure 5: Conditional generation on CelebA using the attribute-average encoder output for Bangs. Row 1: the binarized encoder output $\mathrm{sign}(\boldsymbol{\mu})$ is decoded directly without quantum annealing, producing a single deterministic but visually rigid output. Row 2: c-QA (Mode 3) with the learned couplings $J$ and bias fields $h$ derived from $\boldsymbol{\mu}_{\mathrm{attr}}$. The pairwise interactions of the Boltzmann prior propagate the attribute bias across latent variables, yielding samples that are both diverse and visually consistent. (See the annealing-mode sketch after this list.)
  • ...and 1 more figure
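The architecture in Fig. 1 can be made concrete in a few lines. The sketch below is a minimal, illustrative forward pass, assuming single-layer encoder and decoder networks and MNIST-sized inputs; the paper's architectures are deeper, and training the discrete latents additionally requires a gradient estimator for the sampling step (e.g., a straight-through relaxation) that this sketch omits.

```python
import torch

class BMVAE(torch.nn.Module):
    """Minimal VAE with binary {-1, +1} latents and a Bernoulli
    posterior, mirroring the data flow of Fig. 1 (illustrative sizes)."""

    def __init__(self, x_dim=784, k=64):
        super().__init__()
        self.enc = torch.nn.Linear(x_dim, k)    # x -> logit vector mu
        self.dec = torch.nn.Linear(k, x_dim)    # z -> reconstruction logits

    def forward(self, x):
        mu = self.enc(x)                        # logits defining q_phi(z|x)
        p = torch.sigmoid(mu)                   # P(z_i = +1 | x)
        z = 2.0 * torch.bernoulli(p) - 1.0      # sample z in {-1, +1}^K
        x_tilde = torch.sigmoid(self.dec(z))    # reconstruction
        return x_tilde, mu, z
```

The binary cross-entropy between $x$ and $\tilde{x}$ gives the reconstruction term plotted in Fig. 3.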
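The three modes of Fig. 2 differ only in how the same learned Ising parameters $(h, J)$ are submitted to the annealer. Below is a hedged sketch using the D-Wave Ocean SDK; the toy problem size, the annealing times, and the bias strength lam are assumptions for illustration, not values reported in the paper.

```python
import numpy as np
from dwave.system import DWaveSampler, EmbeddingComposite

# Toy stand-ins for the learned Boltzmann-prior parameters. The paper
# scales to K = 2000 latent variables; K = 8 keeps the embedding trivial.
K = 8
rng = np.random.default_rng(0)
h = {i: float(rng.normal(scale=0.1)) for i in range(K)}
J = {(i, j): float(rng.normal(scale=0.1))
     for i in range(K) for j in range(i + 1, K)}
mu_attr = rng.normal(size=K)  # stand-in for the attribute-average encoder output
lam = 0.5                     # bias strength (assumed hyperparameter)

sampler = EmbeddingComposite(DWaveSampler())  # e.g., an Advantage2 QPU

# Mode 1 (DQA): fast anneal; samples approximately Boltzmann-distributed,
# used as the negative phase of the training gradient.
train_reads = sampler.sample_ising(h, J, num_reads=1000, annealing_time=1)

# Mode 2 (QA): slower anneal; samples concentrate near low-energy minima,
# used for unconditional generation (Fig. 4).
gen_reads = sampler.sample_ising(h, J, num_reads=100, annealing_time=100)

# Mode 3 (c-QA): add bias fields derived from mu_attr to steer sampling
# toward an attribute-specific low-energy region (Fig. 5).
h_cond = {i: h[i] + lam * float(mu_attr[i]) for i in range(K)}
cond_reads = sampler.sample_ising(h_cond, J, num_reads=100, annealing_time=100)
```

Note that all three calls reuse the same couplings $J$; only the schedule speed and the linear bias terms change, which is what lets a single trained prior serve training, unconditional generation, and conditional generation.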