Characterizing Trainability of Instantaneous Quantum Polynomial Circuit Born Machines

Kevin Shen, Susanne Pielawa, Vedran Dunjko, Hao Wang

TL;DR

This paper studies the trainability of IQP-QCBMs, a class of quantum generative models whose potential advantage rests on sampling hardness. It derives closed-form variances of the MMD gradient at initialization and introduces the critical rank $r^{\bm{a}}$ to predict barren plateaus (BPs). It analyzes four circuit architectures, shows that BPs can be mitigated by kernel choice (partial-spectrum trainability), and demonstrates that a sparse Erdős–Rényi architecture can be both trainable and classically hard to simulate. The authors further show that Gaussian initialization yields polynomial scaling of gradients across frequencies, broadening the set of viable architectures, and establish the existence of a trainable, non-dequantizable IQP-QCBM under anti-concentration. The results provide concrete design principles linking kernel spectra, circuit topology, and initialization to achievable training performance and potential quantum advantage.

Abstract

Instantaneous quantum polynomial quantum circuit Born machines (IQP-QCBMs) have been proposed as quantum generative models with a classically tractable training objective based on the maximum mean discrepancy (MMD) and a potential quantum advantage motivated by sampling-complexity arguments, making them an exciting model worth deeper investigation. While recent works have further proven the universality of a (slightly generalized) model, the next immediate question pertains to its trainability, i.e., whether it suffers from exponentially vanishing loss gradients, known as the barren plateau issue, which prevent effective use, and how regimes of trainability overlap with regimes of possible quantum advantage. Here, we make significant strides in these directions. To study the trainability at initialization, we analytically derive closed-form expressions for the variances of the partial derivatives of the MMD loss function and provide general upper and lower bounds. With uniform initialization, we show that barren plateaus depend on the generator set and the spectrum of the chosen kernel. We identify regimes in which low-weight-biased kernels avoid exponential gradient suppression in structured topologies. We also prove that a small-variance Gaussian initialization ensures polynomial scaling for the gradient under mild conditions. As for the potential quantum advantage, we further argue, based on previous complexity-theoretic arguments, that sparse IQP families can output a probability distribution family that is classically intractable, and that this distribution remains trainable at initialization at least at lower-weight frequencies.
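The variance analysis described in the abstract can be explored numerically on small instances. The sketch below is a minimal illustration, not the authors' method: it assumes the circuit $U(\bm{\theta}) = \prod_j \exp(i \theta_j X^{\bm{g}_j})$ acting on $|0\rangle^{\otimes n}$ (the paper's Definition 2.1 may use a different angle convention), for which the characteristic function value $\langle Z^{\bm{a}} \rangle$ has the closed form $\mathbb{E}_{\bm{z}}[\cos(2 \sum_{j: \bm{g}_j \cdot \bm{a}\ \mathrm{odd}} \theta_j (-1)^{\bm{g}_j \cdot \bm{z}})]$ with $\bm{z}$ uniform over $\mathbb{F}_2^n$. It then estimates the variance of one partial derivative by Monte Carlo under a uniform and a Gaussian initialization. The function names, the 4-qubit ring generator set, the Gaussian width 0.1, and the sample counts are all hypothetical choices made for illustration.

```python
import itertools
import numpy as np


def bitstrings(n):
    """All 2^n bitstrings as rows of an integer array."""
    return np.array(list(itertools.product([0, 1], repeat=n)), dtype=np.int64)


def _phase(theta, gens, a):
    """Phase 2 * sum_{j in A} theta_j (-1)^{g_j.z} for every z, where A is the
    set of generators with odd overlap g_j.a (those anticommuting with Z^a)."""
    Z = bitstrings(gens.shape[1])
    anti = (gens @ a) % 2 == 1
    signs = (-1.0) ** ((Z @ gens.T) % 2)  # signs[z, j] = (-1)^{g_j . z}
    return 2.0 * signs[:, anti] @ theta[anti], signs, anti


def char_value(theta, gens, a):
    """Closed-form <Z^a> under the assumed convention."""
    phase, _, _ = _phase(theta, gens, a)
    return np.cos(phase).mean()


def char_grad(theta, gens, a, j):
    """Partial derivative of <Z^a> with respect to theta_j."""
    phase, signs, anti = _phase(theta, gens, a)
    if not anti[j]:
        return 0.0  # theta_j does not enter <Z^a> at all
    return (-2.0 * signs[:, j] * np.sin(phase)).mean()


def output_distribution(theta, gens):
    """Brute-force q_theta(x): amplitudes 2^{-n} sum_z (-1)^{x.z} exp(i phi(z)),
    with phi(z) = sum_j theta_j (-1)^{g_j.z}. Exponential in n."""
    Z = bitstrings(gens.shape[1])
    phi = ((-1.0) ** ((Z @ gens.T) % 2)) @ theta
    hadamard = (-1.0) ** ((Z @ Z.T) % 2)
    amps = hadamard @ np.exp(1j * phi) / 2 ** gens.shape[1]
    return np.abs(amps) ** 2


def grad_variance(gens, a, j, sampler, n_samples, rng):
    """Monte Carlo estimate of Var[d<Z^a>/d theta_j] over random initializations."""
    m = gens.shape[0]
    grads = [char_grad(sampler(rng, m), gens, a, j) for _ in range(n_samples)]
    return float(np.var(grads))


# Hypothetical 4-qubit ring: single-qubit generators plus nearest-neighbour pairs.
n = 4
singles = np.eye(n, dtype=np.int64)
pairs = np.array([[1 if k in (i, (i + 1) % n) else 0 for k in range(n)]
                  for i in range(n)], dtype=np.int64)
gens = np.vstack([singles, pairs])
a = np.array([1, 0, 0, 0])  # frequency of Hamming weight 1


def uniform(rng, m):
    return rng.uniform(0.0, 2 * np.pi, m)


def gaussian(rng, m):
    return rng.normal(0.0, 0.1, m)


rng = np.random.default_rng(0)
print("uniform :", grad_variance(gens, a, 0, uniform, 500, rng))
print("gaussian:", grad_variance(gens, a, 0, gaussian, 500, rng))
```

Repeating the estimate for growing $n$, for higher-weight $\bm{a}$, or for denser generator sets gives an empirical view of how the variance scales; the brute-force `output_distribution` is only there to cross-check the closed form on small systems.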

Paper Structure (19 sections, 8 theorems, 55 equations, 1 figure, 2 tables)

Key Result

Proposition 2.2

Consider an IQP model and its output distribution $q_{\bm{\theta}}$ as defined in Definition 2.1. Denote by $\Lambda$ the Fourier transform of a stationary, bounded kernel function $k$ over $\mathbb{F}_2^n$. Given a target distribution $p$ over $\mathbb{F}_2^n$, the MMD loss between $p$ and $q_{\bm{\theta}}$ is expressed in terms of the characteristic function values $C^{\bm{a}}_p = \mathbb{E}_{{\bm{x}} \sim p}[(-1)^{{\bm{x}} \cdot {\bm{a}}}]$.
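To make the roles of the kernel spectrum and of $C^{\bm{a}}_p$ concrete, the sketch below checks numerically, for small $n$, that the kernel form of the squared MMD coincides with the spectral sum $\sum_{\bm{a}} \lambda_{\bm{a}} (C^{\bm{a}}_p - C^{\bm{a}}_q)^2$. This is a standard identity for stationary kernels on $\mathbb{F}_2^n$ and is not necessarily written with the same normalization as Proposition 2.2. The Gaussian kernel $k(\bm{x}, \bm{y}) = \exp(-d_H(\bm{x}, \bm{y}) / 2\sigma^2)$ with Hamming distance $d_H$, the convention $\lambda_{\bm{a}} = 2^{-n} \sum_{\bm{z}} k(\bm{z}, \bm{0}) (-1)^{\bm{a} \cdot \bm{z}}$, the bandwidth `sigma`, and all function names are assumptions made for illustration.

```python
import itertools
import numpy as np


def bitstrings(n):
    """All 2^n bitstrings as rows of an integer array."""
    return np.array(list(itertools.product([0, 1], repeat=n)), dtype=np.int64)


def parity_signs(n):
    """Matrix S with S[a, x] = (-1)^{a . x}."""
    X = bitstrings(n)
    return (-1.0) ** ((X @ X.T) % 2)


def kernel_spectrum(n, sigma):
    """Assumed convention: lambda_a = 2^{-n} sum_z kappa(z) (-1)^{a . z},
    where kappa(z) = k(z, 0) = exp(-|z| / (2 sigma^2))."""
    X = bitstrings(n)
    kappa = np.exp(-X.sum(axis=1) / (2 * sigma ** 2))
    return parity_signs(n) @ kappa / 2 ** n


def mmd2_kernel(p, q, n, sigma):
    """Squared MMD from the kernel matrix: (p - q)^T K (p - q)."""
    X = bitstrings(n)
    hamming = (X[:, None, :] != X[None, :, :]).sum(axis=2)
    K = np.exp(-hamming / (2 * sigma ** 2))
    d = p - q
    return float(d @ K @ d)


def mmd2_spectral(p, q, n, sigma):
    """Squared MMD as a spectrum-weighted sum over frequencies a."""
    S = parity_signs(n)
    C_p, C_q = S @ p, S @ q  # characteristic function values C^a
    return float(np.sum(kernel_spectrum(n, sigma) * (C_p - C_q) ** 2))


# Two random distributions over F_2^3 stand in for the target p and the model q.
rng = np.random.default_rng(0)
n, sigma = 3, 1.0
p = rng.dirichlet(np.ones(2 ** n))
q = rng.dirichlet(np.ones(2 ** n))
print(mmd2_kernel(p, q, n, sigma), mmd2_spectral(p, q, n, sigma))
```

Under this convention the weights $\lambda_{\bm{a}}$ are non-negative and sum to $k(\bm{0}, \bm{0}) = 1$, so the loss reads as an expectation over frequencies $\bm{a}$ drawn from the spectrum; a smaller `sigma` shifts weight toward higher-weight frequencies.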

Figures (1)

  • Figure 1: Illustration of the IQP architectures considered in this paper: product state, 2D lattice, sparse Erdős–Rényi graph, and complete graph.

Theorems & Definitions (25)

  • Definition 2.1: IQP-QCBM
  • Proposition 2.2: MMD loss in IQP-QCBM
  • Definition 2.3: Barren plateau
  • Theorem 3.2: Variance of characteristic function values and their partial derivatives
  • Proposition 3.4: Decomposition of MMD partial derivative variance and average-case lower bound
  • Proposition 3.5: Upper bound for MMD partial derivative variance
  • Definition 3.6: Partial-spectrum trainability at initialization
  • Remark 3.7: Partial-spectrum trainability implies the existence of a spectral density that avoids barren plateaus at initialization
  • Theorem 3.8: Characteristic function value partial derivative variance under uniform initialization
  • Example 3.9: Product state
  • ...and 15 more