Characterizing Trainability of Instantaneous Quantum Polynomial Circuit Born Machines

Kevin Shen; Susanne Pielawa; Vedran Dunjko; Hao Wang

TL;DR

This paper studies the trainability of IQP-QCBMs, a class of quantum generative models whose output distributions may be classically hard to sample from. It derives closed-form variances of the MMD loss gradient at initialization and introduces the critical rank $r^{\bm{a}}$ to predict barren plateaus (BPs). It analyzes four circuit architectures (product state, 2D lattice, sparse Erdős–Rényi graph, and complete graph), shows that BPs can be mitigated by kernel choice (partial-spectrum trainability), and argues that a sparse Erdős–Rényi architecture can be both trainable and classically hard to simulate. The authors further show that Gaussian initialization yields polynomial scaling of gradients across frequencies, broadening the set of viable architectures, and establish the existence of a trainable, non-dequantizable IQP-QCBM under an anti-concentration assumption. The results provide design principles linking kernel spectra, circuit topology, and initialization to achievable training performance and potential quantum advantage.

Abstract

Instantaneous quantum polynomial quantum circuit Born machines (IQP-QCBMs) have been proposed as quantum generative models with a classically tractable training objective based on the maximum mean discrepancy (MMD) and a potential quantum advantage motivated by sampling-complexity arguments, making them an exciting model worth deeper investigation. While recent works have further proven the universality of a (slightly generalized) model, the next immediate question pertains to its trainability, i.e., whether it suffers from exponentially vanishing loss gradients, known as the barren plateau issue, which would prevent effective use, and how regimes of trainability overlap with regimes of possible quantum advantage. Here, we make significant strides in these directions. To study trainability at initialization, we analytically derive closed-form expressions for the variances of the partial derivatives of the MMD loss function and provide general upper and lower bounds. With uniform initialization, we show that the presence of barren plateaus depends on the generator set and the spectrum of the chosen kernel. We identify regimes in which low-weight-biased kernels avoid exponential gradient suppression in structured topologies. We also prove that a small-variance Gaussian initialization ensures polynomial scaling of the gradient under mild conditions. As for the potential quantum advantage, we further argue, based on previous complexity-theoretic arguments, that sparse IQP families can output a family of probability distributions that is classically intractable, and that this family remains trainable at initialization, at least at lower-weight frequencies.
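To make the notion of gradient variance at initialization concrete, the following is a minimal brute-force sketch, not the paper's method. It assumes an IQP circuit of the form $H^{\otimes n} \exp(i \sum_j \theta_j Z_{g_j}) H^{\otimes n}$ applied to $|0\rangle^{\otimes n}$, with each generator $g_j$ encoded as an integer bitmask; this parameter convention and all function names (`iqp_distribution`, `char_value`, `grad_char_value`, `grad_variance`) are illustrative assumptions. It simulates the output distribution exactly for a few qubits, evaluates the characteristic function value $C^{\bm{a}}_{q_{\bm{\theta}}} = \mathbb{E}_{{\bm{x}} \sim q_{\bm{\theta}}}[(-1)^{{\bm{x}} \cdot {\bm{a}}}]$, and estimates the variance of its partial derivative by Monte Carlo sampling of the initial parameters.

```python
import numpy as np

def parity_sign(mask, n):
    """Vector of (-1)^(mask . z) over all z in {0, ..., 2^n - 1}."""
    z = np.arange(2 ** n)
    bits = z & mask
    par = np.zeros_like(z)
    for k in range(n):
        par ^= (bits >> k) & 1
    return 1 - 2 * par

def walsh_hadamard(v):
    """Unnormalized Walsh-Hadamard transform of a length-2^n vector."""
    v = v.copy()
    h = 1
    while h < len(v):
        for i in range(0, len(v), 2 * h):
            a = v[i:i + h].copy()
            b = v[i + h:i + 2 * h].copy()
            v[i:i + h] = a + b
            v[i + h:i + 2 * h] = a - b
        h *= 2
    return v

def iqp_distribution(thetas, generators, n):
    """Output probabilities of H^n exp(i sum_j theta_j Z_{g_j}) H^n |0...0>."""
    phase = np.zeros(2 ** n)
    for theta, g in zip(thetas, generators):
        phase += theta * parity_sign(g, n)
    amp = np.exp(1j * phase) / np.sqrt(2 ** n)   # first Hadamard layer, then diagonal gates
    amp = walsh_hadamard(amp) / np.sqrt(2 ** n)  # final Hadamard layer
    return np.abs(amp) ** 2

def char_value(q, a, n):
    """Characteristic function value C^a = E_{x ~ q}[(-1)^(x . a)]."""
    return float(np.dot(q, parity_sign(a, n)))

def grad_char_value(thetas, j, a, generators, n):
    """dC^a/dtheta_j via a pi/4 parameter shift (exact for the convention above)."""
    shift = np.zeros_like(thetas)
    shift[j] = np.pi / 4
    plus = char_value(iqp_distribution(thetas + shift, generators, n), a, n)
    minus = char_value(iqp_distribution(thetas - shift, generators, n), a, n)
    return plus - minus

def grad_variance(a, j, generators, n, sampler, num_samples):
    """Monte Carlo estimate of Var[dC^a/dtheta_j] over initial parameters."""
    grads = [grad_char_value(sampler(), j, a, generators, n)
             for _ in range(num_samples)]
    return float(np.var(grads))

# Product-state architecture: one single-qubit generator per qubit.
n = 4
generators = [1 << i for i in range(n)]
rng = np.random.default_rng(0)
uniform = lambda: rng.uniform(0.0, 2 * np.pi, size=n)
for a in (0b0001, 0b0111, 0b1111):
    print(bin(a), grad_variance(a, 0, generators, n, uniform, 500))
```

For this product-state case, under the assumed convention, $C^{\bm{a}}_{q_{\bm{\theta}}} = \prod_{i \in {\bm{a}}} \cos(2\theta_i)$, so the variance of the partial derivative under uniform initialization is $4 \cdot 2^{-|{\bm{a}}|}$: it decays exponentially in the weight of the frequency ${\bm{a}}$ but stays constant for fixed low weight. Swapping `uniform` for a small-variance Gaussian sampler allows an empirical comparison of the two initializations.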

Paper Structure (19 sections, 8 theorems, 55 equations, 1 figure, 2 tables)

Introduction
Background and related work
Main results
Closed-form expression of MMD partial derivatives
Importance of kernel selection
Presence and absence of barren plateaus under uniform initialization
Absence of barren plateaus under Gaussian initialization
Classical non-simulability and trainability
Discussion
Conclusions
Acknowledgements
Proof of Theorems
Proof of \ref{thm: single symmetric}
Proof of \ref{prop:lower}
Proof of \ref{prop:upper}
...and 4 more sections

Key Result

Proposition 2.2

Consider an IQP model and its output distribution $q_{\bm{\theta}}$ as defined in Definition 2.1. Denote by $\Lambda$ the Fourier transform of a stationary, bounded kernel function $k$ over $\mathbb{F}_2^n$. Given a target distribution $p$ over $\mathbb{F}_2^n$, the MMD loss between $p$ and $q_{\bm{\theta}}$ can be written in terms of $\Lambda$ and the characteristic function values $C^{\bm{a}}_p = \mathbb{E}_{{\bm{x}} \sim p}[(-1)^{{\bm{x}} \cdot {\bm{a}}}]$ and $C^{\bm{a}}_{q_{\bm{\theta}}}$, the latter defined analogously with ${\bm{x}} \sim q_{\bm{\theta}}$.
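The exact expression of the proposition is not reproduced here. As an illustration of the kind of identity involved, the sketch below numerically checks a standard Fourier form of the squared MMD for a stationary kernel on $\mathbb{F}_2^n$, namely $\sum_{\bm{a}} \Lambda({\bm{a}}) (C^{\bm{a}}_p - C^{\bm{a}}_q)^2$, against the direct definition $\sum_{{\bm{x}},{\bm{y}}} (p-q)({\bm{x}})\, k({\bm{x}},{\bm{y}})\, (p-q)({\bm{y}})$. The Gaussian kernel on Hamming distance, the normalization $\Lambda({\bm{a}}) = 2^{-n} \sum_{\bm{z}} k({\bm{0}},{\bm{z}}) (-1)^{{\bm{a}} \cdot {\bm{z}}}$, and all function names are assumptions made for this example; the paper's conventions may differ by constant factors.

```python
import numpy as np

def popcount(v, n):
    """Number of set bits among the lowest n bits of each integer in v."""
    return sum((v >> k) & 1 for k in range(n))

def sign_matrix(n):
    """S[a, x] = (-1)^(a . x) for all a, x in F_2^n, encoded as integers."""
    idx = np.arange(2 ** n)
    return 1 - 2 * (popcount(idx[:, None] & idx[None, :], n) % 2)

def gaussian_kernel(n, sigma):
    """K[x, y] = exp(-|x XOR y| / (2 sigma^2)): stationary and bounded."""
    idx = np.arange(2 ** n)
    return np.exp(-popcount(idx[:, None] ^ idx[None, :], n) / (2 * sigma ** 2))

def kernel_spectrum(K, n):
    """Lambda(a) = 2^-n sum_z k(0, z) (-1)^(a . z) (assumed normalization)."""
    return sign_matrix(n) @ K[0] / 2 ** n

def mmd_direct(p, q, K):
    """Squared MMD from its definition as a quadratic form in p - q."""
    d = p - q
    return float(d @ K @ d)

def mmd_fourier(p, q, K, n):
    """Squared MMD as a Lambda-weighted sum of squared characteristic gaps."""
    S = sign_matrix(n)
    lam = kernel_spectrum(K, n)
    return float(np.sum(lam * (S @ p - S @ q) ** 2))

# Two arbitrary distributions on 3 bits (random, for illustration only).
n, sigma = 3, 1.0
rng = np.random.default_rng(0)
p = rng.dirichlet(np.ones(2 ** n))
q = rng.dirichlet(np.ones(2 ** n))
K = gaussian_kernel(n, sigma)
print(mmd_direct(p, q, K), mmd_fourier(p, q, K, n))
```

For this particular kernel the spectrum factorizes as $\Lambda({\bm{a}}) = \tau^{|{\bm{a}}|} (1-\tau)^{n-|{\bm{a}}|}$ with $\tau = (1 - e^{-1/(2\sigma^2)})/2$, so a larger bandwidth $\sigma$ concentrates the weight on low-weight frequencies ${\bm{a}}$.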

Figures (1)

Figure 1: Illustration of the IQP architectures considered in this paper: product state (\ref{eg:product}), 2D lattice (\ref{eg:lattice}), sparse Erdős–Rényi graph (\ref{eg:sparse}), and complete graph (\ref{eg:complete}).

Theorems & Definitions (25)

Definition 2.1: IQP-QCBM
Proposition 2.2: MMD loss in IQP-QCBM
Definition 2.3: Barren plateau
Theorem 3.2: Variance of characteristic function values and their partial derivatives
Proposition 3.4: Decomposition of MMD partial derivative variance and average-case lower bound
Proposition 3.5: Upper bound for MMD partial derivative variance
Definition 3.6: Partial-spectrum trainability at initialization
Remark 3.7: Partial-spectrum trainability implies the existence of a spectral density that avoids barren plateaus at initialization
Theorem 3.8: Characteristic function value partial derivative variance under uniform initialization
Example 3.9: Product state
...and 15 more
