Characterizing Trainability of Instantaneous Quantum Polynomial Circuit Born Machines
Kevin Shen, Susanne Pielawa, Vedran Dunjko, Hao Wang
TL;DR
This paper studies the trainability of IQP-QCBMs, a class of quantum generative models with potential sampling-hardness advantages. It derives closed-form expressions for the variance of the MMD gradient at initialization and introduces the critical rank $r^{\bm{a}}$ to predict barren plateaus (BPs). It analyzes four circuit architectures, shows that BPs can be mitigated by kernel choice (partial-spectrum trainability), and demonstrates that a sparse Erdős–Rényi architecture can be both trainable and classically hard to simulate. The authors further show that Gaussian initialization yields polynomial scaling of gradients across frequencies, broadening the set of viable architectures, and establish the existence of a trainable, non-dequantizable IQP-QCBM under anti-concentration. The results provide concrete design principles linking kernel spectra, circuit topology, and initialization to achievable training performance and potential quantum advantage.
Abstract
Instantaneous quantum polynomial quantum circuit Born machines (IQP-QCBMs) have been proposed as quantum generative models with a classically tractable training objective based on the maximum mean discrepancy (MMD) and a potential quantum advantage motivated by sampling-complexity arguments, making them an exciting model worth deeper investigation. While recent works have proven the universality of a (slightly generalized) model, the next immediate question concerns its trainability: whether it suffers from exponentially vanishing loss gradients, known as the barren plateau issue, which would prevent effective use, and how regimes of trainability overlap with regimes of possible quantum advantage. Here, we make significant strides in these directions. To study trainability at initialization, we analytically derive closed-form expressions for the variances of the partial derivatives of the MMD loss function and provide general upper and lower bounds. With uniform initialization, we show that the occurrence of barren plateaus depends on the generator set and the spectrum of the chosen kernel. We identify regimes in which low-weight-biased kernels avoid exponential gradient suppression in structured topologies. We also prove that a small-variance Gaussian initialization ensures polynomial scaling of the gradient under mild conditions. As for the potential quantum advantage, we further argue, based on previous complexity-theoretic arguments, that sparse IQP families can output a family of probability distributions that is classically intractable, and that this family remains trainable at initialization, at least at lower-weight frequencies.
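For readers unfamiliar with the MMD objective, the following is a minimal sketch of a sample-based estimate of the squared MMD between two sets of bitstrings. It is illustrative only: the Gaussian kernel, the bandwidth parameter `sigma`, the biased (V-statistic) estimator, and the function names `gaussian_kernel` and `mmd2` are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel matrix between two sets of bitstrings.

    x has shape (n, d) and y has shape (m, d), with entries in {0, 1}.
    On bitstrings the squared Euclidean distance equals the Hamming
    distance, so we count differing positions directly.
    """
    hamming = (x[:, None, :] != y[None, :, :]).sum(axis=-1)
    return np.exp(-hamming / (2.0 * sigma**2))


def mmd2(x, y, sigma=1.0):
    """Biased (V-statistic) estimate of the squared MMD.

    Assumed estimator: mean k(x, x') + mean k(y, y') - 2 * mean k(x, y),
    where each mean runs over all pairs, including identical ones.
    """
    kxx = gaussian_kernel(x, x, sigma).mean()
    kyy = gaussian_kernel(y, y, sigma).mean()
    kxy = gaussian_kernel(x, y, sigma).mean()
    return kxx + kyy - 2.0 * kxy


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    model_samples = rng.integers(0, 2, size=(200, 4))   # stand-in for model output
    target_samples = rng.integers(0, 2, size=(200, 4))  # stand-in for training data
    print(mmd2(model_samples, target_samples, sigma=1.0))
```

Under these assumptions, identical sample sets give an estimate of zero, and comparing the single bitstrings 00 and 11 with `sigma = 1` gives $2 - 2e^{-1}$. The bandwidth controls which Hamming distances the kernel can resolve.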
