Universality and kernel-adaptive training for classically trained, quantum-deployed generative models
Andrii Kurkin, Kevin Shen, Susanne Pielawa, Hao Wang, Vedran Dunjko
TL;DR
The paper analyzes IQP-QCBMs: the baseline architecture is not universal, and the authors prove that it becomes universal when hidden qubits are added, with exact universality at m = n+1 hidden qubits. It then introduces kernel-adaptive training, which goes beyond a fixed Gaussian MMD by using a learnable spectral measure G_γ, and links convergence of the MMD to convergence in distribution. The authors provide a Bochner-based generalized MMD framework and establish consistency results, while also outlining inherent limitations of MMD and demonstrating practical gains on synthetic parity-check datasets. Empirically, kernel-adaptive training outperforms fixed Gaussian kernels in total variation distance, with the gap growing with dimensionality, which suggests that these models could yield scalable insights into quantum versus classical generative capacity even without fault-tolerant quantum hardware.
Abstract
The instantaneous quantum polynomial (IQP) quantum circuit Born machine (QCBM) has been proposed as a promising quantum generative model over bitstrings. Recent works have shown that training an IQP-QCBM is classically tractable with respect to the Gaussian kernel maximum mean discrepancy (MMD) loss function, while sampling from the model retains the potential for a quantum advantage. Nonetheless, the model has two aspects where improvements would be important for more general utility: (1) the basic model is known not to be universal, i.e., it cannot represent arbitrary distributions, and it was not known whether universality can be achieved by adding hidden (ancillary) qubits; (2) the fixed Gaussian kernel used in the MMD loss can cause training issues, e.g., vanishing gradients. In this paper, we resolve the first issue and make decisive strides on the second. We prove that for an $n$-qubit IQP generator, adding $n + 1$ hidden qubits makes the model universal. For the second issue, we propose a kernel-adaptive training method in which the kernel is trained adversarially. We show that, in the kernel-adaptive method, convergence of the MMD value implies weak convergence in distribution of the generator. We also analyze the limitations of MMD-based training analytically. Finally, we verify the performance benefits on a dataset crafted to spotlight the improvements offered by the proposed method. The results show that kernel-adaptive training outperforms a fixed Gaussian kernel in total variation distance, and that the gap increases with the dataset dimensionality. These modifications and analyses shed light on the limits and potential of these new quantum generative methods, which could offer the first truly scalable insights into the comparative capacities of classical versus quantum models, even without access to scalable quantum computers.
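To make the two quantities named above concrete, the following is a minimal sketch of a sample-based squared MMD with a fixed Gaussian kernel and of the empirical total variation distance, both over bitstrings. It is only an illustration, not the training procedure of the paper: the function names, the bandwidth parameter `sigma`, the biased (V-statistic) estimator, and the random stand-in samples are all assumptions introduced here. For bitstrings, the squared Euclidean distance equals the Hamming distance, which the kernel below relies on.

```python
import numpy as np

def gaussian_kernel(x, y, sigma):
    """Gaussian kernel matrix exp(-||x - y||^2 / (2 sigma^2)) between bitstring sets.

    x, y: integer arrays of shape (N, n) and (M, n) with entries in {0, 1}.
    For bits, ||x - y||^2 is the Hamming distance.
    """
    hamming = np.sum(x[:, None, :] != y[None, :, :], axis=-1)
    return np.exp(-hamming / (2.0 * sigma**2))

def mmd2(x, y, sigma=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two sample sets."""
    kxx = gaussian_kernel(x, x, sigma).mean()
    kyy = gaussian_kernel(y, y, sigma).mean()
    kxy = gaussian_kernel(x, y, sigma).mean()
    return kxx + kyy - 2.0 * kxy

def tv_distance(x, y):
    """Empirical total variation distance between two sets of bitstring samples."""
    n = x.shape[1]
    weights = 1 << np.arange(n)  # encode each bitstring as an integer index
    px = np.bincount(x @ weights, minlength=2**n) / len(x)
    py = np.bincount(y @ weights, minlength=2**n) / len(y)
    return 0.5 * np.abs(px - py).sum()

# Hypothetical stand-ins for data samples and generator samples.
rng = np.random.default_rng(0)
data = rng.integers(0, 2, size=(200, 4))
model = rng.integers(0, 2, size=(200, 4))
print(mmd2(data, model, sigma=1.0), tv_distance(data, model))
```

The histogram in `tv_distance` has $2^n$ entries, so this form is only practical for small $n$. The bandwidth is held fixed here, which corresponds to the fixed-kernel baseline; the adversarially trained kernel of the kernel-adaptive method is not reproduced in this sketch.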
