Universality and kernel-adaptive training for classically trained, quantum-deployed generative models

Andrii Kurkin, Kevin Shen, Susanne Pielawa, Hao Wang, Vedran Dunjko

TL;DR

The paper analyzes IQP-QCBMs. It notes that the baseline architecture is not universal and proves that adding hidden qubits makes it universal, with exact universality at m = n+1 hidden qubits. It then introduces kernel-adaptive training, which extends the fixed Gaussian-kernel MMD loss by using a learnable spectral measure G_γ, and links convergence of the MMD value to convergence in distribution. The authors give a Bochner-based generalized MMD framework with consistency results, analyze inherent limitations of MMD-based training, and test the method on synthetic parity-check datasets. Empirically, kernel-adaptive training achieves lower total variation distance than fixed Gaussian kernels, and the gap grows with dimensionality. The authors argue that these models could yield scalable insights into the comparative capacities of classical and quantum models, even without access to scalable quantum computers.

Abstract

The instantaneous quantum polynomial (IQP) quantum circuit Born machine (QCBM) has been proposed as a promising quantum generative model over bitstrings. Recent works have shown that the training of IQP-QCBM is classically tractable w.r.t. the so-called Gaussian kernel maximum mean discrepancy (MMD) loss function, while maintaining the potential of a quantum advantage for sampling itself. Nonetheless, the model has a number of aspects where improvements would be important for more general utility: (1) the basic model is known not to be universal, i.e., it is not capable of representing arbitrary distributions, and it was not known whether it is possible to achieve universality by adding hidden (ancillary) qubits; (2) a fixed Gaussian kernel used in the MMD loss can cause training issues, e.g., vanishing gradients. In this paper, we resolve the first question and make decisive strides on the second. We prove that for an $n$-qubit IQP generator, adding $n + 1$ hidden qubits makes the model universal. For the latter, we propose a kernel-adaptive training method, where the kernel is adversarially trained. We show that in the kernel-adaptive method, the convergence of the MMD value implies weak convergence in distribution of the generator. We also analyze the limitations of the MMD-based training method analytically. Finally, we verify the performance benefits on a dataset crafted to spotlight improvements by the suggested method. The results show that kernel-adaptive training outperforms a fixed Gaussian kernel in total variation distance, and the gap increases with the dataset dimensionality. These modifications and analyses shed light on the limits and potential of these new quantum generative methods, which could offer the first truly scalable insights into the comparative capacities of classical versus quantum models, even without access to scalable quantum computers.
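
As a rough illustration of the Gaussian-kernel MMD loss mentioned in the abstract, and of why a fixed bandwidth can yield a vanishing training signal, the following Python sketch computes the squared MMD between two explicit distributions over bitstrings in two ways: directly from the kernel matrix, and as a weighted sum of squared differences of parity expectation values. It assumes the convention k(x, y) = exp(-d_H(x, y) / (2σ²)), with d_H the Hamming distance. The function names and the toy example (uniform over even-parity strings versus uniform over all strings) are illustrative assumptions and are not taken from the paper.

```python
import itertools
import numpy as np


def all_bitstrings(n):
    """All 2**n bitstrings as rows, in lexicographic order."""
    return np.array(list(itertools.product([0, 1], repeat=n)), dtype=np.int64)


def mmd2_kernel(p, q, X, sigma):
    """Squared MMD computed directly from the Gaussian kernel matrix."""
    # For bitstrings, squared Euclidean distance equals Hamming distance.
    hamming = (X[:, None, :] != X[None, :, :]).sum(axis=-1)
    K = np.exp(-hamming / (2.0 * sigma**2))
    d = p - q
    return float(d @ K @ d)


def mmd2_parity(p, q, X, sigma):
    """Same quantity as a weighted sum over parity expectation differences.

    Each bit of the parity mask a is drawn independently with probability
    flip = (1 - exp(-1 / (2 sigma^2))) / 2 (assumed kernel convention above).
    """
    n = X.shape[1]
    flip = (1.0 - np.exp(-1.0 / (2.0 * sigma**2))) / 2.0
    signs = 1 - 2 * ((X @ X.T) % 2)      # signs[a, x] = (-1)^(a . x)
    z_diff = signs @ (p - q)             # <Z_a>_p - <Z_a>_q for every mask a
    weight = X.sum(axis=1)               # Hamming weight of each mask a
    prob_a = flip**weight * (1.0 - flip) ** (n - weight)
    return float(prob_a @ z_diff**2)


# Toy example (illustrative): even-parity strings versus the uniform distribution.
n, sigma = 4, 1.0
X = all_bitstrings(n)
uniform = np.full(2**n, 1.0 / 2**n)
even = (X.sum(axis=1) % 2 == 0).astype(float)
even_parity = even / even.sum()

print(mmd2_kernel(even_parity, uniform, X, sigma))
print(mmd2_parity(even_parity, uniform, X, sigma))
```

In this toy case the two distributions differ only in the parity over all n bits, so the squared MMD equals `flip**n` and shrinks exponentially with n for any fixed bandwidth.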

Paper Structure

This paper contains 25 sections, 15 theorems, 82 equations, 4 figures.

Key Result

Lemma 1

Given a parameterised IQP circuit $q_\theta$, an expectation value $\langle Z_\alpha \rangle_{q_\theta}$, and an error $\varepsilon\in\mathcal{O}(\mathrm{poly}(n^{-1}))$, there exists a classical algorithm that requires $\mathrm{poly}(n)$ time, and samples a random variable with standard deviation l

Figures (4)

  • Figure 1: Schematic diagram of IQP-QCBM with hidden qubits.
  • Figure 2: Example to show the expressivity enhancement via adding hidden/ancilla qubits for target distribution vector $p = \left(\frac{1}{3}, \frac{1}{3}, \frac{1}{3}, 0\right)$ over $\{0,1\}^2$. With just one hidden qubit, the total variation distance (TVD) between the generator and the target is greatly reduced over the training iterations.
  • Figure 3: Total variation distance between generator distribution and ground truth distribution, when trained with different kernels (mean $\pm$ standard deviation over $5$ runs) for 12-, 14-, and 16-bit synthetic parity-check datasets. Lowest achieved values are reported in the table. Bold indicates the best (minimal) TVD with statistical significance level of $0.01$.
  • Figure 4: Learning curves (mean $\pm$ standard deviation over $5$ runs) on synthetic parity-check datasets of dimension $12$, $14$, and $16$. The first row shows the total variation distance computed between the true generated distribution and ground truth data distribution (same as Figure 2 in the main text). The second row shows the total variation distance computed between the empirical distribution of a batch of $1000$ generated data and the empirical distribution of the $2000$ training data. The third row shows the empirical MMD with respect to the $0$-bandwidth Gaussian kernel, based on $1000$ generated data and $2000$ training data.
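
The total variation distance (TVD) used in these figures can be sketched in a few lines of Python. The snippet below is a minimal illustration, not the authors' evaluation code: it computes the TVD between two explicit probability vectors and between empirical distributions built from samples. The helper names and the lexicographic ordering of outcomes (00, 01, 10, 11) for the Figure 2 target vector are assumptions made here.

```python
import numpy as np


def tvd(p, q):
    """Total variation distance: half the L1 distance between two probability vectors."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return 0.5 * float(np.abs(p - q).sum())


def empirical_distribution(samples, n_bits):
    """Relative frequencies of bitstring samples (rows), indexed lexicographically."""
    samples = np.asarray(samples, dtype=np.int64)
    place_values = 2 ** np.arange(n_bits - 1, -1, -1)  # most significant bit first
    indices = samples @ place_values
    counts = np.bincount(indices, minlength=2**n_bits)
    return counts / len(samples)


# Target vector from the Figure 2 caption, outcome order assumed to be 00, 01, 10, 11.
target = np.array([1 / 3, 1 / 3, 1 / 3, 0.0])
uniform = np.full(4, 0.25)
print(tvd(target, uniform))  # approximately 0.25

# Empirical TVD from a small, hand-written batch of samples (illustrative only).
batch = [[0, 0], [0, 1], [1, 0]]
print(tvd(empirical_distribution(batch, n_bits=2), target))
```

The second call mirrors the sample-based comparison described for the second row of Figure 4, where both distributions are estimated from finite batches rather than known exactly.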

Theorems & Definitions (35)

  • Definition 1: Quantum Circuit Born Machine (QCBM)
  • Definition 2: Parametrized instantaneous quantum polynomial (IQP) circuit
  • Definition 3: Universality of generative models
  • Definition 4: Maximum Mean Discrepancy (MMD)
  • Lemma 1
  • Definition 5: IQP-QCBM with hidden qubits
  • Lemma 2: Asymptotic approximate universality with $\pi$-phases
  • Theorem 1: Exact universality
  • Proof: Proof sketch
  • Remark 1
  • ...and 25 more