Train on classical, deploy on quantum: scaling generative quantum machine learning to a thousand qubits

Erik Recio-Armengol, Shahnawaz Ahmed, Joseph Bowles

TL;DR

This work tackles the scalability bottleneck of variational quantum machine learning by proposing a training paradigm for generative models based on parameterised IQP circuits. Training is performed classically: the squared maximum mean discrepancy $\text{MMD}^2(p, q_{\boldsymbol{\theta}})$ is rewritten as a mixture of Pauli-$Z$ expectation values, which can be estimated efficiently on classical hardware for this circuit class, and is optimised using automatic differentiation. This allows circuits with up to a thousand qubits to be trained. Sampling from the trained model, by contrast, is widely believed to be classically hard, so a quantum advantage is possible when the model is deployed on hardware. Through experiments on six binary datasets, the authors show that the IQP-based models can learn high-dimensional distributions and are competitive with, and in some cases outperform, classical baselines, attributing the gains to the role of coherence and to carefully designed initialisation and symmetry strategies. Overall, the paper demonstrates a viable path to scalable quantum generative learning that can be explored today at large scales.

Abstract

We propose an approach to generative quantum machine learning that overcomes the fundamental scaling issues of variational quantum circuits. The core idea is to use a class of generative models based on instantaneous quantum polynomial circuits, which we show can be trained efficiently on classical hardware. Although training is classically efficient, sampling from these circuits is widely believed to be classically hard, and so computational advantages are possible when sampling from the trained model on quantum hardware. By combining our approach with a data-dependent parameter initialisation strategy, we do not encounter issues of barren plateaus and successfully circumvent the poor scaling of gradient estimation that plagues traditional approaches to quantum circuit optimisation. We investigate and evaluate our approach on a number of real and synthetic datasets, training models with up to one thousand qubits and hundreds of thousands of parameters. We find that the quantum models can successfully learn from high dimensional data, and perform surprisingly well compared to simple energy-based classical generative models trained with a similar amount of hyperparameter optimisation. Overall, our work demonstrates that a path to scalable quantum generative machine learning exists and can be investigated today at large scales.

Paper Structure

This paper contains 49 sections, 3 theorems, 84 equations, 13 figures, 4 tables.

Key Result

Proposition 1

Given a parameterised IQP circuit $q_{\boldsymbol{\theta}}$, an expectation value $\langle Z_{\boldsymbol{a}} \rangle_{q_{\boldsymbol{\theta}}}$ and an error $\epsilon=\text{poly}(n^{-1})$, there exists a classical algorithm that requires poly$(n)$ time and space, and samples a random variable with

Figures (13)

  • Figure 1: The method we use to train our quantum generative models. One first estimates a batch of expectation values $\{\langle Z_{\boldsymbol{a}_i}\rangle\}$ of Pauli-Z words evaluated on the output distribution $q_{\boldsymbol{\theta}}(\boldsymbol{x})$ of the quantum circuit. The class of circuits we use (parameterised IQP circuits) admits an efficient classical algorithm for this task, which is therefore performed on classical hardware. This information is combined with a dataset sampled from a ground truth distribution $p$ to provide an unbiased estimate of the squared maximum mean discrepancy between $p$ and $q_{\boldsymbol{\theta}}(\boldsymbol{x})$. We use automatic differentiation to obtain estimates of gradients to train the circuit. Once the circuit is trained, the trained parameters $\boldsymbol{\theta}^*$ can be deployed on quantum hardware to generate samples. Since IQP circuits are believed to be hard to sample from classically, computational advantages are possible at this stage.
  • Figure 2: Parameterised IQP circuits consist of parameterised rotation gates whose generators are tensor products of Pauli-X operators.
  • Figure 3: Training loss plots from training the parameterised IQP model on each of the six datasets.
  • Figure 4: (left) The squared maximum mean discrepancy evaluated on a test set for each of the models for the 2D Ising data. Error bars denote one standard deviation. (right) The cumulative probability distribution returned by the KGEL test.
  • Figure 5: Distributions computed through kernel density estimation (blue curves) of the Ising energy (top) and magnetisation (bottom) for the true distribution and each of the trained models.
  • ...and 8 more figures
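
The Figure 1 caption describes combining Pauli-Z word expectation values with data to estimate the squared MMD. The sketch below illustrates why that works, under assumptions not stated on this page: a Gaussian kernel on bitstrings with bandwidth `sigma`, and Z words drawn with an independent per-bit probability `p_sigma = (1 - exp(-1/(2 sigma^2))) / 2`. It enumerates all Z words, so it only runs for small `n`, and it uses the plug-in (biased) estimate rather than the unbiased estimator the paper refers to. All function names are hypothetical.

```python
import itertools
import math

def gaussian_kernel(x, y, sigma):
    """Gaussian kernel on bitstrings; the squared distance is the Hamming distance."""
    hamming = sum(xi != yi for xi, yi in zip(x, y))
    return math.exp(-hamming / (2 * sigma**2))

def flip_probability(sigma):
    """Assumed per-bit probability that a sampled Z word acts on a given bit."""
    return (1 - math.exp(-1 / (2 * sigma**2))) / 2

def z_word_value(a, x):
    """Value (-1)^(a.x) of the Z word labelled by bitstring a on bitstring x."""
    return -1 if sum(ai & xi for ai, xi in zip(a, x)) % 2 else 1

def z_word_weights(n, sigma):
    """Yield every Z-word label a together with its mixture weight."""
    p = flip_probability(sigma)
    for a in itertools.product((0, 1), repeat=n):
        yield a, math.prod(p if ai else 1 - p for ai in a)

def kernel_from_z_words(x, y, sigma):
    """Rebuild the kernel as a weighted sum over all 2^n Z words."""
    return sum(w * z_word_value(a, x) * z_word_value(a, y)
               for a, w in z_word_weights(len(x), sigma))

def mean_z_word(a, samples):
    """Empirical expectation value of the Z word a over a list of bitstrings."""
    return sum(z_word_value(a, x) for x in samples) / len(samples)

def mmd2_from_z_words(p_samples, q_samples, sigma):
    """Plug-in MMD^2 as a mixture of squared differences of Z-word expectations."""
    n = len(p_samples[0])
    return sum(w * (mean_z_word(a, p_samples) - mean_z_word(a, q_samples)) ** 2
               for a, w in z_word_weights(n, sigma))

def mmd2_from_kernel(p_samples, q_samples, sigma):
    """Plug-in MMD^2 computed directly from kernel averages, for comparison."""
    def avg(xs, ys):
        return sum(gaussian_kernel(x, y, sigma) for x in xs for y in ys) / (len(xs) * len(ys))
    return avg(p_samples, p_samples) + avg(q_samples, q_samples) - 2 * avg(p_samples, q_samples)

if __name__ == "__main__":
    data = [(0, 0, 1), (1, 0, 1), (1, 1, 0)]
    model = [(0, 1, 1), (0, 0, 0)]
    print(mmd2_from_z_words(data, model, sigma=0.8))
    print(mmd2_from_kernel(data, model, sigma=0.8))
```

The two printed values agree up to floating-point error. In a scalable setting one would sample a batch of Z words from the mixture instead of enumerating all of them.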

Theorems & Definitions (11)

  • Definition 3.1: parameterised IQP circuit
  • Definition 3.2: MMD loss function
  • Definition 3.3: median heuristic
  • Proposition 1: nest2010simulating
  • Proposition 2: rudolph2024trainability
  • Proposition 3: Unbiased estimates of the MMD$^2$
  • Definition 5.1: Stochastic bitflip circuit
  • Definition 7.1: log likelihood of a test set
  • Definition 7.2: MMD$^2$ with respect to a test set
  • Definition 7.3: Kernel Generalized Empirical Likelihood
  • ...and 1 more
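
To make the objects in Definition 3.1 and Proposition 1 concrete, here is a minimal sketch of how a Pauli-Z word expectation value of an IQP circuit can be written as an average of cosines over uniformly random bitstrings. It assumes gates of the form $\exp(i\theta_j X_{\boldsymbol{g}_j})$ applied to $|0\rangle^{\otimes n}$ and measured in the computational basis; this convention, the function names, and the exact form of the estimator are assumptions of the sketch, not taken from the statement of the proposition.

```python
import itertools
import math
import random

def parity(u, v):
    """Inner product of two bitstrings modulo 2."""
    return sum(ui & vi for ui, vi in zip(u, v)) % 2

def phase_difference(thetas, generators, a, z):
    """Sum of 2*theta_j*(-1)^(g_j.z) over generators that anticommute with Z_a."""
    total = 0.0
    for theta, g in zip(thetas, generators):
        if parity(g, a):  # X_g anticommutes with Z_a exactly when g.a is odd
            total += 2 * theta * (-1) ** parity(g, z)
    return total

def z_expectation_exact(thetas, generators, a):
    """Exact <Z_a> by averaging over all 2^n bitstrings z (small n only)."""
    n = len(a)
    values = (math.cos(phase_difference(thetas, generators, a, z))
              for z in itertools.product((0, 1), repeat=n))
    return sum(values) / 2**n

def z_expectation_sampled(thetas, generators, a, num_samples, rng):
    """Monte Carlo estimate of <Z_a> from uniformly random bitstrings z."""
    n = len(a)
    total = 0.0
    for _ in range(num_samples):
        z = [rng.randrange(2) for _ in range(n)]
        total += math.cos(phase_difference(thetas, generators, a, z))
    return total / num_samples

if __name__ == "__main__":
    # Two qubits: one gate generated by X on qubit 1, one generated by X on both.
    thetas = [0.3, 0.7]
    generators = [(1, 0), (1, 1)]
    a = (1, 0)  # the Z word acting on the first qubit only
    rng = random.Random(0)
    print(z_expectation_exact(thetas, generators, a))
    print(z_expectation_sampled(thetas, generators, a, 2000, rng))
```

Each sampled term is a cosine bounded in $[-1, 1]$, so the sample mean concentrates around the exact value with a number of samples that does not depend on the number of qubits. For a single gate $\exp(i\theta X)$ on one qubit the formula reduces to $\langle Z \rangle = \cos(2\theta)$.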