Shadows of quantum machine learning

Sofiene Jerbi; Casper Gyurik; Simon C. Marshall; Riccardo Molteni; Vedran Dunjko

TL;DR

This paper investigates whether quantum resources restricted to the training phase can yield practical advantages when models are deployed classically. It introduces flipped models, in which the roles of the data-encoding and trainable quantum components are interchanged, and formalizes shadow models as a framework that uses quantum-generated advice for classical evaluation. The authors prove that flipped models are universal for classically deployed quantum ML, establish a quantum advantage for shadow models under cryptographic hardness assumptions, and show that not all quantum models are shadowfiable under standard complexity assumptions. The work clarifies the computational landscape of shadow-based quantum ML, suggests a feasible route for quantum-enhanced learning in real-world settings, and outlines future directions in state-aware shadow tomography and broader applicability.

Abstract

Quantum machine learning is often highlighted as one of the most promising practical applications for which quantum computers could provide a computational advantage. However, a major obstacle to the widespread use of quantum machine learning models in practice is that these models, even once trained, still require access to a quantum computer in order to be evaluated on new data. To solve this issue, we introduce a new class of quantum models where quantum resources are only required during training, while the deployment of the trained model is classical. Specifically, the training phase of our models ends with the generation of a 'shadow model' from which the classical deployment becomes possible. We prove that: i) this class of models is universal for classically-deployed quantum machine learning; ii) it does have restricted learning capacities compared to 'fully quantum' models, but nonetheless iii) it achieves a provable learning advantage over fully classical learners, contingent on widely-believed assumptions in complexity theory. These results provide compelling evidence that quantum machine learning can confer learning advantages across a substantially broader range of scenarios, where quantum computers are exclusively employed during the training phase. By enabling classical deployment, our approach facilitates the implementation of quantum machine learning models in various practical contexts.

Paper Structure (37 sections, 16 theorems, 78 equations, 6 figures)

Introduction
The flipped model
Flipped model definition
Properties of flipped models
Quantum advantage of a shadow model
General shadow models
Shadow models beyond Fourier
All shadow models are shadows of flipped models
Not all quantum models are shadowfiable
Discussion
Formal definitions
Linear models
Shadow models
Complexity classes
Properties of flipped models
...and 22 more sections

Key Result

Theorem 1

There exists a learning task where a shadow model, first trained using a quantum computer and then evaluated classically on new input data, can achieve arbitrarily good learning performance, while any fully classical model cannot do significantly better than random guessing, under the hardness of clas...

Figures (6)

Figure 1: Quantum and shadow models. (left) Conventional quantum models can be expressed as inner products between a data-encoding quantum state $\rho(\bm{x})$ and a parametrized observable $O(\bm{\theta})$. The resulting linear model $f_{\bm{\theta}}(\bm{x})= \Tr[\rho(\bm{x})O(\bm{\theta})]$ naturally corresponds to a quantum computation, depicted here. (middle) We define flipped models $f_{\bm{\theta}}(\bm{x})= \Tr[\rho(\bm{\theta})O(\bm{x})]$ as quantum linear models where the role of the quantum state $\rho(\bm{\theta})$ and the observable $O(\bm{x})$ is flipped compared to conventional models. (right) Flipped models are associated to natural shadow models: one can use techniques from shadow tomography to construct a classical representation $\hat{\rho}(\bm{\theta})$ of the parametrized state $\rho(\bm{\theta})$ (during the shadowing phase), such that, for encoding observables $O(\bm{x})$ that are classically representable (e.g., linear combinations of Pauli observables), $\hat{\rho}(\bm{\theta})$ can be used by a classical algorithm to evaluate the model $f_{\bm{\theta}}(\bm{x})$ on new input data (during the evaluation phase). More generally, a shadow model is defined by (i) a shadowing phase where a (bit-string) advice $\omega(\bm{\theta})$ is generated by the evaluation of multiple quantum circuits $W_1(\bm{\theta}), \ldots, W_M(\bm{\theta})$, and (ii) an evaluation phase where this advice is used by a classical algorithm $\mathcal{A}$, along with new input data $\bm{x}$ to evaluate their labels $\widetilde{f}_{\bm{\theta}}(\bm{x})$. In the section "General shadow models", we show that under this general definition, all shadow models are shadows of flipped models.
Figure 2: Separations between classical, shadow, and quantum models. Under the assumption that the discrete cube root (DCR) cannot be computed classically in polynomial time, we have a separation between shadow models (captured by the class BPP/qgenpoly) and classical models (in BPP). Under the assumption that there exist functions that can be computed in quantum polynomial time but not in classical polynomial time with the help of advice (i.e., BQP$\not\subset$P/poly), we have a separation between quantum models (universal for BQP) and shadow models (BPP/qgenpoly). A candidate function for this separation is the discrete logarithm (DLP).
Figure 3: Flipped evaluation of a conventional model
Figure 4: A visualization of the functions involved in the quantum advantage learning task. The core functions of this task map $\mathbb{Z}_{N}=\{0,\ldots,N-1\}$ to itself, for $N$ a large semiprime. a) In feature space, data is linearly separable by a hyperplane parametrized by a certain $s\in\mathbb{Z}_{N}$. One can efficiently transform data $y$ in feature space into its corresponding data $x$ in input space via the "discrete cube" function $x = y^3 \text{ mod } N$. b) To a fully classical learner, data in input space looks randomly labeled, as inverting it back to feature space via the discrete cube root function $y = \sqrt[3]{x} \text{ mod } N$ is believed to be classically intractable. However, a shadow model can make use of the trap-door property of the discrete cube root function to efficiently compute a key $d\in\mathbb{Z}_{N}$ using a quantum computer and classically map data to feature space through the transformation $y = x^d \text{ mod } N$.
Figure 5: All shadow models can be expressed as shadowfiable flipped models. a) A shadow model consists of $M$ unitary circuits $W_i(\bm{\theta})$ that can be chosen adaptively, and that generate advice $\omega_i(\bm{\theta})$ from computational basis measurements of the states $W_i(\bm{\theta})\ket{0}^{m}$. This advice, along with (a binary description of) an input $\bm{x}\in\mathbb{R}^d$, is processed by a classical algorithm $\mathcal{A}$ to compute an approximation $\widetilde{f}_{\bm{\theta}}$ of the shadowfiable model $f_{\bm{\theta}}$. b) A coherent implementation of this shadow model, where the unitaries $W_i(\bm{\theta})$ are applied on different $m$-qubit registers, and coherently controlled by previous registers (for adaptivity). These $M$ registers constitute the coherent encoding of the advice $\ket{\omega(\bm{\theta})}$. The algorithm $\mathcal{A}$ can then be simulated by a reversible quantum computation $U_{\mathcal{A}}$ (see Sec. 3.2.5 in Nielsen and Chuang) that processes a binary encoding $\ket{\bm{x}}$ of $\bm{x}$ and the coherent advice $\ket{\omega(\bm{\theta})}$ (either directly or indirectly via controlled operations that imprint $\ket{\omega(\bm{\theta})}$ on an auxiliary register). This coherent implementation of the shadow model can be viewed as a shadowfiable flipped model $g_{\bm{\theta}}(\bm{x})=\Tr[\rho(\bm{\theta})O(\bm{x})]$, such that one evaluation of this model samples an advice $\omega(\bm{\theta})$ and evaluates $\mathcal{A}(\bm{x},\omega(\bm{\theta}))$ for that advice and a given $\bm{x}$.
...and 1 more figure
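
The trap-door mechanism described in the Figure 4 caption can be illustrated with a toy calculation. The sketch below is not the paper's construction: the primes `p` and `q`, the threshold labeling `label`, and the value of `s` are hypothetical choices made for illustration, and the key `d` is computed classically from the known factors of `N`, standing in for the step that the caption assigns to a quantum computer. It only shows that $x = y^3 \bmod N$ is undone by $y = x^d \bmod N$ once `d` is known.

```python
from math import gcd


def make_key(p, q):
    """Return (N, d), where d is the inverse of 3 modulo (p - 1)(q - 1).

    Assumption: the factors p and q are known here. In the task from the
    caption, N is a large semiprime and d comes from a quantum computation.
    """
    phi = (p - 1) * (q - 1)
    if gcd(3, phi) != 1:
        raise ValueError("3 must be invertible modulo (p - 1)(q - 1)")
    return p * q, pow(3, -1, phi)  # modular inverse, Python 3.8+


def discrete_cube(y, N):
    """Map feature-space data y to input-space data x = y^3 mod N."""
    return pow(y, 3, N)


def to_feature_space(x, d, N):
    """Invert the discrete cube with the key d: y = x^d mod N."""
    return pow(x, d, N)


def label(y, s):
    """Hypothetical threshold labeling in feature space, parametrized by s."""
    return 1 if y >= s else 0


# Toy parameters, far too small to be hard for a classical computer.
N, d = make_key(11, 17)  # N = 187, d = 107
s = 90                   # hypothetical separating parameter

ys = list(range(N))
xs = [discrete_cube(y, N) for y in ys]
recovered = [to_feature_space(x, d, N) for x in xs]
predictions = [label(y_hat, s) for y_hat in recovered]
```

Once `d` is stored, every later call to `to_feature_space` is a purely classical modular exponentiation, which is the sense in which the key plays the role of advice in this toy setting.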

Theorems & Definitions (37)

Theorem 1: Quantum advantage (informal)
Definition 2: General shadow model
Definition 3: Shadowfiable model
Lemma 4: Flipped models are shadow-universal
Theorem 5: Not all shadowfiable
Definition A.1: Conventional linear model
Definition A.2: Flipped model
Definition A.3: Shadow model
Definition A.4: BQP
Definition A.5: P/poly
...and 27 more
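
The flipped model named in Definition A.2, written as $f_{\bm{\theta}}(\bm{x}) = \Tr[\rho(\bm{\theta})O(\bm{x})]$ in the Figure 1 caption, can be sketched numerically for two qubits. Everything concrete below is an assumption made for illustration: the circuit behind `rho`, the Pauli terms in `PAULIS`, and the data-dependent coefficients in `coeffs` are hypothetical, and the advice is the exact vector of Pauli expectation values rather than a sampled classical shadow. The sketch only shows why an observable that is a linear combination of Pauli terms lets the evaluation phase run without the quantum state.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Hypothetical set of Pauli terms used to build O(x).
PAULIS = [np.kron(Z, I2), np.kron(I2, Z), np.kron(X, X)]


def rho(theta):
    """Hypothetical trainable state: RY on each qubit, then a CNOT."""
    def ry(a):
        return np.array([[np.cos(a / 2), -np.sin(a / 2)],
                         [np.sin(a / 2), np.cos(a / 2)]], dtype=complex)
    cnot = np.array([[1, 0, 0, 0],
                     [0, 1, 0, 0],
                     [0, 0, 0, 1],
                     [0, 0, 1, 0]], dtype=complex)
    zero = np.array([1, 0, 0, 0], dtype=complex)
    psi = cnot @ np.kron(ry(theta[0]), ry(theta[1])) @ zero
    return np.outer(psi, psi.conj())


def coeffs(x):
    """Hypothetical data encoding: coefficients of O(x) on PAULIS."""
    return np.array([np.cos(x[0]), np.sin(x[1]), x[0] * x[1]])


def flipped_model(theta, x):
    """Direct evaluation of Tr[rho(theta) O(x)] with the full state."""
    observable = sum(c * P for c, P in zip(coeffs(x), PAULIS))
    return float(np.real(np.trace(rho(theta) @ observable)))


def shadowing_phase(theta):
    """Produce advice: here, exact Pauli expectation values of rho(theta)."""
    state = rho(theta)
    return np.array([np.real(np.trace(state @ P)) for P in PAULIS])


def evaluate_classically(advice, x):
    """Evaluation phase: uses only the stored advice and the new input x."""
    return float(coeffs(x) @ advice)


theta = np.array([0.3, 1.1])
x_new = np.array([0.5, -0.2])
advice = shadowing_phase(theta)
direct = flipped_model(theta, x_new)
from_advice = evaluate_classically(advice, x_new)
```

By linearity of the trace, `direct` and `from_advice` agree up to floating-point error. With a shadow estimated from measurements, the advice would carry statistical error, which this sketch does not model.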
