Exponential quantum advantage in processing massive classical data

Haimeng Zhao, Alexander Zlokapa, Hartmut Neven, Ryan Babbush, John Preskill, Jarrod R. McClean, Hsin-Yuan Huang

Abstract

Achieving broadly applicable quantum advantage, particularly in classical data processing and machine learning, has been a fundamental open problem. In this work, we prove that a small quantum computer of polylogarithmic size can perform large-scale classification and dimension reduction on massive classical data by processing samples on the fly, whereas any classical machine achieving the same prediction performance requires exponentially larger size. Furthermore, classical machines that are exponentially larger yet below the required size need superpolynomially more samples and time. We validate these quantum advantages in real-world applications, including single-cell RNA sequencing and movie review sentiment analysis, demonstrating a reduction in size of four to six orders of magnitude with fewer than 60 logical qubits. These quantum advantages are enabled by quantum oracle sketching, an algorithm for accessing the classical world in quantum superposition using only random classical data samples. Combined with classical shadows, our algorithm circumvents the data loading and readout bottleneck to construct succinct classical models from massive classical data, a task provably impossible for any classical machine that is not exponentially larger than the quantum machine. These quantum advantages persist even when classical machines are granted unlimited time or if BPP = BQP, and rely only on the correctness of quantum mechanics. Together, our results establish machine learning on classical data as a broad and natural domain of quantum advantage and a fundamental test of quantum mechanics at the complexity frontier.

Paper Structure

This paper contains 73 sections, 94 theorems, 693 equations, 10 figures, and 3 algorithms.

Key Result

Theorem 1

Using $\tilde{O}(N)$ samples, a quantum computer of $\mathrm{poly}(\log N)$ size can solve the linear system task in dimension $N$, whereas any classical machine of size $O(N^{0.99})$ cannot.
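
To put these scalings in perspective, here is a back-of-envelope comparison. The quantum size model $S_Q = 2(\log_2 N)^2$ qubits is a hypothetical stand-in for $\mathrm{poly}(\log N)$ (our assumption, not a figure from the paper); the classical threshold $N^{0.99}$ is the bound stated in Theorem 1.

```python
from math import log2

# Back-of-envelope illustration of Theorem 1's size separation.
# S_Q = 2 * (log2 N)^2 is a hypothetical poly(log N) quantum size
# (our assumption); S_C = N^0.99 is the classical size below which
# Theorem 1 says the task cannot be solved.
for N in (10**6, 10**9, 10**12):
    S_Q = 2 * log2(N) ** 2
    S_C = N ** 0.99
    print(f"N = {N:.0e}: ~{S_Q:,.0f} qubits vs > {S_C:.1e} classical units "
          f"(ratio ~{S_C / S_Q:,.0f}x)")
```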

Figures (10)

  • Figure 1: Overview of quantum advantage in processing massive classical data. (a) We prove that a quantum computer can outperform exponentially larger classical machines in a wide range of classical data processing tasks, including solving linear systems, classification, and dimension reduction. (b) Our quantum algorithm enables coherent quantum queries to the noisy and evolving classical world. (c) For various classical data processing tasks with problem size $N$, a $\mathrm{poly}(\log N)$-size quantum machine can succeed in $\tilde{O}(N)$ time using quantum oracle sketching. In contrast, we prove that any classical machine, even with exponentially larger size $O(N^{0.99})$, cannot solve the same task unless given time superpolynomial in $N$. This exponential quantum advantage relies only on the principle of quantum superposition, independent of any computational complexity conjectures.
  • Figure 2: Numerical experiments demonstrating exponential quantum advantage in real-world datasets. We perform binary classification and dimension reduction for (a) sentiment analysis of movie reviews from the Internet Movie Database (IMDb) maas2011learning and (b) single-cell RNA sequencing analysis of peripheral blood mononuclear cells (PBMC) zheng2017massively. We compare four general-purpose algorithms: quantum oracle sketching (orange), quantum algorithms using QRAM (gray), classical sparse-matrix algorithms (gray), and classical streaming algorithms (blue). For each algorithm, we truncate the dimension to filter out a varying number of rare features and plot the trade-off between machine size and performance, with standard error indicated by the shaded region. Machine size is defined as the total count of fundamental memory units required: logical qubits for quantum and floating-point numbers for classical. Performance is quantified by the $5$-fold cross-validation accuracy averaged over random category pairs and by the explained variance relative to the untruncated baseline (a sketch of this evaluation protocol appears after this figure list).
  • Figure 3: Accessing the classical world in superposition with quantum oracle sketching. (a) An example of making a coherent quantum query to a Boolean function using its classical data $(x, f(x))$ with quantum oracle sketching (a minimal simulation of this construction appears after this figure list). Upon receiving each classical sample $(x, f(x))$, we apply a multi-controlled phase gate $\exp(i\theta \ketbra{x})$ with $\theta \propto f(x)$. With $M=\Theta(N/\epsilon)$ samples, the resulting random unitary channel approximates the phase oracle $O: \ket{x}\to (-1)^{f(x)}\ket{x}$ of $f$ to $\epsilon$ error in diamond distance. This allows us to instantiate oracle queries in any quantum query algorithm that extracts the desired property of $f$. (b) Numerical experiments benchmarking the number of samples $M$ needed to approximate various oracle queries, using the operator norm error of the expected unitary, which upper bounds the diamond distance error, as a proxy for $\epsilon$. We consider oracles of Boolean functions, state preparation unitaries of arbitrary vectors, and the sparse matrix element and index oracles of arbitrary sparse matrices. We use $N$ to denote the domain size of Boolean functions, the dimension of vectors, and the dimension of square matrices; $N_{\mathrm{nnz}}$ denotes the number of non-zero elements in a sparse matrix. The solid lines represent the fitted sample complexity scaling, with fitted parameters and root-mean-squared relative errors (RMS rel. err.) listed.
  • Figure 4: Additional numerical experiments demonstrating exponential quantum advantage in real-world datasets. We perform binary classification and dimension reduction for (a) topic analysis of posts from 20 newsgroups joachims1996probabilistic and (b) chemical compound data for Thrombin binding weston2003feature. We compare quantum oracle sketching (orange) with classical sparse-matrix algorithms (gray), quantum algorithms using QRAM (gray), and classical streaming algorithms (blue). For each algorithm, we truncate the dimension to filter out a varying number of rare features and plot the trade-off between machine size and performance, with standard error indicated by the shaded region. Machine size is defined as the total count of fundamental memory units required: logical qubits for quantum and floating-point numbers for classical. Performance is quantified by the $5$-fold cross-validation accuracy averaged over random category pairs and by the explained variance relative to the untruncated baseline.
  • Figure 5: Overview of the models of data access and computation. (a) Illustration of the tree structure of a hierarchical data generation process with $l$ situation levels, with respective time scales $T_1, \ldots, T_l$. Random variables within the same box are IID conditioned on their shared latent situations. (b) The model of classical learning algorithms with size $S$ (i.e., $2^S$ possible configurations) and sample complexity $M$. The computation path when the algorithm is given a sequence of data $z_0, \ldots, z_{M-1}$ is highlighted in blue. (c) The model of quantum learning algorithms with size $S$ and sample complexity $M$. Upon receiving a sequence of data $z_0, \ldots, z_{M-1}$, the algorithm applies a series of quantum channels $C^0_{z_0}, \ldots, C^{M-1}_{z_{M-1}}$ and measures the final state to compute the outcome.
  • ...and 5 more figures
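
To make the machine-size-versus-performance protocol of Figures 2 and 4 concrete, here is a rough reconstruction on synthetic stand-in data; the classifier, truncation rule, and data below are our own illustrative choices, not the paper's experimental setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Hedged sketch of the evaluation protocol described in Figs. 2 and 4:
# truncate to the K most frequent features, then score 5-fold
# cross-validation accuracy and the variance retained after truncation.
rng = np.random.default_rng(0)
X = rng.poisson(0.05, size=(500, 2000)).astype(float)  # stand-in for sparse counts
y = rng.integers(0, 2, size=500)                       # stand-in binary labels

total_var = X.var(axis=0).sum()
for K in (50, 200, 1000):
    keep = np.argsort(-(X > 0).sum(axis=0))[:K]        # K most frequent features
    Xk = X[:, keep]
    acc = cross_val_score(LogisticRegression(max_iter=1000), Xk, y, cv=5).mean()
    var = Xk.var(axis=0).sum() / total_var             # variance vs. untruncated baseline
    print(f"K = {K:4d}: CV accuracy = {acc:.3f}, explained variance = {var:.3f}")
```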
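
The sketching construction of Figure 3(a) can also be simulated directly. The following minimal NumPy sketch assumes uniform sampling and a per-sample angle $\theta = \pi N/M$ (our parameter choices, which may differ from the paper's), and evaluates the operator-norm proxy error of Figure 3(b) by composing the expected unitary of $M$ sketching steps; all unitaries involved are diagonal, so only diagonals are tracked.

```python
import numpy as np

# Minimal sketch of quantum oracle sketching for a Boolean function f.
# Each sample (x, f(x)) applies the phase gate exp(i*theta*|x><x|) with
# theta proportional to f(x); we use theta = pi*N/M so that phases sum
# to pi*f(x) on average over M uniform samples (our assumed constants).
rng = np.random.default_rng(0)
N = 256                            # domain size of f
f = rng.integers(0, 2, size=N)     # a random Boolean function f: [N] -> {0,1}

target = np.exp(1j * np.pi * f)    # phase oracle O|x> = (-1)^{f(x)} |x>

for M in (N, 4 * N, 16 * N, 64 * N, 256 * N):
    theta = np.pi * N / M
    # Expected unitary of one sketching step under a uniformly random x:
    # diagonal entry y acquires e^{i*theta*f(y)} with probability 1/N, else stays 1.
    step = ((N - 1) + np.exp(1j * theta * f)) / N
    approx = step ** M             # M independent steps compose multiplicatively
    err = np.max(np.abs(approx - target))   # operator norm for diagonal matrices
    print(f"M = {M:7d} samples: proxy error = {err:.4f}")
# The error decays as M grows, consistent with M = Theta(N/eps) in the caption.
```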

Theorems & Definitions (161)

  • Theorem 1: Solving linear systems; formalized in \ref{thm:q-adv-linear-sys}
  • Theorem 2: Solving dynamic linear systems; formalized in \ref{thm:q-adv-linear-sys-dynamic}
  • Theorem 3: Classification; formalized in \ref{thm:q-adv-bin-classify}
  • Theorem 4: Dynamic classification; formalized in \ref{thm:q-adv-bin-classify-dynamic}
  • Theorem 5: Dimension reduction; formalized in \ref{thm:q-adv-dim-reduc}
  • Theorem 6: Dynamic dimension reduction; formalized in \ref{thm:q-adv-dim-reduc-dynamic}
  • Theorem 7: Space advantage from oracle separation; formalized in \ref{thm:classical-lower-bound}
  • Lemma C.1: Sharing situations enhances correlation
  • Lemma C.2: Correlation is non-decreasing with correlation depth
  • ...and 151 more