Quantum advantage for learning shallow neural networks with natural data distributions

Laura Lewis, Dar Gilboa, Jarrod R. McClean

TL;DR

This work investigates quantum advantages in learning real-valued, Fourier-sparse functions under non-uniform data distributions. It develops a two-stage quantum algorithm that first uses period finding to recover the inner linear map $w^\star$ and then applies gradient methods to learn the outer periodic coefficients $\beta^\star$, achieving exponential quantum speedups over classical gradient-based learners for broad distribution families (Gaussian, generalized Gaussian, logistic). The analysis extends Hallgren-type irrational period finding to pseudoperiodic and non-uniform settings via a tailored verification procedure, enabling efficient learning with QSQ access to discretized quantum example states. The results close part of the gap between uniform-distribution advantages and distribution-free impossibility by showing exponential quantum gains for a natural real-valued function class (periodic neurons) across practically relevant distributions. The work also discusses practical considerations for quantum data encoding, potential applicability to real-world models, and several open questions around hardness strengthening and broader distribution classes.
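The second stage described above can be illustrated with a purely classical sketch. It assumes the inner vector has already been recovered by the first stage, and it assumes a periodic neuron of the harmonic form $\sum_j \beta_j \cos(j\, x^\intercal w)$, which is one possible reading of "linear combinations of cosine neurons" and not necessarily the paper's exact parameterization. The names `features` and `fit_outer_coefficients`, the values of `w_star` and `beta_star`, and the step size are all hypothetical choices for illustration.

```python
import math
import random

def features(x, w, num_harmonics):
    """Cosine features cos(j * <x, w>) for j = 1..num_harmonics (assumed form)."""
    t = sum(xi * wi for xi, wi in zip(x, w))
    return [math.cos(j * t) for j in range(1, num_harmonics + 1)]

def fit_outer_coefficients(xs, ys, w, num_harmonics, lr=0.5, steps=500):
    """Plain gradient descent on mean squared error over beta, with w held fixed."""
    phis = [features(x, w, num_harmonics) for x in xs]
    beta = [0.0] * num_harmonics
    n = len(xs)
    for _ in range(steps):
        grad = [0.0] * num_harmonics
        for phi, y in zip(phis, ys):
            err = sum(b * p for b, p in zip(beta, phi)) - y
            for j in range(num_harmonics):
                grad[j] += 2.0 * err * phi[j] / n
        beta = [b - lr * g for b, g in zip(beta, grad)]
    return beta

random.seed(0)
d = 3
w_star = [1.5, -0.7, 2.0]   # hypothetical; assumed already recovered by stage one
beta_star = [0.6, -0.3]     # hypothetical outer coefficients to be learned
# Gaussian inputs, one of the distribution families named in the summary.
xs = [[random.gauss(0.0, 1.0) for _ in range(d)] for _ in range(200)]
ys = [sum(b * p for b, p in zip(beta_star, features(x, w_star, 2))) for x in xs]
beta_hat = fit_outer_coefficients(xs, ys, w_star, num_harmonics=2)
```

Once the inner vector is fixed, the model is linear in the outer coefficients, so the squared loss is convex in them and gradient descent suffices for this step. The hard part for classical gradient methods, according to the summary, is recovering the inner vector itself.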

Abstract

The application of quantum computers to machine learning tasks is an exciting potential direction to explore in search of quantum advantage. In the absence of large quantum computers to empirically evaluate performance, theoretical frameworks such as the quantum probably approximately correct (PAC) and quantum statistical query (QSQ) models have been proposed to study quantum algorithms for learning classical functions. Despite numerous works investigating quantum advantage in these models, we nevertheless only understand it at two extremes: either exponential quantum advantages for uniform input distributions or no advantage for potentially adversarial distributions. In this work, we study the gap between these two regimes by designing an efficient quantum algorithm for learning periodic neurons in the QSQ model over a broad range of non-uniform distributions, which includes Gaussian, generalized Gaussian, and logistic distributions. To our knowledge, our work is also the first result in quantum learning theory for classical functions that explicitly considers real-valued functions. Recent advances in classical learning theory prove that learning periodic neurons is hard for any classical gradient-based algorithm, giving us an exponential quantum advantage over such algorithms, which are the standard workhorses of machine learning. Moreover, in some parameter regimes, the problem remains hard for classical statistical query algorithms and even general classical algorithms learning under small amounts of noise.

Paper Structure

This paper contains 24 sections, 53 theorems, 454 equations, 1 figure, 4 algorithms.

Key Result

Theorem 1

Let $g_{w^\star}: \mathbb{R}^d \to [-1,1] \in \mathcal{C}$ be a target function for an unknown vector $w^\star \in \mathbb{R}^d$ with norm $R_w$. Consider an input distribution whose density $\varphi^2$ can be written as a square of a function $\varphi$ and is $\epsilon(r)$-Fourier-concentrated. Let

Figures (1)

  • Figure 1: Overview of results. (a) Target function and input distributions. Given an input vector $x \in \mathbb{R}^d$, we consider learning functions of the form $g_{w^\star}(x) = \cos(x^\intercal w^\star)$, where $w^\star \in \mathbb{R}^d$ is an unknown vector. Our illustration emphasizes their connection with classical deep learning, where they are called cosine neurons. We also consider more general periodic neurons, which one can view as linear combinations of cosine neurons with unknown weights. We consider input distributions such as uniform, Gaussian, and more general distributions that are sufficiently flat, as characterized by technical conditions specified in \ref{sec:non-unif}. (b) Classical hardness. We strengthen the arguments of \cite{shamir2018distribution} to show that classical gradient methods require an exponential number of iterations (i.e., an exponential number of gradient samples) in the dimension of the problem and the norm $R_w$ of $w^\star$ to learn these functions. (c) Quantum algorithm. In contrast, our new quantum algorithm using QSQs is exponentially more efficient with respect to both time and sample complexity.
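As a small illustration of the target function in panel (a), the sketch below evaluates a cosine neuron $g_{w}(x) = \cos(x^\intercal w)$ and checks that it is periodic along the direction of $w$: shifting $x$ by $2\pi w / \lVert w \rVert^2$ adds exactly $2\pi$ to the inner product. This is the structure that a period-finding approach can exploit. The vector `w_star` and the helper name `cosine_neuron` are hypothetical, chosen only for this example.

```python
import math
import random

def cosine_neuron(x, w):
    """Evaluate g_w(x) = cos(<x, w>)."""
    return math.cos(sum(xi * wi for xi, wi in zip(x, w)))

random.seed(1)
w_star = [0.8, -1.2, 0.5, 2.0]                 # hypothetical unknown vector
x = [random.gauss(0.0, 1.0) for _ in w_star]   # one Gaussian input sample

# Shift along w by 2*pi / ||w||^2 so that <x + shift, w> = <x, w> + 2*pi.
norm_sq = sum(wi * wi for wi in w_star)
shift = [2.0 * math.pi * wi / norm_sq for wi in w_star]
x_shifted = [xi + si for xi, si in zip(x, shift)]

value = cosine_neuron(x, w_star)
value_shifted = cosine_neuron(x_shifted, w_star)
```

The two values agree up to floating-point error, and both lie in $[-1, 1]$, matching the range of the target function in the key result.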

Theorems & Definitions (113)

  • Theorem 1: A variant of Theorem 4 in \cite{shamir2018distribution}; Informal
  • Theorem 2: Uniform distribution; Informal version of \ref{thm:unif}
  • Theorem 3: Non-uniform distributions; Informal version of \ref{thm:non-unif-guarantee}
  • Corollary 1: Informal
  • Lemma 1: Discretization; Informal
  • Definition 1: Quantum statistical query access for real functions
  • Definition 2: Pseudoperiodic \cite{hallgren2007polynomial}
  • Theorem 4: Lemma 3.1 in \cite{hallgren2007polynomial}
  • Proof: Proof sketch of \ref{thm:hallgren}
  • Definition 3: Fourier-concentrated \cite{shamir2018distribution}
  • ...and 103 more