Potential and limitations of random Fourier features for dequantizing quantum machine learning

Ryan Sweke, Erik Recio-Armengol, Sofiene Jerbi, Elies Gil-Fuster, Bryce Fuller, Jens Eisert, Johannes Jakob Meyer

TL;DR

This work investigates the feasibility of dequantizing variational QML training via random Fourier features. It derives necessary and sufficient conditions on data-encoding architectures, kernel design, and RFF sampling to ensure that RFF-based linear regression can match the true-risk performance of PQC optimization, while highlighting fundamental limits when these conditions fail. By introducing re-weighted PQC kernels, analyzing the kernel integral operator and RKHS norms, and establishing upper and lower bounds on data and feature requirements, the authors provide concrete guidance for PQC architecture design to either enable or resist dequantization. The results illuminate the potential quantum advantage landscape, offering a principled framework to assess when PQC training can be efficiently replicated classically and how to tailor encodings to affect frequency concentration and alignment.

Abstract

Quantum machine learning is arguably one of the most explored applications of near-term quantum devices. Much focus has been put on notions of variational quantum machine learning where parameterized quantum circuits (PQCs) are used as learning models. These PQC models have a rich structure which suggests that they might be amenable to efficient dequantization via random Fourier features (RFF). In this work, we establish necessary and sufficient conditions under which RFF does indeed provide an efficient dequantization of variational quantum machine learning for regression. We build on these insights to make concrete suggestions for PQC architecture design, and to identify structures which are necessary for a regression problem to admit a potential quantum advantage via PQC-based optimization.
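
As a rough illustration of the RFF-based linear regression discussed in the abstract, the following minimal sketch samples frequencies from a distribution over a finite frequency set, builds cosine and sine features, and solves ridge regression in closed form. The frequency set, the uniform sampling distribution, the target function, the loss normalization, and all function names are assumptions made for illustration and are not taken from the paper; only the choice $\lambda_n = 1/\sqrt{n}$ mirrors the statement of Theorem 1.

```python
import numpy as np


def sample_frequencies(freqs, probs, M, rng):
    """Draw M frequencies i.i.d. from the distribution `probs` over the rows of `freqs`."""
    idx = rng.choice(len(freqs), size=M, p=probs)
    return freqs[idx]


def rff_features(X, omegas):
    """Map inputs X of shape (n, d) to 2M real features [cos(w.x), sin(w.x)] / sqrt(M)."""
    phases = X @ omegas.T                      # shape (n, M)
    M = omegas.shape[0]
    return np.hstack([np.cos(phases), np.sin(phases)]) / np.sqrt(M)


def ridge_fit(Phi, y, lam):
    """Closed-form minimizer of (1/n) * ||Phi v - y||^2 + lam * ||v||^2."""
    n, D = Phi.shape
    A = Phi.T @ Phi / n + lam * np.eye(D)
    return np.linalg.solve(A, Phi.T @ y / n)


def target(X):
    """Hypothetical target: a trigonometric polynomial with frequencies 2 and 3."""
    return np.cos(2 * X[:, 0]) + 0.5 * np.sin(3 * X[:, 0])


rng = np.random.default_rng(0)
d, n, M = 1, 400, 50

# Hypothetical frequency set {0, ..., 5} with a uniform sampling distribution.
freqs = np.arange(6, dtype=float).reshape(-1, 1)
probs = np.full(len(freqs), 1.0 / len(freqs))

X_train = rng.uniform(0.0, 2.0 * np.pi, size=(n, d))
y_train = target(X_train)

omegas = sample_frequencies(freqs, probs, M, rng)
lam = 1.0 / np.sqrt(n)                         # regularization lambda_n = 1/sqrt(n)
v = ridge_fit(rff_features(X_train, omegas), y_train, lam)

# Estimate the risk on fresh samples from the same input distribution.
X_test = rng.uniform(0.0, 2.0 * np.pi, size=(200, d))
test_mse = np.mean((rff_features(X_test, omegas) @ v - target(X_test)) ** 2)
print(f"test MSE: {test_mse:.4f}")
```

In this toy setting the frequency set is tiny, so a modest number of samples covers it. The question studied in the paper is what happens when the frequency set grows with the input dimension and the sampling distribution has to be chosen carefully.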

Paper Structure (22 sections, 7 theorems, 125 equations, 6 figures)

This paper contains 22 sections, 7 theorems, 125 equations, 6 figures.

Key Result

Theorem 1

Let $R$ be the risk associated with a regression problem $P\sim\mathcal{X}\times\mathbb{R}$. Assume the following: [...] Additionally, define [...] Then, let $\delta\in (0,1]$, $\epsilon> 0$, $n\geq n_0$, set $\lambda_n = 1/\sqrt{n}$, and let $\hat{f}_{M_n,\lambda_n}$ be the output of $\lambda_n$-regularized linear regression with respect to the feature map constructed from the integral representation of [...]

Figures (6)

  • Figure 1: Illustration of the relationship between $\mathcal{F}_\mathcal{D}$ and $\mathcal{F}_{(\Theta,\mathcal{D},O)}$, and the output of Algorithms \ref{alg:VQML} and \ref{alg:LR}. In particular, we always have that $\mathcal{F}_{(\Theta,\mathcal{D},O)}\subset \mathcal{F}_\mathcal{D}$. As a consequence of this, and the fact that Algorithm \ref{alg:LR} is a perfect empirical risk minimizer, we always have that $\hat{R}(f_v)\leq \hat{R}(f_\theta)$. However, it might be the case that $R(f_v)\geq R(f_\theta)$.
  • Figure 2: A graphical illustration of Observation \ref{obs:prob_decay}. In particular, we see that if the re-weighting probability distribution $p_{(\mathcal{D},\mathrm{w})}$ is sufficiently concentrated, then $c_0$ will scale polynomially in $d$, which implies via Theorem \ref{thm:efficiency_RFF} that polynomially many frequency samples $M$ are sufficient to achieve the desired guarantee from RFF-based linear regression.
  • Figure 3: A graphical illustration of the conditions necessary for the RKHS norm of a function to scale polynomially in $d$. At a high level, we see that for a function $f_v(\cdot) = \langle v,\phi_{(\mathcal{D},\mathrm{w})}(\cdot)\rangle$, the hyperplane vector $v$ (equivalently, the frequency spectrum of $f_v$) needs to be sufficiently well-aligned with the re-weighting vector $\mathrm{w}$, which determines the kernel $K_{(\mathcal{D},\mathrm{w})}$ with respect to which the RKHS norm is taken.
  • Figure 4: A graphical illustration of the conditions sufficient for RFF-based linear regression with re-weighting $\mathrm{w}$ to provide an efficient dequantization of PQC regression via circuit architecture $(\Theta,\mathcal{D},O)$, for regression problem $P$. One requires that the optimal PQC function $f^*$ for $P$ is both sufficiently concentrated and sufficiently well-aligned with the re-weighting distribution $p_{(\mathcal{D},\mathrm{w})}$.
  • Figure 5: A methodology for determining whether, and via which re-weighting, linear regression via RFF can provide an efficient means of dequantizing PQC regression over circuit architecture $(\Theta,\mathcal{D},O)$, for regression problem $P$.
  • ...and 1 more figure
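
The alignment condition described in the caption of Figure 3 can be illustrated with a small numerical sketch. It relies on a standard fact: if the component functions of a feature map $\phi(x) = (\mathrm{w}_k e_k(x))_k$ are linearly independent, then $f = \sum_k c_k e_k$ has a unique representation $f = \langle v, \phi\rangle$ with $v_k = c_k/\mathrm{w}_k$, and its RKHS norm equals $\|v\|_2$. The function name, the number of frequencies, the normalization $\sum_k \mathrm{w}_k^2 = 1$, and the two weight vectors below are assumptions chosen for illustration; the paper's precise definition of the re-weighting is not reproduced in this excerpt.

```python
import numpy as np


def rkhs_norm(coeffs, weights):
    """RKHS norm of f = sum_k coeffs[k] * e_k for the feature map
    phi(x) = (weights[k] * e_k(x))_k with linearly independent e_k.

    The unique v with f = <v, phi> has v[k] = coeffs[k] / weights[k].
    Weights are assumed to be non-negative.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if np.any((weights == 0) & (coeffs != 0)):
        return np.inf                          # f lies outside the RKHS
    support = weights > 0
    return float(np.linalg.norm(coeffs[support] / weights[support]))


num_freqs = 64

# Hypothetical target whose spectrum sits on a single frequency (index 3).
coeffs = np.zeros(num_freqs)
coeffs[3] = 1.0

# Uniform re-weighting: every frequency gets squared weight 1 / num_freqs.
uniform = np.full(num_freqs, 1.0 / np.sqrt(num_freqs))

# Aligned re-weighting: half of the squared weight sits on the target frequency.
aligned = np.full(num_freqs, np.sqrt(0.5 / (num_freqs - 1)))
aligned[3] = np.sqrt(0.5)

print(rkhs_norm(coeffs, uniform))   # 8.0, i.e. sqrt(num_freqs)
print(rkhs_norm(coeffs, aligned))   # about 1.414, independent of num_freqs
```

Under the uniform weights the norm grows as the square root of the number of frequencies, which becomes exponential in $d$ when the frequency set does. Under the aligned weights it stays constant, which is the qualitative behavior the caption describes.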

Theorems & Definitions (19)

  • Definition 1: PQC feature map and PQC-kernel
  • Definition 2: RKHS and RKHS norm
  • Definition 3: Kernel integral operator
  • Definition 4: Optimal PQC function
  • Theorem 1: RFF vs. variational QML
  • proof
  • Lemma 1: Operator norm of $T_{(K_\mathcal{D},\mathrm{w})}$
  • Lemma 2: Alternative definition of RKHS norm -- Adapted from Theorem 4.21 in Ref. SVMbook
  • Example 1
  • Example 2
  • ...and 9 more