Explicit quantum surrogates for quantum kernel models

Akimoto Nakayama; Hayata Morisaki; Kosuke Mitarai; Hiroshi Ueda; Keisuke Fujii

TL;DR

A quantum-classical hybrid algorithm is proposed to create an explicit quantum surrogate (EQS) for trained implicit models that reduces prediction costs, provides a powerful strategy to mitigate barren plateau issues, and combines the strengths of both QML approaches.

Abstract

Quantum machine learning (QML) leverages quantum states for data encoding, with key approaches being explicit models that use parameterized quantum circuits and implicit models that use quantum kernels. Implicit models often have lower training errors but face issues such as overfitting and high prediction costs, while explicit models can struggle with complex training and barren plateaus. We propose a quantum-classical hybrid algorithm to create an explicit quantum surrogate (EQS) for trained implicit models. This involves diagonalizing an observable from the implicit model and constructing a corresponding quantum circuit using an extended automatic quantum circuit encoding algorithm. The EQS framework reduces prediction costs, provides a powerful strategy to mitigate barren plateau issues, and combines the strengths of both QML approaches.

Paper Structure (34 sections, 6 theorems, 47 equations, 10 figures, 1 table, 2 algorithms)

Comparison with alternative methods for prediction acceleration
vs. Linear projected quantum kernels (LPQKs)
vs. Classical surrogate models
On barren plateaus and the EQS mitigation strategy
The barren plateau phenomenon
Mitigation via informed initialization strategies
The EQS approach: combining inductive bias and a warm-start
On the distinction between barren plateaus and kernel concentration
Justification for the learnable regime
Resource cost and scalability analysis
EQS construction cost
Quantum cost (measurement shots)
Classical cost
Prediction cost
Scalability of the circuit construction
...and 19 more sections

Key Result

Proposition J.1

Let $(f,b)$ be the decision function trained by a C-SVM on a sample of size $M$, giving the classifier $h(\boldsymbol{x}) =\operatorname{sgn}\bigl(f(\boldsymbol{x})+b\bigr)$. Let $h_K$ be its rank-$K$ approximation. Assume the kernel $k$ with $\sup_{\boldsymbol{x}\in \mathcal{X}}k(\boldsymbol{x},\ldots)$ … where $\gamma_{M,\delta}:= 2(1+\sqrt{\log(2/\delta)}) \left(\frac{3\Lambda}{\sqrt{M\mu_{M}}} + \cdots\right.$ …

Figures (10)

Figure 1: Overview of the process to convert a trained implicit model to an explicit model (EQS). An explicit model is constructed from a trained implicit model. First, we find the eigenvalues $\lambda_k$ and eigenvectors $|\lambda_k\rangle$ of the observable $O_{\boldsymbol{\alpha}, \mathcal{D}}$ in Eq. \ref{eq:implicit_model_ob}. Utilizing our extended AQCE algorithm, a quantum circuit $\mathcal{C}$ is constructed that satisfies the condition $\mathcal{C}|k\rangle\simeq|\lambda_{k}\rangle$ for $K$ eigenvectors $\{|\lambda_{k}\rangle\}_{k=0,\ldots,K-1}$ with the accuracy desired by the user, where $|k\rangle$ is the computational basis. This yields an explicit model $\operatorname{Tr}\left[ \rho^\prime(\boldsymbol{x}) O \right]$, where $\rho^\prime(\boldsymbol{x})=\mathcal{C}^\dagger U(\boldsymbol{x})|\boldsymbol{0}\rangle\langle\boldsymbol{0}| U^{\dagger}(\boldsymbol{x}) \mathcal{C}$ is a density matrix and $O=\sum_{k=0}^{K-1} \lambda_k|k\rangle\langle k|$ is an observable.
Figure 2: Performance of EQS on the MNISQ-MNIST dataset and the 12-qubit VQE-generated dataset. The vertical axis represents the classification accuracy on the test data. The horizontal axis represents the number of eigenvectors $K$ used in the eigenvalue decomposition of $O_{\boldsymbol{\alpha},\mathcal{D}}$. The EQS refers to Eq. \ref{eq:eqs} with fidelities $F^{(k)}>0.6$ for all $k$. The exact low-rank model is obtained by exact low-rank approximations of $O_{\boldsymbol{\alpha}, \mathcal{D}}$, which is equivalent to Eq. \ref{eq:eqs} with $F^{(k)}=1.0$ for all $k$. The inset in Fig. \ref{fig:sim_result}(b) shows a magnified view of a region of that panel.
Figure 3: Median sum of squared gradients for explicit models with different initializations. For each target label, we compute the sum of squared gradients at the first training step. The horizontal axis indicates the number of qubits $n$. The vertical axis shows the median of these values across the target labels. For $n<16$, all six labels were used; for the $n=16$ point, a subset of four labels was used due to the high simulation cost.
Figure S1: Impact of shot noise on EQS construction cost. The number of two-qubit gates required to construct the circuit for each label of the MNISQ-MNIST dataset, such that the fidelity for each eigenvector satisfies $F^{(k)}>0.6$, under noiseless (red bars) and noisy (blue bars, $10^6$ shots) conditions. The line plot shows the percentage increase in gate count due to shot noise.
Figure S2: Scalability analysis of the EQS circuit depth. (a) Scaling with the number of qubits $n$ for the VQE-generated dataset. The plot shows the result for target label 3. (b) Scaling with the number of embedded eigenvectors $K$ for the VQE-generated dataset. (c) Scaling with the number of embedded eigenvectors $K$ for the MNISQ-MNIST dataset. In panels (b) and (c), the solid line represents the mean number of two-qubit gates averaged over all target labels, and the shaded area indicates the standard deviation.
...and 5 more figures
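
The conversion described in the Figure 1 caption can be illustrated with a small dense-matrix sketch. This is not the authors' implementation: it assumes a fidelity-kernel implicit model, for which the observable takes the form $O_{\boldsymbol{\alpha},\mathcal{D}}=\sum_i \alpha_i \rho(\boldsymbol{x}_i)$, it replaces the encoded states $U(\boldsymbol{x})|\boldsymbol{0}\rangle$ with random toy states, it uses the exact eigenvector matrix in place of a circuit $\mathcal{C}$ built by the extended AQCE algorithm, and it keeps the $K$ eigenvalues of largest magnitude. All names (`random_state`, `implicit_predict`, `surrogate_predict`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_state(dim, rng):
    """Toy stand-in for an encoded state U(x)|0> (assumption, not the paper's feature map)."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

n_qubits, M = 3, 5                      # toy sizes
dim = 2 ** n_qubits
train_states = [random_state(dim, rng) for _ in range(M)]
alpha = rng.normal(size=M)              # stand-in for trained kernel coefficients

# Assumed form of the observable: O = sum_i alpha_i |psi_i><psi_i|
O_full = sum(a * np.outer(psi, psi.conj()) for a, psi in zip(alpha, train_states))

def implicit_predict(psi):
    """Implicit model: sum_i alpha_i |<psi_i|psi>|^2, one kernel value per training point."""
    return sum(a * abs(np.vdot(t, psi)) ** 2 for a, t in zip(alpha, train_states))

# Diagonalize O and sort eigenpairs by |lambda| (the ordering is an assumption).
eigvals, eigvecs = np.linalg.eigh(O_full)
order = np.argsort(-np.abs(eigvals))
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

def surrogate_predict(psi, K):
    """Explicit model Tr[rho' O] with rho' = C^dagger |psi><psi| C and O = sum_{k<K} lambda_k |k><k|.

    Here C is the exact eigenvector matrix, so C|k> = |lambda_k> holds exactly;
    the paper instead approximates C with a circuit.
    """
    rotated = eigvecs.conj().T @ psi    # C^dagger |psi>
    probs = np.abs(rotated) ** 2        # computational-basis probabilities
    return float(np.dot(eigvals[:K], probs[:K]))

x_state = random_state(dim, rng)
for K in (1, 2, M):
    print(K, surrogate_predict(x_state, K), implicit_predict(x_state))
```

In this toy setting the observable has rank at most $M$, so the surrogate with $K=M$ reproduces the implicit prediction up to floating-point error, while smaller $K$ gives a low-rank approximation that needs a single rotated measurement rather than $M$ kernel evaluations.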

Theorems & Definitions (11)

Proposition J.1
Lemma J.2: Function output bound
proof
Lemma J.3: Hinge loss bound
proof
Lemma J.4: Orthogonal projection shrinks the RKHS norm
proof
Proposition J.5: Feasible set is projection-invariant
proof
Theorem J.6: Uniform deviation of the hinge risk
...and 1 more
