The curse of random quantum data

Kaining Zhang; Junyu Liu; Liu Liu; Liang Jiang; Min-Hsiu Hsieh; Dacheng Tao

TL;DR

The paper identifies a fundamental curse of random quantum data: when quantum inputs are uniformly random, the learning performance of quantum kernel methods and wide quantum neural networks degrades exponentially with the Hilbert-space dimension. It develops a rigorous framework based on the quantum neural tangent kernel (QNTK) to quantify training dynamics and generalization, establishing bounds that show only limited improvement in test performance for small-to-moderate training sets drawn from Haar-like distributions. It further shows that the QNTK spectrum collapses for random quantum data, but that carefully designed state distributions with biased Pauli-coefficient variance can restore a meaningful spectrum and enable efficient convergence. Numerical experiments on quantum dynamics learning and binary classification corroborate the theory, illustrating how data design can dramatically affect convergence speed and generalization, and pointing to a practical route to robust quantum learning: engineering the quantum datasets and encodings.

Abstract

Quantum machine learning, which involves running machine learning algorithms on quantum devices, may be one of the most significant flagship applications for these devices. Unlike in classical machine learning, the role of data in quantum machine learning is not yet fully understood. In this work, we quantify the performance of quantum machine learning in the landscape of quantum data. We find that, provided the encoding of quantum data is sufficiently random, the training efficiency and generalization capabilities of quantum machine learning are exponentially suppressed as the number of qubits increases, which we call "the curse of random quantum data". Our findings apply to both the quantum kernel method and the large-width limit of quantum neural networks. Conversely, we highlight that through meticulous design of quantum datasets, it is possible to avoid these curses, thereby achieving efficient convergence and robust generalization. Our conclusions are corroborated by extensive numerical simulations.
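The exponential suppression can be illustrated with a standard second-moment fact: for a Haar-random $N$-qubit state, the expectation value of any non-identity Pauli observable has mean zero and variance $1/(2^N+1)$. The following is a minimal sketch of that concentration, not the paper's derivation. The helper names `haar_state`, `z_first_qubit`, and `empirical_variance` are hypothetical, and only the Python standard library is assumed.

```python
import math
import random


def haar_state(dim, rng):
    """Sample a Haar-random pure state as a normalized complex Gaussian vector."""
    amps = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(dim)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in amps))
    return [a / norm for a in amps]


def z_first_qubit(state):
    """Expectation value of Pauli Z on the first (most significant) qubit."""
    half = len(state) // 2
    p_zero = sum(abs(a) ** 2 for a in state[:half])  # first qubit in |0>
    p_one = sum(abs(a) ** 2 for a in state[half:])   # first qubit in |1>
    return p_zero - p_one


def empirical_variance(n_qubits, n_samples, seed=0):
    """Sample variance of <Z_1> over independently drawn Haar-random states."""
    rng = random.Random(seed)
    dim = 2 ** n_qubits
    vals = [z_first_qubit(haar_state(dim, rng)) for _ in range(n_samples)]
    mean = sum(vals) / n_samples
    return sum((v - mean) ** 2 for v in vals) / n_samples


# Compare the empirical variance with the Haar prediction 1 / (2^N + 1).
for n in (2, 4, 6, 8):
    print(n, empirical_variance(n, 1000), 1.0 / (2 ** n + 1))
```

The empirical values should track $1/(2^N+1)$, which suggests why observables evaluated on sufficiently random inputs carry a signal that shrinks exponentially with the number of qubits.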

Paper Structure (27 sections, 23 theorems, 175 equations, 9 figures)

Introduction
Theoretical results
Background of quantum machine learning
Generalization error with quantum data
Spectrum of QNTK with quantum data
Convergence analysis of training QNNs
Numerical Results
Quantum dynamics learning
Binary classification
Discussion
Additional Numerical Results
Technical Lemmas
Lemmas about unitary $t$-designs
Lemmas about variational quantum algorithms
Lemmas about random matrix theory
...and 12 more sections

Key Result

Theorem 1

Suppose all $N$-qubit quantum states in the training and test datasets $\mathcal{A}$ and $\mathcal{B}$ are independently sampled from state $2$-designs and the sizes of the datasets are smaller than $2^{N/2}$. Let $\mathcal{L}_{\mathcal{A}}(t)$ and $\mathcal{L}_{\mathcal{B}}(t)$ be the training and the test losses ... where the expectation is taken under state $2$-designs for states in $\mathcal{A}$ and $\mathcal{B}$ ...
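One heuristic way to read the dataset-size condition (a sketch, not the paper's proof) uses the fidelity kernel $k(\psi,\phi)=|\langle\psi|\phi\rangle|^2$, which is chosen here purely for illustration. For independent Haar-random states its mean is $2^{-N}$. With fewer than $2^{N/2}$ states there are fewer than $2^{N}/2$ distinct pairs, so the off-diagonal Gram entries sum to less than order one in expectation and the Gram matrix stays close to the identity. The helper names `haar_state`, `fidelity`, and `mean_offdiag_fidelity` are hypothetical.

```python
import math
import random


def haar_state(dim, rng):
    """Sample a Haar-random pure state as a normalized complex Gaussian vector."""
    amps = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(dim)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in amps))
    return [a / norm for a in amps]


def fidelity(psi, phi):
    """Squared overlap |<psi|phi>|^2 between two pure states."""
    inner = sum(a.conjugate() * b for a, b in zip(psi, phi))
    return abs(inner) ** 2


def mean_offdiag_fidelity(n_qubits, n_states, seed=0):
    """Average off-diagonal entry of the fidelity Gram matrix of random states."""
    rng = random.Random(seed)
    dim = 2 ** n_qubits
    states = [haar_state(dim, rng) for _ in range(n_states)]
    vals = [
        fidelity(states[i], states[j])
        for i in range(n_states)
        for j in range(i + 1, n_states)
    ]
    return sum(vals) / len(vals)


# Compare the average off-diagonal entry with the Haar mean 2^(-N).
for n in (2, 4, 6, 8):
    print(n, mean_offdiag_fidelity(n, 16), 1.0 / 2 ** n)
```

When the off-diagonal entries are this small, a kernel predictor fitted on $\mathcal{A}$ has almost no overlap with a fresh state from $\mathcal{B}$, which is consistent with the limited test improvement the theorem describes.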

Figures (9)

Figure 1: Scaling of Pauli decomposition coefficients of finite local-depth circuit states over $S\in \{1,2,3,4\}$ and circuit layer $L\in \{0,1,2,4,8\}$. Figure \ref{vqacr_fig_Aai_fldcs_mean0} shows the average of $|\mathbb{E}_{a\in\mathcal{A}} A'_{\bm{i}a} |$ in Eq. (\ref{vqacr_main_Ttest_mean}) w.r.t. all $\|\bm{i}\|_1=S$ with 1D connectivity, where $|\mathcal{A}| \in \{10, 20, 50, 100, 200, 500, 1000\}$. The grey dashed line plots $|\mathcal{A}|^{-0.5}$. Figure \ref{vqacr_fig_Aai_fldcs_alpha} shows the average of $\alpha_{\bm{i}}$ in Proposition \ref{vqacr_prop_state_distribution_local} w.r.t. all $\|\bm{i}\|_1=S$ with 1D connectivity, where $N \in \{5, 6, 7, 8, 9, 10, 11, 12\}$, and each $\alpha_{\bm{i}}$ is calculated by taking the average over $|\mathcal{A}|=1000$ samples. The grey dashed line plots the Haar limit. The error bars in all figures show the standard deviation over variables with different $\bm{i}$.
Figure 2: Finite local-depth circuits with $L$ blocks for $N=5$.
Figure 3: Numerical results of QDL with FLDC input states, where $(N,D)=(12,240)$ and $L \in \{0,1,2,4,8\}$. Figures \ref{vqacr_fig_qdl_fldc_LA} and \ref{vqacr_fig_qdl_fldc_LB} show the relative loss of the training and the test dataset during the training, respectively. Figure \ref{vqacr_fig_qdl_fldc_gradnorm} shows the $\ell_2$-norm of the gradient for the loss function. Figure \ref{vqacr_fig_qdl_fldc_lminK} shows the least eigenvalue of the QNTK. Each solid line denotes the average of $5$ rounds of simulations with independent circuits and parameters.
Figure 4: An assignment of $\mathcal{Q}=\{\mathcal{S}_1,\cdots,\mathcal{S}_{N}\}$ in the hardware-efficient manner for $(N,S)=(9,3)$.
Figure 5: Numerical results of QDL with FLDC input states, where $(N,L)=(12,1)$ and $D \in \{60, 84, 120, 168, 240\}$. The left figure illustrates the $\ell_2$-norm distance in the parameter space from the initial to the final step. The right figure illustrates the largest coefficient in Eq. (\ref{vqacr_lemma_ntk_local_Jacobian_stability_eq}) during the training. Each solid line denotes the average of $5$ rounds of simulations with independent circuits and parameters.
...and 4 more figures

Theorems & Definitions (34)

Theorem 1
Theorem 2
Theorem 3
Proposition 1
Lemma 1
Lemma 2
Theorem 4: Linear convergence of QNN training w.r.t. Proposition \ref{vqacr_prop_state_distribution_local}
Theorem 5: Linear convergence of QNN training w.r.t. qubit embedding
Lemma S1: from Ref. nc_cerezo2020cost
Lemma S2: from Ref. nc_cerezo2020cost
...and 24 more
