The curse of random quantum data

Kaining Zhang, Junyu Liu, Liu Liu, Liang Jiang, Min-Hsiu Hsieh, Dacheng Tao

TL;DR

The paper identifies a fundamental curse of random quantum data: when quantum inputs are uniformly random, the learning performance of quantum kernel methods and wide quantum neural networks degrades exponentially with the Hilbert-space dimension. It develops a rigorous framework based on the quantum neural tangent kernel (QNTK) to quantify training dynamics and generalization, establishing bounds that show limited test-improvement for small-to-moderate training sets drawn from Haar-like distributions. It further shows that the QNTK spectrum collapses for random quantum data, but that carefully designed state distributions with biased Pauli-coefficient variance can restore meaningful spectra and enable efficient convergence. Numerical experiments on quantum dynamics learning and binary classification corroborate the theory, illustrating how data design can dramatically affect convergence speed and generalization, and highlighting a practical pathway to achieving robust quantum learning by engineering quantum datasets and encodings.
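
As a rough illustration of the concentration behind this curse, the sketch below samples Haar-random pure states and builds a fidelity kernel $K_{ab} = |\langle\psi_a|\psi_b\rangle|^2$. The fidelity kernel is an assumption used here as a simple stand-in, not the paper's QNTK, and the helper names (`haar_random_states`, `fidelity_kernel`, `mean_offdiag`) are hypothetical. For Haar-random states the expected off-diagonal entry is $1/2^N$, so the Gram matrix approaches the identity as $N$ grows.

```python
import numpy as np


def haar_random_states(num_states, num_qubits, rng):
    """Sample Haar-random pure states as normalized complex Gaussian vectors."""
    dim = 2 ** num_qubits
    z = rng.normal(size=(num_states, dim)) + 1j * rng.normal(size=(num_states, dim))
    return z / np.linalg.norm(z, axis=1, keepdims=True)


def fidelity_kernel(states):
    """Gram matrix K[a, b] = |<psi_a|psi_b>|^2 (illustrative stand-in kernel)."""
    overlaps = states.conj() @ states.T
    return np.abs(overlaps) ** 2


def mean_offdiag(K):
    """Average of the off-diagonal entries of a square matrix."""
    n = K.shape[0]
    return (K.sum() - np.trace(K)) / (n * (n - 1))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for n_qubits in (2, 4, 6, 8, 10):
        K = fidelity_kernel(haar_random_states(50, n_qubits, rng))
        # Compare the empirical off-diagonal mean with the Haar value 1 / 2^N.
        print(n_qubits, mean_offdiag(K), 1 / 2 ** n_qubits)
```

The printed off-diagonal mean should track $1/2^N$, which is one simple way to see why a kernel built on uniformly random states loses the ability to relate one data point to another.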

Abstract

Quantum machine learning, which involves running machine learning algorithms on quantum devices, may be one of the most significant flagship applications for these devices. Unlike in classical machine learning, the role of data in quantum machine learning is not yet fully understood. In this work, we quantify the performance of quantum machine learning in the landscape of quantum data. We find that, provided the encoding of quantum data is sufficiently random, the training efficiency and generalization capabilities of quantum machine learning are exponentially suppressed as the number of qubits increases, which we call "the curse of random quantum data". Our findings apply to both the quantum kernel method and the large-width limit of quantum neural networks. Conversely, we highlight that through meticulous design of quantum datasets, it is possible to avoid these curses, thereby achieving efficient convergence and robust generalization. Our conclusions are corroborated by extensive numerical simulations.

Paper Structure

This paper contains 27 sections, 23 theorems, 175 equations, 9 figures.

Key Result

Theorem 1

Suppose all $N$-qubit quantum states in the training and test datasets $\mathcal{A}$ and $\mathcal{B}$ are independently sampled from state $2$-designs and the size of each dataset is smaller than $2^{N/2}$. Let $\mathcal{L}_{\mathcal{A}}(t)$ and $\mathcal{L}_{\mathcal{B}}(t)$ be the training and test losses [...] where the expectation is taken under state $2$-designs for states in $\mathcal{A}$ and $\mathcal{B}$ [...]

Figures (9)

  • Figure 1: Scaling of Pauli decomposition coefficients of finite local-depth circuit states over $S\in \{1,2,3,4\}$ and circuit layer $L\in \{0,1,2,4,8\}$. Figure \ref{vqacr_fig_Aai_fldcs_mean0} shows the average of $|\mathbb{E}_{a\in\mathcal{A}} A'_{\bm{i}a} |$ in Eq. (\ref{vqacr_main_Ttest_mean}) w.r.t. all $\|\bm{i}\|_1=S$ with 1D connectivity, where $|\mathcal{A}| \in \{10, 20, 50, 100, 200, 500, 1000\}$. The grey dashed line plots $|\mathcal{A}|^{-0.5}$. Figure \ref{vqacr_fig_Aai_fldcs_alpha} shows the average of $\alpha_{\bm{i}}$ in Proposition \ref{vqacr_prop_state_distribution_local} w.r.t. all $\|\bm{i}\|_1=S$ with 1D connectivity, where $N \in \{5, 6, 7, 8, 9, 10, 11, 12\}$, and each $\alpha_{\bm{i}}$ is calculated by taking the average over $|\mathcal{A}|=1000$ samples. The grey dashed line plots the Haar limit. The error bars in all figures show the standard deviation over variables with different $\bm{i}$.
  • Figure 2: Finite local-depth circuits with $L$ blocks for $N=5$.
  • Figure 3: Numerical results of QDL with FLDC input states, where $(N,D)=(12,240)$ and $L \in \{0,1,2,4,8\}$. Figures \ref{vqacr_fig_qdl_fldc_LA} and \ref{vqacr_fig_qdl_fldc_LB} show the relative loss on the training and the test dataset during training, respectively. Figure \ref{vqacr_fig_qdl_fldc_gradnorm} shows the $\ell_2$-norm of the gradient of the loss function. Figure \ref{vqacr_fig_qdl_fldc_lminK} shows the least eigenvalue of the QNTK. Each solid line denotes the average of $5$ rounds of simulations with independent circuits and parameters.
  • Figure 4: An assignment of $\mathcal{Q}=\{\mathcal{S}_1,\cdots,\mathcal{S}_{N}\}$ in the hardware-efficient manner for $(N,S)=(9,3)$.
  • Figure 5: Numerical results of QDL with FLDC input states, where $(N,L)=(12,1)$ and $D \in \{60, 84, 120, 168, 240\}$. The left figure illustrates the $\ell_2$-norm distance in the parameter space from the initial to the final step. The right figure illustrates the largest coefficient in Eq. (\ref{vqacr_lemma_ntk_local_Jacobian_stability_eq}) during the training. Each solid line denotes the average of $5$ rounds of simulations with independent circuits and parameters.
  • ...and 4 more figures
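
To give a feel for the Haar-random baseline and the $|\mathcal{A}|^{-0.5}$ reference line mentioned in the Figure 1 caption, the sketch below estimates Pauli expectation values of Haar-random states. It is an illustration under stated assumptions, not the paper's definition of $A'_{\bm{i}a}$ or $\alpha_{\bm{i}}$: only Pauli-$Z$ strings are used because they are diagonal, and the helper names (`haar_random_states`, `pauli_z_expectations`) are hypothetical. For a Haar-random state in dimension $d = 2^N$ and any non-identity Pauli string $P$, $\langle\psi|P|\psi\rangle$ has mean $0$ and variance $1/(d+1)$, so the empirical mean over $|\mathcal{A}|$ samples shrinks roughly like $|\mathcal{A}|^{-0.5}$.

```python
import numpy as np


def haar_random_states(num_states, num_qubits, rng):
    """Sample Haar-random pure states as normalized complex Gaussian vectors."""
    dim = 2 ** num_qubits
    z = rng.normal(size=(num_states, dim)) + 1j * rng.normal(size=(num_states, dim))
    return z / np.linalg.norm(z, axis=1, keepdims=True)


def pauli_z_expectations(states, support):
    """<psi| Z on every qubit in `support` |psi> for each row of `states`."""
    dim = states.shape[1]
    idx = np.arange(dim)
    signs = np.ones(dim)
    for q in support:
        # Z on qubit q contributes -1 when bit q of the basis index is set.
        signs *= 1 - 2 * ((idx >> q) & 1)
    probs = np.abs(states) ** 2
    return probs @ signs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_qubits = 6
    dim = 2 ** n_qubits
    vals = pauli_z_expectations(haar_random_states(2000, n_qubits, rng), (0, 1))
    # Sample variance next to the Haar value 1 / (d + 1).
    print(vals.var(), 1 / (dim + 1))
    for size in (10, 100, 1000):
        sample = pauli_z_expectations(haar_random_states(size, n_qubits, rng), (0, 1))
        # |empirical mean| next to the |A|^{-0.5} scaling of its standard error.
        print(size, abs(sample.mean()), (size * (dim + 1)) ** -0.5)
```

A single draw of the empirical mean is noisy, so the second loop should be read as an order-of-magnitude comparison rather than an exact match.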

Theorems & Definitions (34)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Proposition 1
  • Lemma 1
  • Lemma 2
  • Theorem 4: Linear convergence of QNN training w.r.t. Proposition \ref{vqacr_prop_state_distribution_local}
  • Theorem 5: Linear convergence of QNN training w.r.t. qubit embedding
  • Lemma S1: from Ref. \cite{nc_cerezo2020cost}
  • Lemma S2: from Ref. \cite{nc_cerezo2020cost}
  • ...and 24 more