Predictive Performance of Deep Quantum Data Re-uploading Models

Xin Wang, Han-Xiao Tao, Re-Bing Wu

TL;DR

The paper proves a fundamental limit on the predictive performance of deep data re-uploading quantum models with limited qubits: as encoding depth $L$ increases, the expected output over unseen data converges toward the maximally mixed state, causing near-random predictions. By analyzing Pauli-basis coefficients and employing the Petz-Rényi-2 divergence $D_2$, the authors derive a bound $D_2(\mathbb{E}[\rho]\|\rho_I) \le \log_2\bigl(1+(2^N-1)e^{-L\sigma^2}\bigr)$, and show that for $L$ large enough, $|\mathbb{E}_{\boldsymbol{x}}[h_S(\boldsymbol{x})]-h_I|\le \epsilon$, with $h_I=\operatorname{Tr}[H\rho_I]$. They extend these results to arbitrary parameterized gates and to repeated data uploading, proving that increasing depth, not repetitions, governs predictive degradation. Experiments on synthetic and real datasets (e.g., MNIST, CIFAR-10) corroborate the theory, revealing that deep encoding layers on few-qubit circuits yield predictions near random-guessing, while training error can improve with more layers or repetitions but generalization remains poor. The practical implication is clear: for high-dimensional classical data, quantum classifiers should prioritize wider circuit architectures over deeper encodings to retain predictive power.
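
To get a feel for how quickly the bound above shrinks, the following minimal sketch evaluates its right-hand side, $\log_2\bigl(1+(2^N-1)e^{-L\sigma^2}\bigr)$, and solves it for the smallest depth $L$ at which it falls below a tolerance. The function names and the tolerance `delta` are illustrative and do not come from the paper, and $\sigma^2$ is treated here as a single variance shared by all features, which is an assumption.

```python
import math


def d2_upper_bound(n_qubits: int, depth: int, sigma2: float) -> float:
    """Right-hand side of the bound: log2(1 + (2^N - 1) * exp(-L * sigma^2))."""
    return math.log2(1.0 + (2 ** n_qubits - 1) * math.exp(-depth * sigma2))


def min_depth_for_tolerance(n_qubits: int, sigma2: float, delta: float) -> int:
    """Smallest integer L whose bound is at most delta (requires delta > 0).

    Rearranging 1 + (2^N - 1) * exp(-L * sigma^2) <= 2^delta gives
    L >= ln((2^N - 1) / (2^delta - 1)) / sigma^2.
    """
    ratio = (2 ** n_qubits - 1) / (2 ** delta - 1)
    return max(0, math.ceil(math.log(ratio) / sigma2))


# Illustrative values only: sigma2 = 0.5 is an assumed variance, not a paper setting.
for n in (1, 2, 4, 8):
    depth = min_depth_for_tolerance(n, sigma2=0.5, delta=0.01)
    print(f"N={n}: bound <= 0.01 from L={depth}")
```

Because $2^N-1$ sits inside a logarithm when solving for $L$, the required depth grows only about linearly with the qubit number $N$. Under the stated assumptions, this is consistent with the paper's recommendation to favor wider circuits over deeper ones.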

Abstract

Quantum machine learning models incorporating data re-uploading circuits have garnered significant attention due to their exceptional expressivity and trainability. However, their ability to generate accurate predictions on unseen data, referred to as the predictive performance, remains insufficiently investigated. This study reveals a fundamental limitation in predictive performance when deep encoding layers are employed within the data re-uploading model. Concretely, we theoretically demonstrate that when processing high-dimensional data with limited-qubit data re-uploading models, their predictive performance progressively degenerates to near random-guessing levels as the number of encoding layers increases. In this context, the repeated data uploading cannot mitigate the performance degradation. These findings are validated through experiments on both synthetic linearly separable datasets and real-world datasets. Our results demonstrate that when processing high-dimensional data, the quantum data re-uploading models should be designed with wider circuit architectures rather than deeper and narrower ones.

Paper Structure

This paper contains 44 sections, 25 theorems, 128 equations, 15 figures.

Key Result

Theorem 3.1

Consider an $N$-qubit data re-uploading circuit with $L$ encoding layers and without repetition ($P=1$), which encodes data $\boldsymbol{x} \in \mathbb{R}^{3 NL}$ into the circuit, where each data point follows an independent Gaussian distribution, i.e., $x_{l,n,i} \sim \mathcal{N}(\mu_{l,n,i},\sigm

Figures (15)

  • Figure 1: For $D$-dimensional linearly separable data, data re-uploading encodes the data into quantum circuits and trains the model effectively. However, during prediction, data re-uploading with shallow encoding layers maintains prediction results close to the training outcomes, while deep encoding layers lead to predictions that approach random guessing.
  • Figure 2: Data re-uploading encoding process. (a) The original data. (b) Divide original data into $L$ chunks. (c) Each data chunk is encoded by an encoding layer. The entire data is re-uploaded into the circuit $P$ times, where the parameterized gates in each repetition can be arbitrary and can differ between repetitions.
  • Figure 3: Approximating circuit for data re-uploading circuits. The qubits in the approximating circuit are divided into two parts: working qubits and auxiliary qubits. First, $3NL(q+3)$ auxiliary qubits encode the binary digits of the feature $\boldsymbol{x} \in \mathbb{R}^{3NL}$ into the circuit. Then, data-independent controlled gates $\mathrm{C}R$, with a working qubit as the target and auxiliary qubits as the controls, approximate the encoding gates on the working qubits. Note that the approximating circuit is used only for theoretical analysis.
  • Figure 4: Divergence versus encoding layer $L$. Panels (a, b): varying qubit number $N$ at fixed repetition $P=1$; panels (c, d): varying repetition $P$ at fixed qubit number $N=1$. Pre-training (before training, in panels (a, c)) and post-training (after training, in panels (b, d)) results are compared with theoretical upper bounds. Panels (a, b) share common legends, and panels (c, d) share common legends: colors in panels (a, c) represent different $N$ or $P$, and line styles in panels (b, d) represent conditions (upper-bound, pre-training, post-training). Y-axis is logarithmic.
  • Figure 5: (a) Training error. (c) Test error. (b, d) Difference between the model's output with respect to the observable $H_0$ and $\operatorname{Tr}\left[H \rho_{I}\right]$, on training data and test data, respectively. Error bars represent the minimum and maximum values across 10 independent runs with different random seeds, with the central line showing the mean value.
  • ...and 10 more figures
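
The chunking step described in the Figure 2 caption can be sketched as follows. This is a hypothetical illustration, not the authors' code: it assumes three encoding angles per qubit per layer (consistent with $\boldsymbol{x} \in \mathbb{R}^{3NL}$ in Theorem 3.1) and a row-major ordering of the indices $(l, n, i)$, and the helper names `split_into_layers` and `upload_schedule` are invented for this example.

```python
from typing import List


def split_into_layers(x: List[float], n_qubits: int, n_layers: int) -> List[List[List[float]]]:
    """Reshape a flat feature vector of length 3*N*L into layers[l][n][i].

    Assumed layout (not confirmed by the source): layer index l varies slowest,
    then qubit index n, then the three angles i for that qubit.
    """
    expected = 3 * n_qubits * n_layers
    if len(x) != expected:
        raise ValueError(f"expected {expected} features, got {len(x)}")
    per_layer = 3 * n_qubits
    layers = []
    for l in range(n_layers):
        chunk = x[l * per_layer:(l + 1) * per_layer]
        layers.append([chunk[3 * n:3 * n + 3] for n in range(n_qubits)])
    return layers


def upload_schedule(n_layers: int, repetitions: int) -> List[int]:
    """Chunk indices in upload order when the whole input is re-uploaded P times."""
    return [l for _ in range(repetitions) for l in range(n_layers)]


# Toy input: N=2 qubits, L=2 encoding layers, so 12 features.
features = [float(k) for k in range(12)]
print(split_into_layers(features, n_qubits=2, n_layers=2))
print(upload_schedule(n_layers=2, repetitions=3))
```

In this sketch the repetition count $P$ changes only the order in which chunks are fed to the circuit, not the chunks themselves. Each layer still sees the same $3N$ features on every repetition.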

Theorems & Definitions (46)

  • Theorem 3.1
  • Theorem 3.2
  • Proposition 3.3: Informal
  • Definition 2.1: Quantum Trace Distance
  • Definition 2.2: Quantum Fidelity
  • Definition 2.3: Quantum Affinity
  • Definition 2.4: Quantum Petz-Rényi-$\alpha$ Divergence
  • Definition 2.5: Quantum Relative Entropy
  • Lemma 2.6
  • proof
  • ...and 36 more
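
As a small worked example of the divergence listed above, assume that Definition 2.4 matches the standard Petz-Rényi form $D_\alpha(\rho\|\sigma)=\frac{1}{\alpha-1}\log_2\operatorname{Tr}[\rho^\alpha\sigma^{1-\alpha}]$; the definition itself is not shown on this page. For $\alpha=2$ and $\sigma=\rho_I=I/2^N$ this reduces to $D_2(\rho\|\rho_I)=N+\log_2\operatorname{Tr}[\rho^2]$. The sketch below computes that quantity from a density matrix given as nested lists; the function names are illustrative.

```python
import math
from typing import Sequence

Matrix = Sequence[Sequence[complex]]


def purity(rho: Matrix) -> float:
    """Tr[rho^2], computed as the sum over i, j of rho[i][j] * rho[j][i]."""
    dim = len(rho)
    return sum((rho[i][j] * rho[j][i]).real for i in range(dim) for j in range(dim))


def d2_to_maximally_mixed(rho: Matrix) -> float:
    """D_2(rho || I/d) = log2(d * Tr[rho^2]), where d = 2^N is the dimension."""
    dim = len(rho)
    return math.log2(dim * purity(rho))


# Single-qubit examples.
pure_zero = [[1, 0], [0, 0]]            # |0><0|
mixed = [[0.5, 0], [0, 0.5]]            # maximally mixed state I/2
print(d2_to_maximally_mixed(pure_zero))
print(d2_to_maximally_mixed(mixed))
```

Under this assumed definition, a pure state gives the largest possible value, $N$, and the maximally mixed state gives zero. The value $N$ is also what the bound $\log_2\bigl(1+(2^N-1)e^{-L\sigma^2}\bigr)$ evaluates to at $L=0$.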