Trade-off between Gradient Measurement Efficiency and Expressivity in Deep Quantum Neural Networks

Koki Chinzei, Shinichiro Yamano, Quoc Hoan Tran, Yasuhiro Endo, Hirotaka Oshima

TL;DR

This work establishes a fundamental trade-off between gradient measurement efficiency and expressivity in deep quantum neural networks, showing that higher expressivity (a larger dynamical Lie algebra) necessitates greater gradient measurement costs. The authors formalize gradient measurability via a dynamical Lie algebra (DLA) graph and derive two key inequalities, $\mathcal{X}_{\rm exp} \leq \frac{4^n}{\mathcal{F}_{\rm eff}} - \mathcal{F}_{\rm eff}$ and $\mathcal{X}_{\rm exp} \geq \mathcal{F}_{\rm eff}$, highlighting a fundamental limit on efficient gradient estimation. To approach this limit, they introduce the stabilizer-logical product ansatz (SLPA), a commuting-block circuit built from products of stabilizers and logical Pauli operators that saturates the trade-off upper bound by exploiting symmetry; the SLPA enables gradient estimation with only $2B$ (or $2B-1$) measurement types per block, greatly reducing sample complexity in practice. Numerical experiments on symmetric-function learning and quantum phase recognition demonstrate that the SLPA achieves high accuracy and trainability while dramatically lowering the required number of measurement shots compared with parameter-shift-based approaches, underscoring the practical impact of symmetry-aware, efficient gradient estimation for variational quantum algorithms.
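To make "simultaneously measurable gradient components" concrete, here is a minimal Python sketch (ours, not the paper's algorithm): it checks commutation of Pauli strings via the even/odd anticommutation rule and greedily partitions a hypothetical list of gradient-operator strings into commuting groups, each of which can be estimated from one measurement setting. The mean group size plays the role of $\mathcal{F}_{\rm eff}$ in this toy setting; the paper's gradient operators $\Gamma_j(\bm{\theta})$ are parameter-dependent and the paper's analysis is far more general.

```python
# Toy illustration (not the paper's algorithm): partition Pauli-string
# "gradient operators" into mutually commuting groups. Each group can be
# estimated from one measurement setting, so the mean number of operators
# per group mimics the gradient measurement efficiency F_eff.

def paulis_commute(p: str, q: str) -> bool:
    """Two Pauli strings commute iff they carry distinct non-identity
    Paulis on an even number of qubit positions."""
    anticommuting_sites = sum(
        1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b
    )
    return anticommuting_sites % 2 == 0

def greedy_commuting_groups(paulis):
    """Greedily assign each Pauli string to the first group whose members
    all commute with it (a heuristic, not an optimal grouping)."""
    groups = []
    for p in paulis:
        for group in groups:
            if all(paulis_commute(p, q) for q in group):
                group.append(p)
                break
        else:
            groups.append([p])
    return groups

# Hypothetical gradient-operator strings for a 3-qubit toy circuit.
gradient_ops = ["XXI", "YYI", "ZZI", "IXX", "IYY", "ZIX"]
groups = greedy_commuting_groups(gradient_ops)
print(groups)                           # commuting measurement groups
print(len(gradient_ops) / len(groups))  # toy analogue of F_eff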

Abstract

Quantum neural networks (QNNs) require an efficient training algorithm to achieve practical quantum advantages. A promising approach is gradient-based optimization, where gradients are estimated by quantum measurements. However, QNNs currently lack general quantum algorithms for efficiently measuring gradients, which limits their scalability. To elucidate the fundamental limits and potentials of efficient gradient estimation, we rigorously prove a trade-off between gradient measurement efficiency (the mean number of simultaneously measurable gradient components) and expressivity in deep QNNs. This trade-off indicates that more expressive QNNs require higher measurement costs per parameter for gradient estimation, while reducing QNN expressivity to suit a given task can increase gradient measurement efficiency. We further propose a general QNN ansatz called the stabilizer-logical product ansatz (SLPA), which achieves the trade-off upper bound by exploiting the symmetric structure of the quantum circuit. Numerical experiments show that the SLPA drastically reduces the sample complexity needed for training while maintaining accuracy and trainability compared to well-designed circuits based on the parameter-shift method.
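For context, a standard background fact (not a contribution of this paper) explains the baseline measurement cost: with the parameter-shift method, each gradient component requires its own pair of expectation-value measurements. For a rotation gate $U(\theta_j)=e^{-i\theta_j P_j/2}$ generated by a Pauli operator $P_j$ and a loss $f(\bm{\theta})=\langle O\rangle$,

$$\partial_{\theta_j} f(\bm{\theta}) = \frac{1}{2}\left[ f\!\left(\bm{\theta}+\tfrac{\pi}{2}\bm{e}_j\right) - f\!\left(\bm{\theta}-\tfrac{\pi}{2}\bm{e}_j\right) \right],$$

where $\bm{e}_j$ is the unit vector along the $j$-th parameter. Estimating all $L$ components therefore requires $2L$ distinct circuit settings; a gradient measurement efficiency $\mathcal{F}_{\rm eff}>1$ reduces this per-parameter cost by measuring several components at once.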

Paper Structure

This paper contains 24 sections, 21 theorems, 143 equations, and 11 figures.

Key Result

Theorem 1

In deep QNNs, gradient measurement efficiency and expressivity obey the following inequalities: $\mathcal{X}_{\rm exp} \leq \frac{4^n}{\mathcal{F}_{\rm eff}} - \mathcal{F}_{\rm eff}$ and $\mathcal{X}_{\rm exp} \geq \mathcal{F}_{\rm eff}$.
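A direct algebraic consequence of these two inequalities (derived here as a sanity check, not a separately stated result of the paper): combining $\mathcal{F}_{\rm eff} \leq \mathcal{X}_{\rm exp}$ with the upper bound gives

$$\mathcal{F}_{\rm eff} \leq \frac{4^n}{\mathcal{F}_{\rm eff}} - \mathcal{F}_{\rm eff} \;\Longrightarrow\; 2\mathcal{F}_{\rm eff}^2 \leq 4^n \;\Longrightarrow\; \mathcal{F}_{\rm eff} \leq 2^{\,n-1/2},$$

so the efficiency of any deep QNN is capped at $O(2^n)$ simultaneously measurable components, while near-maximal expressivity ($\mathcal{X}_{\rm exp} > 4^n/2$) forces $\mathcal{F}_{\rm eff} < 2$, i.e., essentially one gradient component per measurement, as in the parameter-shift method.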

Figures (11)

  • Figure 1: Overview of the two main results. (a) A trade-off relation between gradient measurement efficiency and expressivity. Any quantum model can only exist in the blue region. Gradient measurement efficiency $\mathcal{F}_{\rm eff}$ is defined as the mean number of simultaneously measurable components of the gradient, and expressivity $\mathcal{X}_{\rm exp}$ is the dimension of the dynamical Lie algebra of the parameterized quantum circuit. The red circles denote the SLPA, where $2^s$ gradient components can be measured simultaneously (i.e., $\mathcal{F}_{\rm eff}=2^s$ for an integer $s$), reaching the upper bound of the trade-off inequality. (b) The circuit structure of the SLPA. The generators of the SLPA are constructed by taking products of stabilizers and logical Pauli operators.
  • Figure 2: Quantum circuits used in the numerical experiments. The SLPA is constructed from the symmetric ansatz by taking products of the stabilizers and the generators of the symmetric ansatz (a toy Pauli-product sketch of this construction follows the figure list).
  • Figure 3: Gradient measurement efficiency. (a)--(c) Commutators between two gradient operators $\Gamma_j(\bm{\theta})$ and $\Gamma_k(\bm{\theta})$ in the SLPA, the symmetric ansatz, and the non-symmetric ansatz for $n=4$ qubits and $L=48$ parameters. The black and yellow regions represent $[\Gamma_j(\bm{\theta}),\Gamma_k(\bm{\theta})]=0$ and $[\Gamma_j(\bm{\theta}),\Gamma_k(\bm{\theta})]\neq 0$ for random $\bm{\theta}$, respectively. (d) Gradient measurement efficiency as the number of parameters $L$ is varied for $n=4$, computed by minimizing the number of simultaneously measurable sets of $\Gamma_j(\bm{\theta})$'s for random $\bm{\theta}$. The blue circles, orange squares, and green triangles are the results for the SLPA, the symmetric ansatz, and the non-symmetric ansatz, approaching four and one in the limit $L\to\infty$, respectively. The dashed gray lines represent the DLA dimension of each model, $\dim(\mathfrak{g})$.
  • Figure 4: Changes in losses during training. The horizontal axis is the cumulative number of measurement shots. The blue, orange, and green solid (dashed) lines represent the test (training) losses for the SLPA, symmetric ansatz (SA), and non-symmetric ansatz (NSA), respectively. The shaded areas are the maximum and minimum of the test losses for 20 sets of random initial parameters. The numbers of qubits and parameters are $n=4$ and $L=96$.
  • Figure 5: Summary of Lemmas 1--5. Whether two gradient operators $\Gamma_j$ and $\Gamma_k$ are simultaneously measurable is determined by the structural relations between the corresponding nodes $G_j, G_k\in\mathcal{G}$ and the observable $O$ in the DLA graph.
  • ...and 6 more figures
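The stabilizer-logical product construction shown in Figures 1(b) and 2 can be illustrated with a small sketch. This is a hypothetical 2-qubit example of ours, not the paper's circuits: each generator is the product of a stabilizer-group element and a logical Pauli operator, computed up to a global phase (which does not affect the rotation the generator produces).

```python
# Toy illustration of the SLPA-style generator construction (hypothetical
# 2-qubit example, not the paper's circuits): multiply a logical Pauli
# operator by every element of a stabilizer group, ignoring global phases,
# to obtain a set of mutually commuting circuit generators.

def pauli_mul(p: str, q: str) -> str:
    """Product of two Pauli strings, up to an overall +/-1 or +/-i phase."""
    out = []
    for a, b in zip(p, q):
        if a == "I":
            out.append(b)
        elif b == "I":
            out.append(a)
        elif a == b:
            out.append("I")
        else:
            # the product of two distinct non-identity Paulis is the third
            out.append(({"X", "Y", "Z"} - {a, b}).pop())
    return "".join(out)

# Stabilizer group of a 2-qubit repetition code and its logical X operator.
stabilizer_group = ["II", "ZZ"]
logical_x = "XX"
generators = [pauli_mul(s, logical_x) for s in stabilizer_group]
print(generators)  # ['XX', 'YY'] -- these commute, forming a commuting block
```

Because every stabilizer commutes with every other stabilizer and with the logical operator, any two such products commute, which is the commuting-block structure mentioned in the TL;DR; the actual SLPA construction in the paper is more general than this two-operator toy.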

Theorems & Definitions (38)

  • Theorem 1: Informal
  • Theorem 2: The formal version of Theorem 1
  • Definition 1: DLA-connectivity
  • Definition 2: Separability
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Lemma 4
  • Lemma 5
  • Definition 3: Path and distance
  • ...and 28 more