Quantum Shadow Gradient Descent for Variational Quantum Algorithms

Mohsen Heidari, Mobasshir A Naved, Zahra Honjani, Wenbo Xie, Arjun Jacob Grama, Wojciech Szpankowski

TL;DR

The paper addresses the high cost of gradient estimation in variational quantum algorithms (VQAs), which stems from state collapse and measurement incompatibility. It introduces quantum shadow gradient descent (QSGD), a one-shot method based on shadow tomography that estimates all gradient components from a single quantum sample per iteration and updates all parameters simultaneously. The authors derive the gradient expression, prove that the shadow-based gradient estimates are unbiased, and establish convergence bounds showing faster rates than parameter-shift and one-shot coordinate methods under k-locality conditions. Numerical experiments on QNN and entangled GHZ datasets corroborate these bounds. The results reduce the quantum sample complexity of training VQAs, which makes optimizing large, locally structured variational circuits on near-term quantum hardware more practical.

Abstract

Gradient-based optimizers have been proposed for training variational quantum circuits in settings such as quantum neural networks (QNNs). The task of gradient estimation, however, has proven to be challenging, primarily due to distinctive quantum features such as state collapse and measurement incompatibility. Conventional techniques, such as the parameter-shift rule, necessitate several fresh samples in each iteration to estimate the gradient due to the stochastic nature of state measurement. Owing to state collapse from measurement, the inability to reuse samples in subsequent iterations motivates a crucial inquiry into whether fundamentally more efficient approaches to sample utilization exist. In this paper, we affirm the feasibility of such efficiency enhancements through a novel procedure called quantum shadow gradient descent (QSGD), which uses a single sample per iteration to estimate all components of the gradient. Our approach is based on an adaptation of shadow tomography that significantly enhances sample efficiency. Through detailed theoretical analysis, we show that QSGD has a significantly faster convergence rate than existing methods under locality conditions. We present detailed numerical experiments supporting all of our theoretical claims.
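The single-sample idea rests on shadow tomography. The sketch below is not the paper's adapted procedure; it assumes the standard single-qubit classical-shadow construction with uniformly random Pauli-basis measurements, and all function names are illustrative. It shows that one measurement outcome yields a "snapshot" $3\,U^\dagger\ketbra{b}U - I$ whose average over bases and outcomes equals the state $\rho$, which is the kind of unbiasedness a shadow-based gradient estimate relies on.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S_DAG = np.array([[1, 0], [0, -1j]], dtype=complex)

# Unitaries that rotate the X, Y, Z eigenbases onto the computational basis.
BASIS_ROTATIONS = {"X": H, "Y": H @ S_DAG, "Z": I2}


def snapshot(U, b):
    """Inverse-channel snapshot 3 U^dagger |b><b| U - I for outcome b."""
    ket = U.conj().T[:, b].reshape(2, 1)  # U^dagger |b>
    return 3 * (ket @ ket.conj().T) - I2


def sample_snapshot(rho, rng):
    """Draw one random Pauli basis, measure once, return the snapshot."""
    U = BASIS_ROTATIONS["XYZ"[rng.integers(3)]]
    p0 = (U @ rho @ U.conj().T)[0, 0].real  # Born probability of outcome 0
    b = 0 if rng.random() < p0 else 1
    return snapshot(U, b)


def expected_snapshot(rho):
    """Exact average of the snapshot over the 3 bases and 2 outcomes."""
    total = np.zeros((2, 2), dtype=complex)
    for U in BASIS_ROTATIONS.values():
        rotated = U @ rho @ U.conj().T
        for b in (0, 1):
            total += rotated[b, b].real * snapshot(U, b) / 3
    return total


# Example state (an arbitrary pure state chosen for illustration).
psi = np.array([np.cos(0.3), np.exp(0.5j) * np.sin(0.3)])
rho = np.outer(psi, psi.conj())
print(np.allclose(expected_snapshot(rho), rho))
```

A single call to `sample_snapshot` consumes one copy of the state, yet the resulting matrix can be contracted with any observable afterwards; that reuse is what makes it possible to estimate many quantities from one sample.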

Paper Structure

This paper contains 20 sections, 12 theorems, 48 equations, 3 figures, 1 table, 1 algorithm.

Key Result

Lemma 1

Let $\rho_l^{\text{out}} = U_{\leq l}\ketbra{\phi} U_{\leq l}^\dagger$ denote the density operator of the output state at layer $l$ when the input is $\ket{\phi}$ with label $y$. Then the derivative of the loss is given by: where $[\cdot, \cdot]$ is the commutator operation.
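To see why a commutator shows up in a derivative of this kind, the sketch below checks a generic identity rather than the lemma's own expression, which is not reproduced here. It assumes a single-qubit Pauli rotation $U(\theta) = e^{-i\theta P/2}$ and an arbitrary observable $O$ (both illustrative choices), for which $\frac{d}{d\theta}\operatorname{tr}(O\rho_\theta) = \operatorname{tr}\!\big(O\,(-\tfrac{i}{2})[P, \rho_\theta]\big)$ with $\rho_\theta = U(\theta)\rho U(\theta)^\dagger$.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def rotation(pauli, theta):
    """exp(-i theta P / 2), using P @ P = I for a Pauli matrix P."""
    return np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * pauli


def expectation(theta, pauli, rho_in, observable):
    """tr(O rho_theta) for rho_theta = U(theta) rho_in U(theta)^dagger."""
    U = rotation(pauli, theta)
    return np.trace(observable @ U @ rho_in @ U.conj().T).real


def commutator_derivative(theta, pauli, rho_in, observable):
    """d/dtheta tr(O rho_theta) written as tr(O (-i/2) [P, rho_theta])."""
    U = rotation(pauli, theta)
    rho_out = U @ rho_in @ U.conj().T
    comm = pauli @ rho_out - rho_out @ pauli
    return np.trace(observable @ (-0.5j * comm)).real


# Illustrative setup: input |0><0|, rotation about X, observable Z.
rho0 = np.array([[1, 0], [0, 0]], dtype=complex)
theta, eps = 0.7, 1e-6
finite_diff = (expectation(theta + eps, X, rho0, Z)
               - expectation(theta - eps, X, rho0, Z)) / (2 * eps)
print(np.isclose(finite_diff, commutator_derivative(theta, X, rho0, Z)))
```

In this setup the expectation is $\cos\theta$, so both routes should agree with $-\sin\theta$; the commutator form needs only the output state at the layer where the parameter acts.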

Figures (3)

  • Figure 1: Circuit for measuring the partial derivative with respect to a parameter $a_{\mathbf{s}_l}$ appearing at layer $l$. Here $U_{\leq l}$ corresponds to the first $l$ layers of the ansatz, and $U_{>l}$ to the remaining layers. $X$ is the X-gate and $R_{\mathbf{s}_l}$ is the controlled rotation around Pauli $\sigma^{\mathbf{s}_l}$.
  • Figure 2: The QNN setup for Exp. 1 and 2.
  • Figure 3: Comparing the training accuracy in three experiments based on four methods: QSGD (this work), RQSGD, and the parameter shift rule gradient computation. The vertical axis is the training accuracy and the horizontal axis is the number of training samples. The cyan dashed line is the theoretical upper limit on the accuracy. All results were obtained with a learning rate optimized for each method by grid search in each of the three experimental settings.

Theorems & Definitions (14)

  • Lemma 1
  • Lemma 2
  • Theorem 3
  • Theorem 4
  • Lemma \ref{lem:loss derivative}
  • Lemma \ref{lem:gradient unbiased}
  • Theorem \ref{thm:shadow unbiased}
  • Remark 5
  • Theorem 6: Bottou2018
  • Corollary 7
  • ...and 4 more