Quantum Shadow Gradient Descent for Variational Quantum Algorithms
Mohsen Heidari, Mobasshir A Naved, Zahra Honjani, Wenbo Xie, Arjun Jacob Grama, Wojciech Szpankowski
TL;DR
Gradient estimation in variational quantum algorithms (VQAs) is costly because of state collapse and measurement incompatibility. The paper introduces quantum shadow gradient descent (QSGD), a one-shot method based on shadow tomography that estimates all gradient components from a single quantum sample per iteration and updates all parameters simultaneously. The authors derive the gradient expression, prove that the shadow-based gradient estimates are unbiased, and establish convergence bounds showing faster rates than parameter-shift and one-shot coordinate methods under a k-locality condition. Numerical experiments on QNN and entangled GHZ datasets corroborate the analysis. The results reduce the quantum sample complexity of training VQAs, making the optimization of large, locally structured variational circuits more practical on near-term quantum hardware.
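To make the unbiasedness idea concrete, the following is a minimal sketch of the standard single-qubit classical-shadow snapshot with random Pauli-basis measurements. It is not the paper's QSGD estimator, which adapts shadow tomography to gradients. The function names (`snapshot`, `sample_snapshot`, `expected_snapshot`) and the test state are illustrative assumptions. The sketch checks, by exact enumeration over bases and outcomes, that the average snapshot equals the underlying state.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S_DAG = np.array([[1, 0], [0, -1j]], dtype=complex)

# Unitaries that rotate the X, Y, and Z eigenbases onto the computational basis.
BASIS_UNITARIES = [H, H @ S_DAG, I2]

def snapshot(U, b):
    """Inverted-channel snapshot 3 * U^dagger |b><b| U - I for outcome b in {0, 1}."""
    ket = np.zeros((2, 1), dtype=complex)
    ket[b] = 1.0
    proj = U.conj().T @ (ket @ ket.conj().T) @ U
    return 3.0 * proj - I2

def sample_snapshot(rho, rng):
    """Consume one copy of rho: pick a random Pauli basis, measure, return the snapshot."""
    U = BASIS_UNITARIES[rng.integers(3)]
    p0 = (U @ rho @ U.conj().T)[0, 0].real  # probability of outcome 0
    b = 0 if rng.random() < p0 else 1
    return snapshot(U, b)

def expected_snapshot(rho):
    """Exact average of the snapshot over the three bases and both outcomes."""
    total = np.zeros((2, 2), dtype=complex)
    for U in BASIS_UNITARIES:
        rotated = U @ rho @ U.conj().T
        for b in (0, 1):
            total += (1.0 / 3.0) * rotated[b, b].real * snapshot(U, b)
    return total

# Illustrative pure state with nonzero X, Y, and Z components (an assumption).
theta, phi = 0.7, 0.4
psi = np.array([np.cos(theta / 2), np.exp(1j * phi) * np.sin(theta / 2)])
rho = np.outer(psi, psi.conj())

rng = np.random.default_rng(0)
one_shot = sample_snapshot(rho, rng)    # a single-sample, unit-trace estimate
is_unbiased = np.allclose(expected_snapshot(rho), rho)
```

A single snapshot is a noisy, generally non-physical matrix; only its expectation matches the state. That is the sense in which one sample can still yield an unbiased estimate.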
Abstract
Gradient-based optimizers have been proposed for training variational quantum circuits in settings such as quantum neural networks (QNNs). Gradient estimation, however, has proven challenging, primarily because of distinctive quantum features such as state collapse and measurement incompatibility. Conventional techniques, such as the parameter-shift rule, require several fresh samples in each iteration to estimate the gradient because state measurement is stochastic. Since measurement collapses the state, samples cannot be reused in subsequent iterations, which raises the question of whether fundamentally more efficient approaches to sample utilization exist. In this paper, we show that such efficiency gains are possible through a novel procedure called quantum shadow gradient descent (QSGD), which uses a single sample per iteration to estimate all components of the gradient. Our approach is based on an adaptation of shadow tomography that significantly enhances sample efficiency. Through detailed theoretical analysis, we show that QSGD converges significantly faster than existing methods under locality conditions. We present detailed numerical experiments supporting all of our theoretical claims.
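As a point of comparison, the sample cost of the parameter-shift rule can be illustrated with a small simulation. This is a sketch under simplifying assumptions: a single-qubit circuit RY(theta)|0> measured in the Z basis, so the expectation is cos(theta) and the exact derivative is -sin(theta). The helper names (`sample_z`, `parameter_shift_grad`) and the shot count are hypothetical. Each shifted evaluation draws its own fresh batch of measurement outcomes, and the cost grows with the number of parameters.

```python
import numpy as np

def sample_z(theta, shots, rng):
    """Simulate `shots` Z-basis measurements of RY(theta)|0>; outcomes are +1 or -1."""
    p_plus = np.cos(theta / 2.0) ** 2  # probability of outcome +1
    return np.where(rng.random(shots) < p_plus, 1.0, -1.0)

def parameter_shift_grad(theta, shots, rng):
    """Estimate d<Z>/d(theta) from two shifted circuits, each with fresh samples."""
    plus = sample_z(theta + np.pi / 2, shots, rng).mean()
    minus = sample_z(theta - np.pi / 2, shots, rng).mean()
    samples_used = 2 * shots  # two circuit evaluations per parameter
    return 0.5 * (plus - minus), samples_used

rng = np.random.default_rng(0)
theta = 0.7
grad, samples_used = parameter_shift_grad(theta, shots=20000, rng=rng)
exact = -np.sin(theta)  # analytic derivative of cos(theta)
```

For a circuit with p parameters, this pattern would use 2p shifted evaluations per iteration, which is the per-iteration overhead that a single-sample scheme aims to avoid.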
