Bayesian Parameter Shift Rule in Variational Quantum Eigensolvers

Samuele Pedrielli, Christopher J. Anders, Lena Funcke, Karl Jansen, Kim A. Nicoli, Shinichi Nakajima

TL;DR

This work introduces Bayesian PSR, a GP-based framework to estimate VQE gradients with uncertainty, enabling flexible observation locations and reuse of past data. By coupling a physics-informed VQE kernel with derivative GP regression, the method provides analytic gradient estimates and posterior uncertainty, which are exploited in GradCoRe to adaptively allocate quantum-shot resources. The authors develop Bayes-SGD and GradCoRe, deriving theoretical connections to classical PSRs and demonstrating that the approach accelerates SGD and outperforms state-of-the-art NFT-based and BO-based strategies on Ising-like Hamiltonians. The results suggest that incorporating uncertainty-aware Bayesian surrogates can substantially reduce quantum hardware costs while preserving optimization accuracy, with implications for scalable VQEs on NISQ devices.

Abstract

Parameter shift rules (PSRs) are key techniques for efficient gradient estimation in variational quantum eigensolvers (VQEs). In this paper, we propose a Bayesian variant of the PSR, in which Gaussian processes with appropriate kernels are used to estimate the gradient of the VQE objective. Our Bayesian PSR offers flexible gradient estimation, with uncertainty information, from observations at arbitrary locations, and it reduces to the generalized PSR in special cases. In stochastic gradient descent (SGD), the flexibility of Bayesian PSR allows observations from previous steps to be reused, which accelerates the optimization process. Furthermore, access to the posterior uncertainty, along with our proposed notion of gradient confident region (GradCoRe), enables us to minimize the observation costs in each SGD step. Our numerical experiments show that VQE optimization with Bayesian PSR and GradCoRe significantly accelerates SGD and outperforms state-of-the-art methods, including sequential minimal optimization.
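For readers unfamiliar with the classical rule that the abstract builds on, the following is a minimal sketch of the standard (non-Bayesian) parameter shift rule. It assumes each parameter enters the energy as a first-order trigonometric function, as is the case when each parameter is the angle of a single Pauli rotation gate. The function names and the toy objective `toy_energy` are hypothetical stand-ins for a real circuit evaluation and are not taken from the paper.

```python
import numpy as np


def parameter_shift_gradient(energy, x, shift=np.pi / 2):
    """Estimate the gradient of `energy` at `x` with two evaluations per parameter.

    Assumes `energy` is first-order trigonometric in each coordinate, so that
    (f(x + a e_d) - f(x - a e_d)) / (2 sin a) equals the d-th partial derivative.
    """
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for d in range(x.size):
        step = np.zeros_like(x)
        step[d] = shift
        grad[d] = (energy(x + step) - energy(x - step)) / (2.0 * np.sin(shift))
    return grad


def toy_energy(x):
    """Hypothetical noiseless objective, first-order trigonometric in x[0] and x[1]."""
    return np.cos(x[0]) * np.sin(x[1]) + 0.5 * np.cos(x[1])


def toy_energy_gradient(x):
    """Analytic gradient of `toy_energy`, used only for comparison."""
    return np.array([
        -np.sin(x[0]) * np.sin(x[1]),
        np.cos(x[0]) * np.cos(x[1]) - 0.5 * np.sin(x[1]),
    ])


x0 = np.array([0.3, 1.1])
psr_grad = parameter_shift_gradient(toy_energy, x0)
print("PSR gradient:     ", psr_grad)
print("analytic gradient:", toy_energy_gradient(x0))
```

On real hardware each call to `energy` would be a noisy estimate from a finite number of measurement shots, which is the setting where reusing observations and tracking uncertainty becomes relevant.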

Paper Structure

This paper contains 32 sections, 5 theorems, 62 equations, 8 figures, 2 tables, 2 algorithms.

Key Result

Theorem 3.1

For any $x' \in [0, 2\pi)^D$ and $d = 1, \ldots, D$, the mean and variance of the derivative GP prediction, given observations $\boldsymbol{y} = (y_0, \ldots, y_{2V_d -1})^{\top} \in \mathbb{R}^{2 V_d}$ at $2V_d$ equidistant training points $\boldsymbol{X} = (\boldsymbol{x}_0, \ldots, \boldsymbol{x}

Figures (8)

  • Figure 1: Illustration of our gradient confident region (GradCoRe) approach. Our goal is to minimize the true energy $f^*(\boldsymbol{x})$ over the set of parameters $\boldsymbol{x} \in [0, 2 \pi)^D$, where we use a GP surrogate $f(\boldsymbol{x})$ to approximate $f^*(\boldsymbol{x})$. Observing $f^*$ at points $\boldsymbol{x}_{-}$ and $\boldsymbol{x}_{+}$ (green circles) along the $d$-th direction (solid horizontal line) decreases the uncertainty (dashed curves) not only for predicting $f(\boldsymbol{x}_{\pm})$, but also for predicting $\partial_d f(\widehat{\boldsymbol{x}}^{t-1})$, so that the current optimal point $\widehat{\boldsymbol{x}}^{t-1}$ falls within the GradCoRe (blue square). Our GradCoRe-based SGD uses the minimum number of measurement shots needed to achieve the required gradient estimation accuracy in each iteration, and thus minimizes the total observation costs over the optimization process.
  • Figure 2: Illustration of the behavior of the Bayesian PSR when $V_d=1$ (left) and when $V_d=2$ (middle). Bayesian PSR prediction (red) coincides with general PSR (green cross) for the designed equidistant observations (magenta crosses). The right plot visualizes the variance (Eq. \ref{eq:DGPPredictionVar}) of the derivative GP prediction at $\boldsymbol{x}'$, as a function of the shift $\alpha$ of observations when $V_d=1$. Although the optimum is at $\alpha = \frac{\pi}{2}$, the dependence is weak. For all panels, the noise and kernel parameters are set to $\sigma^2 = 0.01, \gamma^2 = 9, \sigma_0^2 = 100$.
  • Figure 3: Comparison between SGD with PSR (dashed curves) and SGD with Bayesian PSR (solid curves), as well as GradCoRe (red solid curve), on the Ising Hamiltonian with an $(L=3)$-layered $(Q=5)$-qubit quantum circuit. The energy (left) and fidelity (right) are plotted as functions of the cumulative $N_{\mathrm{shots}}$, i.e., the total number of measurement shots. Except for GradCoRe, which uses the adaptive shots strategy, the number of shots per observation is set to $N_{\mathrm{shots}} = 128$ (blue), $256$ (green), $512$ (orange), and $1024$ (purple).
  • Figure 4: Energy (left) and fidelity (right) as functions of the cumulative number of measurement shots for the Ising Hamiltonian with an $(L=3)$-layered $(Q=5)$-qubit quantum circuit. The curves correspond to SGLBO (blue), Bayes-NFT (green), EMICoRe (orange), SubsCoRe (purple), and our proposed GradCoRe (red).
  • Figure 5: Gradient estimation error by PSR (dashed curve) and Bayesian PSR (solid curve) for $N_{\mathrm{shots}} = 1024$, evaluated by the L2-distance between the estimated gradient $\widetilde{\boldsymbol{\mu}}(\widehat{\boldsymbol{x}})$ and the true gradient $\boldsymbol{g}^*(\widehat{\boldsymbol{x}})$ (computed by the PSR with simulated noiseless measurements).
  • ...and 3 more figures
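
The behavior described in the Figure 2 caption for $V_d = 1$ can be reproduced with a small one-dimensional derivative GP. This is only a sketch under stated assumptions: this page does not give the kernel's form, so the first-order trigonometric kernel $k(x, x') = \sigma_0^2 \, (\gamma^2 + 2\cos(x - x')) / (\gamma^2 + 2)$ used below is an assumption chosen for illustration, and all function names are hypothetical. The noise and kernel parameters are the ones quoted in the caption.

```python
import numpy as np

SIGMA2, GAMMA2, SIGMA0_2 = 0.01, 9.0, 100.0  # values quoted in the Figure 2 caption


def kernel(xa, xb):
    """Assumed 1D first-order trigonometric kernel (not stated on this page)."""
    diff = xa[:, None] - xb[None, :]
    return SIGMA0_2 * (GAMMA2 + 2.0 * np.cos(diff)) / (GAMMA2 + 2.0)


def kernel_dxa(xa, xb):
    """Derivative of `kernel` with respect to its first argument."""
    diff = xa[:, None] - xb[None, :]
    return -2.0 * SIGMA0_2 * np.sin(diff) / (GAMMA2 + 2.0)


def derivative_gp(x_query, x_obs, y_obs):
    """Posterior mean and variance of df/dx at `x_query` under a zero-mean GP prior."""
    K = kernel(x_obs, x_obs) + SIGMA2 * np.eye(x_obs.size)
    k_prime = kernel_dxa(np.array([x_query]), x_obs)[0]
    prior_var = 2.0 * SIGMA0_2 / (GAMMA2 + 2.0)  # d^2 k / (dx dx') evaluated at x = x'
    mean = k_prime @ np.linalg.solve(K, y_obs)
    var = prior_var - k_prime @ np.linalg.solve(K, k_prime)
    return mean, var


def f(x):
    """Hypothetical noiseless first-order trigonometric objective."""
    return 0.7 * np.cos(x) - 1.3 * np.sin(x) + 0.4


x_query = 0.9
x_obs = np.array([x_query + np.pi / 2, x_query - np.pi / 2])
y_obs = f(x_obs)

gp_mean, gp_var = derivative_gp(x_query, x_obs, y_obs)
psr_estimate = (y_obs[0] - y_obs[1]) / 2.0  # classical two-point PSR

# Posterior variance of the derivative as a function of the observation shift alpha.
alphas = np.linspace(0.2, np.pi - 0.2, 61)
variances = np.array([
    derivative_gp(x_query, np.array([x_query + a, x_query - a]), np.zeros(2))[1]
    for a in alphas
])

print("GP derivative mean:", gp_mean, " PSR estimate:", psr_estimate)
print("shift with smallest variance:", alphas[np.argmin(variances)])
```

Under these assumptions the GP mean differs from the PSR estimate only by a factor that approaches one as the noise variance becomes small relative to the kernel scale, and the posterior variance depends on the observations only through their locations, which is why it can be evaluated for each shift before any measurement is taken.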

Theorems & Definitions (5)

  • Theorem 3.1
  • Theorem 3.2
  • Theorem 3.1
  • Corollary 3.2
  • Corollary 3.3