Denoising Gradient Descent in Variational Quantum Algorithms

Lars Simon; Holger Eble; Hagen-Henrik Kowalski; Manuel Radons

TL;DR

This paper introduces an algorithm that mitigates the adverse effects of noise on gradient descent in variational quantum algorithms by computing a \emph{regularized} local classical approximation to the objective function at every gradient descent step.

Abstract

In this article, we introduce an algorithm for mitigating the adverse effects of noise on gradient descent in variational quantum algorithms. This is accomplished by computing a \emph{regularized} local classical approximation to the objective function at every gradient descent step. The computational overhead of our algorithm is entirely classical, i.e., the number of circuit evaluations is exactly the same as when carrying out gradient descent using the parameter-shift rules. We empirically demonstrate the advantages offered by our algorithm on randomized parametrized quantum circuits.
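For orientation, the parameter-shift baseline mentioned in the abstract can be sketched as follows. This is only a minimal illustration, not the denoising algorithm itself. It assumes every parametrized gate has the form $\exp(-i\theta_j G_j/2)$ with $G_j$ a Pauli string, so that shifts of $\pm\pi/2$ apply; `toy_objective` is a hypothetical classical stand-in for a circuit expectation value, and all function names are chosen here for illustration.

```python
import math


def parameter_shift_gradient(f, theta):
    """Gradient estimate using two evaluations of f per parameter.

    Assumes f depends on each theta[j] as a*cos(theta[j]) + b*sin(theta[j]) + c,
    which holds for gates exp(-i * theta[j] * G / 2) with G a Pauli string.
    """
    grad = []
    for j in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[j] += math.pi / 2
        minus[j] -= math.pi / 2
        grad.append(0.5 * (f(plus) - f(minus)))
    return grad


def toy_objective(theta):
    # Hypothetical stand-in for an expectation value; not taken from the paper.
    return math.cos(theta[0]) * math.cos(theta[1])


theta = [0.3, 1.1]
grad = parameter_shift_gradient(toy_objective, theta)

# Analytic gradient of the toy objective, for comparison.
exact = [
    -math.sin(theta[0]) * math.cos(theta[1]),
    -math.cos(theta[0]) * math.sin(theta[1]),
]
```

For $m$ parameters this uses $2m$ evaluations of the objective, which is the circuit-evaluation budget the abstract refers to. On hardware each evaluation would be noisy, so the estimate would deviate from `exact`.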

Paper Structure (12 sections, 19 equations, 4 figures, 1 algorithm)

Introduction
Acknowledgement
The Algorithm
The Setting
Denoised Gradient Descent Algorithm
Experiments
Alignment with Exact Gradient Vector
Measurement Shot Noise
Quantum Hardware Noise
Descent of Objective Function
Discussion
Conclusion and Outlook

Figures (4)

Figure 1: The random circuits used in the experiments in Section \ref{sec:experiments}. Here, $C_1 , \dots , C_{m+1}$ are randomly sampled just like the individual layers in the quantum volume test \cite{quantum_volume}. Moreover, $G_1 ,\dots , G_m$ are randomly sampled non-identity Pauli strings and the measurement observable is always $\mathcal{M}=Z^{\otimes n}$. For a more rigorous description of these circuits, see the beginning of Section \ref{sec:experiments}.
Figure 2: We investigated the effect that the choice of $\ell$ has on the ability of Algorithm \ref{algorithm:denoised_gradient_descent} to mitigate measurement shot noise. To that end, for each $\ell$, we randomly sampled $N=500$ circuits and points in parameter space and computed both a noisy and a denoised gradient (obtained from Algorithm \ref{algorithm:denoised_gradient_descent}). Subsequently, both of them were compared to the exact gradient (computed via statevector simulation): for each of the $N=500$ samples we plot a point in the $x$-$y$-plane, whose $x$- and $y$-coordinates are the cosine similarity of the exact gradient to the denoised gradient and to the noisy gradient, respectively. Accordingly, points below the diagonal (red) correspond to outcomes where the denoised gradient obtained from Algorithm \ref{algorithm:denoised_gradient_descent} outperformed the noisy gradient; points on or above the diagonal (blue) correspond to outcomes where this was not the case. For more details, see Section \ref{subsubsec:shot_noise}.
Figure 3: We investigated the ability of Algorithm \ref{algorithm:denoised_gradient_descent} to mitigate the effect of quantum hardware noise. To that end, for each quantum backend, we randomly sampled $N=250$ circuits and points in parameter space and computed both a noisy and a denoised gradient (obtained from Algorithm \ref{algorithm:denoised_gradient_descent}). Subsequently, both of them were compared to the exact gradient (computed via statevector simulation): for each of the $N=250$ samples we plot a point in the $x$-$y$-plane, whose $x$- and $y$-coordinates are the cosine similarity of the exact gradient to the denoised gradient and to the noisy gradient, respectively. Accordingly, points below the diagonal (red) correspond to outcomes where the denoised gradient obtained from Algorithm \ref{algorithm:denoised_gradient_descent} outperformed the noisy gradient; points on or above the diagonal (blue) correspond to outcomes where this was not the case. The AerSimulator backend (without noise model) was included in order to demonstrate that the effect of measurement shot noise is negligible in this experiment. For clarity of visual presentation, we only show points for which both the $x$- and the $y$-coordinate are $\geq 0.6$ (all points which are not shown are red, i.e., correspond to outcomes where the denoised gradient outperformed the noisy gradient). For more details, see Section \ref{subsubsec:fake_devices}.
Figure 4: We executed Algorithm \ref{algorithm:denoised_gradient_descent} on several (simulated) quantum hardware backends, using several values for the regularization hyperparameter $\lambda$. The circuit and the initial point in parameter space were sampled randomly. For the sake of comparison, we also executed exact gradient descent (using statevector simulation) and noisy gradient descent using the parameter-shift rules. Algorithm \ref{algorithm:denoised_gradient_descent} was executed $N=100$ times for each combination of $\lambda$ and quantum hardware backend. Noisy gradient descent was executed $N=100$ times for each quantum hardware backend. Each resulting family of $N=100$ curves was averaged; the standard deviation is indicated in some of the plots. For more details, see Section \ref{subsec:descent_objective}.
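
The comparison plotted in Figures 2 and 3 can be sketched as follows. This is a minimal illustration of the metric only: the function names and the example vectors are hypothetical, the gradients are assumed to be nonzero, and nothing here reproduces the paper's data.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two nonzero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def scatter_point(exact, denoised, noisy):
    """Return (x, y, color) for one sample, following the figure captions.

    x: similarity of the exact gradient to the denoised gradient.
    y: similarity of the exact gradient to the noisy gradient.
    Points strictly below the diagonal (y < x) are red; all others are blue.
    """
    x = cosine_similarity(exact, denoised)
    y = cosine_similarity(exact, noisy)
    color = "red" if y < x else "blue"
    return x, y, color


# Made-up vectors for illustration only.
exact = [1.0, 0.0]
denoised = [0.9, 0.1]
noisy = [0.5, 0.5]
x, y, color = scatter_point(exact, denoised, noisy)
```

In this example the denoised vector is better aligned with the exact gradient than the noisy one, so the point falls below the diagonal and is colored red.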

Theorems & Definitions (1)

Remark 1
