Stochastic noise can be helpful for variational quantum algorithms

Junyu Liu; Frederik Wilde; Antonio Anna Mele; Xin Jin; Liang Jiang; Jens Eisert

TL;DR

The stochasticity that is naturally present in variational quantum algorithms, such as shot noise, can help gradient descent avoid strict saddle points. The paper supports this with convergence guarantees, numerical simulations, and experiments on quantum hardware.

Abstract

Saddle points constitute a crucial challenge for first-order gradient descent algorithms. In classical machine learning, they are avoided, for example, by means of stochastic gradient descent methods. In this work, we provide evidence that the saddle-point problem can be naturally avoided in variational quantum algorithms by exploiting the presence of stochasticity. We prove convergence guarantees and present practical examples in numerical simulations and on quantum hardware. We argue that the natural stochasticity of variational algorithms can be beneficial for avoiding strict saddle points, i.e., those saddle points with at least one negative Hessian eigenvalue. The insight that some level of shot noise could help is expected to add a new perspective on near-term variational quantum algorithms.
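
The mechanism described in the abstract can be illustrated with a minimal classical sketch. Everything in it is an assumption made for illustration: the toy loss, the function names `loss`, `grad`, and `descend`, and the Gaussian gradient noise that stands in for shot noise. It is not the paper's circuit or experiment. The toy loss has a strict saddle at $(0, 0)$ and minima at $(0, \pm 1)$; noiseless gradient descent started on the line $y = 0$ slides into the saddle, while a noisy gradient lets it leave.

```python
import random

def loss(x, y):
    # Hypothetical toy landscape: strict saddle at (0, 0), minima at (0, +1) and (0, -1).
    return 0.5 * x**2 + 0.25 * y**4 - 0.5 * y**2

def grad(x, y):
    # Exact gradient of the toy loss.
    return x, y**3 - y

def descend(steps, lr, noise_std, seed=0):
    # Gradient descent with zero-mean Gaussian noise added to each gradient component.
    rng = random.Random(seed)
    x, y = 1.0, 0.0  # y = 0: noiseless descent from here converges to the saddle
    for _ in range(steps):
        gx, gy = grad(x, y)
        gx += rng.gauss(0.0, noise_std)
        gy += rng.gauss(0.0, noise_std)
        x -= lr * gx
        y -= lr * gy
    return x, y

noiseless = descend(steps=2000, lr=0.1, noise_std=0.0)
noisy = descend(steps=2000, lr=0.1, noise_std=0.1)
print("noiseless end point:", noiseless, "loss:", loss(*noiseless))
print("noisy end point:    ", noisy, "loss:", loss(*noisy))
```

In this sketch the noiseless run ends at the saddle with loss close to 0, whereas the noisy run settles near one of the two minima with loss close to $-0.25$. The noise level and learning rate are arbitrary choices for the toy problem.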

Paper Structure (9 sections, 7 theorems, 85 equations, 14 figures)

Strong smoothness and Lipschitz-Hessian property
Discussion on more general noise
Additional numerical results
Analytic heuristics
Brownian motion and Pólya's constant
Guessing $1/\epsilon^2$ by dimensional analysis
Large-width limit
Critical noise from random walks
Phenomenological critical noise

Key Result

Theorem 8

Given a $\beta$-strongly smooth function $\mathcal{L}(\cdot)$, for any $\epsilon>0$, if we set the learning rate as $\eta=1 / \beta$, then the number of iterations required by the gradient descent algorithm such that it will visit an $\epsilon$-approximate stationary point is $\mathcal{O}\left(\beta\,(\mathcal{L}(\mathbf{\theta}_{0})-\mathcal{L}^\star)/\epsilon^{2}\right)$, where $\mathbf{\theta}_{0}$ is the initial point and $\mathcal{L}^\star$ is the value of $\mathcal{L}$ computed in the glo
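
The $1/\epsilon^2$ scaling of a bound of this type can be recovered from the standard descent-lemma argument. The sketch below assumes that $\beta$-strong smoothness means the gradient of $\mathcal{L}$ is $\beta$-Lipschitz and that $\mathcal{L}^\star$ lower-bounds $\mathcal{L}$. The constant factor is the one produced by this argument and is not taken from the paper's statement.

```latex
% Gradient descent step with learning rate \eta = 1/\beta:
%   \theta_{t+1} = \theta_t - \frac{1}{\beta} \nabla\mathcal{L}(\theta_t).
\begin{align}
\mathcal{L}(\theta_{t+1})
  &\le \mathcal{L}(\theta_t)
     + \nabla\mathcal{L}(\theta_t)^{\top}(\theta_{t+1}-\theta_t)
     + \frac{\beta}{2}\,\lVert \theta_{t+1}-\theta_t \rVert^{2} \\
  &= \mathcal{L}(\theta_t) - \frac{1}{2\beta}\,\lVert \nabla\mathcal{L}(\theta_t) \rVert^{2}.
\end{align}
% If \lVert \nabla\mathcal{L}(\theta_t) \rVert > \epsilon for all t = 0, \dots, T-1,
% summing the per-step decrease gives
\begin{equation}
\mathcal{L}^\star \le \mathcal{L}(\theta_T)
  < \mathcal{L}(\theta_0) - \frac{T\,\epsilon^{2}}{2\beta}
\quad\Longrightarrow\quad
T < \frac{2\beta\,\bigl(\mathcal{L}(\theta_0)-\mathcal{L}^\star\bigr)}{\epsilon^{2}}.
\end{equation}
```

Under these assumptions, an $\epsilon$-approximate stationary point is visited within a number of iterations proportional to $\beta\,(\mathcal{L}(\theta_0)-\mathcal{L}^\star)/\epsilon^{2}$.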

Figures (14)

Figure 1: Stochasticity in variational quantum algorithms can help in avoiding (strict) saddle points.
Figure 2: Comparison of the loss evolution with and without noise. The noise is added manually and drawn from Gaussian distributions, and the initial conditions are kept the same. (a) Four different values of the standard deviation $r$. (b) The noiseless case and the noisy case with noise standard deviation $r=0.1$.
Figure 3: Performance, quantified by $1/(\mathcal{L}-\mathcal{L}_\text{opt})$, against the size $r$ of the (classical Gaussian) noise.
Figure 4: Saddle-point avoidance from quantum noise. We prepare 30 instances starting from the same initial condition. When the noise level is small (shot number 1000; (a) measurement noise only, (b) including device noise), most trajectories cannot jump out of the saddle point. When the noise level is larger (shot number 70; (c) measurement noise only, (d) including device noise), trajectories have some probability of jumping towards the global minimum.
Figure 5: Comparison of the loss evolution with and without noise for the hydrogen VQE. The noise is drawn manually from Gaussian distributions with standard deviation $0.2$, and the initial conditions are kept the same. We compare the noiseless case, the noisy case, and the exact solution.
...and 9 more figures

Theorems & Definitions (22)

Definition 1: $L$-Lipschitz function
Definition 2: $\beta$-strong smoothness
Definition 3: Stationary point
Definition 4: $\epsilon$-approximate stationary point
Definition 5: Local minimum, local maximum, and saddle point
Definition 6: $\rho$-Lipschitz Hessian
Definition 7: Gradient descent
Theorem 8: Gradient descent complexity
Definition 9: Strict saddle point
Definition 10: Second-order stationary point
...and 12 more
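
The notions named in Definitions 3 to 5 and 9 can be checked numerically on a toy function. The sketch below assumes the usual conventions: a point is an $\epsilon$-approximate stationary point if its gradient norm is at most $\epsilon$, and a saddle is strict if the Hessian has at least one negative eigenvalue. The helper names `toy`, `grad_norm_2d`, `hessian_eigs_2d`, and `classify`, the finite-difference steps, and the tolerance are hypothetical choices and are not taken from the paper.

```python
import math

def toy(x, y):
    # Hypothetical 2D loss with a strict saddle at (0, 0) and minima at (0, +1), (0, -1).
    return 0.5 * x**2 + 0.25 * y**4 - 0.5 * y**2

def grad_norm_2d(f, x, y, h=1e-6):
    # Central finite-difference estimate of the gradient norm.
    gx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    gy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return math.hypot(gx, gy)

def hessian_eigs_2d(f, x, y, h=1e-4):
    # Central finite-difference Hessian, then closed-form eigenvalues of a symmetric 2x2 matrix.
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    mean = 0.5 * (fxx + fyy)
    radius = math.hypot(0.5 * (fxx - fyy), fxy)
    return mean - radius, mean + radius

def classify(f, x, y, eps=1e-3, tol=1e-3):
    # Labels a point using the gradient norm first, then the signs of the Hessian eigenvalues.
    if grad_norm_2d(f, x, y) > eps:
        return "not stationary"
    lam_min, lam_max = hessian_eigs_2d(f, x, y)
    if lam_min > tol:
        return "local minimum"
    if lam_max < -tol:
        return "local maximum"
    if lam_min < -tol and lam_max > tol:
        return "strict saddle"
    return "inconclusive"  # a near-zero eigenvalue; second-order test cannot decide

for point in [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]:
    print(point, classify(toy, *point))
```

The closed-form eigenvalue formula keeps the example dependency-free; for more than two parameters a numerical eigensolver would be the natural replacement.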
