Stochastic noise can be helpful for variational quantum algorithms

Junyu Liu, Frederik Wilde, Antonio Anna Mele, Xin Jin, Liang Jiang, Jens Eisert

TL;DR

Evidence is provided that the saddle-points problem can be naturally avoided in variational quantum algorithms by exploiting the presence of stochasticity, and it is argued that the natural stochasticity of variational algorithms can be beneficial for avoiding strict saddle points.

Abstract

Saddle points constitute a crucial challenge for first-order gradient descent algorithms. In notions of classical machine learning, they are avoided for example by means of stochastic gradient descent methods. In this work, we provide evidence that the saddle points problem can be naturally avoided in variational quantum algorithms by exploiting the presence of stochasticity. We prove convergence guarantees and present practical examples in numerical simulations and on quantum hardware. We argue that the natural stochasticity of variational algorithms can be beneficial for avoiding strict saddle points, i.e., those saddle points with at least one negative Hessian eigenvalue. This insight that some levels of shot noise could help is expected to add a new perspective to notions of near-term variational quantum algorithms.
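
The mechanism described in the abstract can be illustrated with a small classical sketch. The toy loss $L(x, y) = x^2 + (y^2 - 1)^2/4$, the step size, and the noise level are all assumptions chosen for illustration and are not taken from the paper. The loss has a strict saddle at the origin (Hessian eigenvalues $2$ and $-1$) and global minima at $(0, \pm 1)$. Gaussian noise added to the gradient stands in for the stochasticity of a variational quantum algorithm.

```python
import random

def loss(x, y):
    """Toy loss (an assumption for illustration): strict saddle at (0, 0), minima at (0, +/-1)."""
    return x * x + (y * y - 1.0) ** 2 / 4.0

def grad(x, y):
    """Exact gradient of the toy loss."""
    return 2.0 * x, y * (y * y - 1.0)

def descend(noise_std, steps=500, eta=0.1, seed=0):
    """Gradient descent from (1, 0) with Gaussian noise added to every gradient component."""
    rng = random.Random(seed)
    x, y = 1.0, 0.0  # the start lies on the line y = 0, which flows exactly into the saddle
    for _ in range(steps):
        gx, gy = grad(x, y)
        gx += rng.gauss(0.0, noise_std)
        gy += rng.gauss(0.0, noise_std)
        x -= eta * gx
        y -= eta * gy
    return x, y

x_exact, y_exact = descend(noise_std=0.0)   # noiseless: y never leaves 0
x_noisy, y_noisy = descend(noise_std=0.05)  # noisy: y is pushed off the unstable direction
print("noiseless final loss:", loss(x_exact, y_exact))
print("noisy final loss:    ", loss(x_noisy, y_noisy))
```

In this sketch the noiseless run converges to the saddle value $1/4$, because the gradient along $y$ vanishes identically on the line $y = 0$. The noisy run is displaced along the negative-curvature direction and settles near one of the two minima.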

Paper Structure (9 sections, 7 theorems, 85 equations, 14 figures)

Key Result

Theorem 8

Given a $\beta$-strongly smooth function $\mathcal{L}(\cdot)$ and any $\epsilon>0$, if the learning rate is set to $\eta=1/\beta$, then the number of iterations the gradient descent algorithm requires to visit an $\epsilon$-approximate stationary point is given by an expression (not reproduced in this listing), where $\boldsymbol{\theta}_{0}$ is the initial point and $\mathcal{L}^\star$ is the value of $\mathcal{L}$ at the global minimum.
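
For orientation, the standard descent-lemma argument yields a bound of the following form. This is a sketch, not the paper's own statement: it assumes that an $\epsilon$-approximate stationary point means $\|\nabla\mathcal{L}(\boldsymbol{\theta})\| \le \epsilon$, and the constant in the paper may differ. With $\eta = 1/\beta$, $\beta$-smoothness gives a guaranteed decrease per step:

$$\mathcal{L}(\boldsymbol{\theta}_{t+1}) \le \mathcal{L}(\boldsymbol{\theta}_{t}) - \eta\left(1 - \frac{\beta\eta}{2}\right)\|\nabla\mathcal{L}(\boldsymbol{\theta}_{t})\|^{2} = \mathcal{L}(\boldsymbol{\theta}_{t}) - \frac{1}{2\beta}\|\nabla\mathcal{L}(\boldsymbol{\theta}_{t})\|^{2}.$$

If none of the iterates $\boldsymbol{\theta}_{0}, \dots, \boldsymbol{\theta}_{T-1}$ is $\epsilon$-approximately stationary, then summing over these $T$ steps gives

$$\mathcal{L}^\star \le \mathcal{L}(\boldsymbol{\theta}_{T}) < \mathcal{L}(\boldsymbol{\theta}_{0}) - \frac{T\epsilon^{2}}{2\beta} \quad\Longrightarrow\quad T < \frac{2\beta\left(\mathcal{L}(\boldsymbol{\theta}_{0}) - \mathcal{L}^\star\right)}{\epsilon^{2}}.$$

Under these assumptions the iteration count therefore scales as $\mathcal{O}\!\left(\beta\,(\mathcal{L}(\boldsymbol{\theta}_{0}) - \mathcal{L}^\star)/\epsilon^{2}\right)$.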

Figures (14)

  • Figure 1: Stochasticity in variational quantum algorithms can help in avoiding (strict) saddle points.
  • Figure 2: Comparison of the loss evolution with and without noise. The noise is added manually and drawn from Gaussian distributions, and the initial conditions are kept the same. (a) Four different values of the standard deviation $r$. (b) The noiseless case and the noisy case with noise standard deviation $r=0.1$.
  • Figure 3: We quantify the performance against the size of the noise $r$ (classical Gaussian noise) by ${1}/({\mathcal{L}-\mathcal{L}_\text{opt}})$.
  • Figure 4: Saddle-point avoidance from quantum noise. We prepare 30 instances starting from the same initial condition. When the noise level is small ((a) purely measurement noise, (b) including device noise; shot number 1000), most trajectories cannot jump out of the saddle point. When the noise level is larger ((c) purely measurement noise, (d) including device noise; shot number 70), trajectories have some probability of jumping towards the global minimum.
  • Figure 5: Comparison of the loss evolution with and without noise for the hydrogen VQE. The noise is drawn manually from Gaussian distributions with standard deviation $0.2$, and the initial conditions are kept the same. We compare the noiseless case, the noisy case, and the exact solution.
  • ...and 9 more figures
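
The shot numbers in the Figure 4 caption (1000 and 70) set the size of the measurement noise. As a rough illustration, the sketch below estimates a single-qubit Pauli-$Z$ expectation value from a finite number of shots. The outcome probability `p_plus` and the single-qubit setting are hypothetical choices, and device noise is not modeled.

```python
import math
import random

def estimate_z(p_plus, shots, rng):
    """Estimate <Z> from `shots` outcomes, each +1 with probability p_plus and -1 otherwise."""
    total = sum(1 if rng.random() < p_plus else -1 for _ in range(shots))
    return total / shots

def shot_noise_std(p_plus, shots):
    """Standard deviation of the finite-shot estimator: sqrt((1 - <Z>^2) / shots)."""
    mean = 2.0 * p_plus - 1.0
    return math.sqrt((1.0 - mean * mean) / shots)

rng = random.Random(1)
for shots in (1000, 70):
    estimate = estimate_z(0.75, shots, rng)
    print(f"shots={shots}: estimate={estimate:+.3f}, std={shot_noise_std(0.75, shots):.3f}")
```

Because the standard deviation scales as $1/\sqrt{N}$ in the number of shots $N$, going from 1000 to 70 shots increases the noise by a factor of about $\sqrt{1000/70} \approx 3.8$.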

Theorems & Definitions (22)

  • Definition 1: $L$-Lipschitz function
  • Definition 2: $\beta$-strong smoothness
  • Definition 3: Stationary point
  • Definition 4: $\epsilon$-approximate stationary point
  • Definition 5: Local minimum, local maximum, and saddle point
  • Definition 6: $\rho$-Lipschitz Hessian
  • Definition 7: Gradient descent
  • Theorem 8: Gradient descent complexity
  • Definition 9: Strict saddle point
  • Definition 10: Second-order stationary point
  • ...and 12 more
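
Several of the listed definitions can be checked numerically from a gradient and a Hessian. The sketch below does this for two parameters. It assumes the usual conventions (gradient norm at most $\epsilon$ for an approximate stationary point, a negative Hessian eigenvalue for a strict saddle), since the definitions themselves are not reproduced in this listing. The function names and the tolerance are illustrative.

```python
import math

def sym2_eigenvalues(a, b, d):
    """Eigenvalues of the symmetric 2x2 matrix [[a, b], [b, d]], smallest first."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean - radius, mean + radius

def classify(grad, hessian, eps=1e-6):
    """Classify a point from its gradient (gx, gy) and Hessian entries (a, b, d)."""
    if math.hypot(*grad) > eps:
        return "not an eps-approximate stationary point"
    lo, hi = sym2_eigenvalues(*hessian)
    if lo > 0:
        return "local minimum"
    if hi < 0:
        return "local maximum"
    if lo < 0 < hi:
        return "strict saddle point"
    # A zero eigenvalue means the second-order test cannot decide.
    return "degenerate (second-order test inconclusive)"

# Toy loss L(x, y) = x**2 + (y**2 - 1)**2 / 4 has Hessian diag(2, 3*y**2 - 1).
print(classify((0.0, 0.0), (2.0, 0.0, -1.0)))  # Hessian at the origin
print(classify((0.0, 0.0), (2.0, 0.0, 2.0)))   # Hessian at (0, 1)
```

For the toy loss in the comments, the origin is reported as a strict saddle point and $(0, 1)$ as a local minimum.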