Quantum Analytic Descent

Bálint Koczor, Simon C. Benjamin

TL;DR

The paper tackles the high measurement cost of variational quantum algorithms by introducing Quantum Analytic Descent, which builds a local classical model of the quantum energy landscape around a reference point using a trigonometric expansion of Pauli-string gates. This surrogate enables a two-loop optimization: a quantum step to fit a local model and a subsequent classical step to descend to a (near) minimum, with a provable measurement-cost bound showing a jump can cost as little as a single gradient evaluation asymptotically. The approach delivers a quadratic-size representation of the energy surface, analytic gradients with polynomial classical complexity, and an optimal-shot-distribution strategy that keeps shot noise under control. Numerical simulations on recompilation and spin-ring problems demonstrate significant reductions in quantum resources and improved convergence, highlighting a practical path toward more efficient near-term quantum optimization. The work also outlines extensions to include metric information and Bayesian priors, and it provides open-source software for broader adoption.

Abstract

Variational algorithms have particular relevance for near-term quantum computers but require non-trivial parameter optimisations. Here we propose Analytic Descent: Given that the energy landscape must have a certain simple form in the local region around any reference point, it can be efficiently approximated in its entirety by a classical model -- we support these observations with rigorous, complexity-theoretic arguments. One can classically analyse this approximate function in order to directly `jump' to the (estimated) minimum, before determining a more refined function if necessary. We derive an optimal measurement strategy and generally prove that the asymptotic resource cost of a `jump' corresponds to only a single gradient vector evaluation.
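The "simple form" of the landscape can be illustrated in the one-parameter case. For a single gate $e^{-i\theta P/2}$ generated by a Pauli string $P$, the energy is exactly $a + b\cos\theta + c\sin\theta$, so three energy evaluations fix the model and its minimum follows in closed form. The sketch below is a toy illustration under these assumptions, not the paper's multi-parameter model or its software: the names `energy`, `fit_trig_model`, and `jump_to_minimum`, the one-qubit Hamiltonian, and the noiseless evaluation are all hypothetical choices made for the example.

```python
import numpy as np

# Pauli matrices for a one-qubit toy problem (hypothetical example setup).
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)


def energy(theta, generator=Y, hamiltonian=Z + 0.5 * X):
    """Exact (noise-free) energy <psi(theta)|H|psi(theta)> with
    |psi(theta)> = exp(-i theta P / 2)|0>, where P is a Pauli matrix."""
    gate = np.cos(theta / 2) * I2 - 1j * np.sin(theta / 2) * generator
    psi = gate @ np.array([1, 0], dtype=complex)
    return float(np.real(np.vdot(psi, hamiltonian @ psi)))


def fit_trig_model(f, theta0=0.0):
    """Fit E(theta0 + d) = a + b*cos(d) + c*sin(d) from three evaluations
    at d = 0 and d = +/- pi/2."""
    e_zero = f(theta0)
    e_plus = f(theta0 + np.pi / 2)
    e_minus = f(theta0 - np.pi / 2)
    a = 0.5 * (e_plus + e_minus)
    b = e_zero - a
    c = 0.5 * (e_plus - e_minus)
    return a, b, c


def jump_to_minimum(a, b, c, theta0=0.0):
    """b*cos(d) + c*sin(d) = R*cos(d - phi) with phi = atan2(c, b),
    so the model is minimised at d = phi + pi, where it equals a - R."""
    return theta0 + np.arctan2(c, b) + np.pi


a, b, c = fit_trig_model(energy)
theta_star = jump_to_minimum(a, b, c)
print("model coefficients:", a, b, c)
print("energy after jump:", energy(theta_star))
print("model minimum a - sqrt(b^2 + c^2):", a - np.hypot(b, c))
```

With several parameters the exact expansion has exponentially many terms, which is why the paper's model is a truncated approximation valid near the reference point; the one-parameter case above is exact and only conveys the idea of fitting a trigonometric model and jumping to its minimum.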

Paper Structure

This paper contains 25 sections, 2 theorems, 83 equations, 9 figures.

Key Result

Theorem 1

Let us denote the variances of the single-measurement energy estimators as, e.g., $\mathrm{Var}[ E^{(C)}_k]$. In order to determine the full gradient vector to a precision $\epsilon^2:=\sum_{k=1}^\nu \mathrm{Var}[ \partial_k E(\underline{\theta}) ]$, we need to distribute overall $N = T^2 / \epsilon^2$ measurements, where, e.g., $N^{(C)}_k$ measurements are used to estimate the coefficient $E^{(C)}_k$. Here $\math
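The form $N = T^2/\epsilon^2$ matches a generic result for splitting a shot budget across independent estimators. Suppose a quantity is a linear combination $\sum_i w_i E_i$ with single-shot standard deviations $\sigma_i$; its variance is then $\sum_i w_i^2\sigma_i^2/N_i$, and minimising it at fixed $\sum_i N_i = N$ gives $N_i \propto |w_i|\sigma_i$ and $\epsilon^2 = (\sum_i |w_i|\sigma_i)^2/N$. The sketch below implements only this textbook allocation. The weights `w`, the function names, and the identification of $\sum_i |w_i|\sigma_i$ with the theorem's $T$ are assumptions, since the theorem statement is cut off in this excerpt.

```python
import numpy as np


def optimal_shot_allocation(weights, sigmas, eps2):
    """Split shots across independent estimators so that the variance of
    sum_i w_i * E_i equals eps2 with the fewest total shots.
    Assumes at least one product |w_i| * sigma_i is non-zero."""
    w = np.abs(np.asarray(weights, dtype=float))
    s = np.asarray(sigmas, dtype=float)
    t = float(np.sum(w * s))      # plays the role of T (assumed definition)
    n_total = t ** 2 / eps2       # N = T^2 / eps^2
    shots = n_total * w * s / t   # N_i proportional to |w_i| * sigma_i
    return n_total, shots


def combined_variance(weights, sigmas, shots):
    """Variance of sum_i w_i * E_i when estimator i is averaged over
    shots[i] independent single-shot samples."""
    w = np.asarray(weights, dtype=float)
    s = np.asarray(sigmas, dtype=float)
    n = np.asarray(shots, dtype=float)
    return float(np.sum(w ** 2 * s ** 2 / n))


# Hypothetical numbers, for illustration only.
w = [0.5, -0.5, 1.0, 0.1]
sigma = [1.0, 0.8, 0.3, 1.2]
n_total, shots = optimal_shot_allocation(w, sigma, eps2=1e-3)
uniform = np.full(len(w), n_total / len(w))

print("total shots:", n_total)
print("optimal split:", shots)
print("variance, optimal split:", combined_variance(w, sigma, shots))
print("variance, uniform split:", combined_variance(w, sigma, uniform))
```

In practice the shot counts would be rounded up to integers, and the $\sigma_i$ would themselves have to be estimated; both are left out of this sketch.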

Figures (9)

  • Figure 1: Error of our trigonometric-series approximation of the entire energy surface (a) and gradient vector (b) as a function of the distance $\delta$ from the reference point $\underline{\theta}_0$ of our model, where $\delta=\lVert \underline{\theta} \rVert_\infty$ is the largest absolute value among the parameters $\theta_k$. As long as $\delta$ is small, we can classically approximate the gradient vector and use it in an analytic descent optimisation. The approximation error of the gradient vector is computed as the similarity measure $1-f$ (see the text). We used a 12-qubit spin-ring Hamiltonian as in Fig. 2(b) and an 84-parameter ansatz circuit, and included the empirical scaling of the errors as $\mathcal{O}(\delta^{3.1})$ and $\mathcal{O}(\delta^{4.2})$.
  • Figure 2: Distance from the exact ground-state energy (residual energy) as a function of the overall number of measurements (quantum resources). (a) Recompiling a 4-qubit unitary into hardware-native gates via an $8$-qubit ground-state search problem and (b) finding the ground state of an $8$-qubit spin-ring Hamiltonian. Analytic descent appears to outperform all other techniques in terms of both convergence rate and the absolute level of quantum resources: its qualitative difference can be attributed to its ability to explicitly keep track of the evolution via an efficient classical approximation of the energy surface. In particular, a classical approximation of the energy surface is determined at each iteration step of analytic descent (solid lines), and in an internal loop we descend towards its minimum on a classical computer using gradient descent (not shown here). Our approximation is occasionally refined with optimally distributed additional measurements to keep shot noise (via $\epsilon^2$) below a threshold. Note that the hyperparameters, especially the sampling rates, have been optimised for each technique specifically so that the low-energy regime can be reached. Consequently, all techniques oversample in the early evolution, and a left-to-right shift should be viewed as an artefact of this choice. All four techniques rely on determining the coefficients, such as $E^{(B)}_k$, from Eq. \ref{full-energy}.
  • Figure 3: (Left) Empirically estimating the precision $\epsilon^2$ (variance) as the expected squared Euclidean distance from the exact gradient vector $\epsilon^2:=\langle \lVert\Delta g\rVert^2 \rangle = \sum_{k=1}^\nu \mathrm{Var}[ \partial_k E(\underline{\theta}) ]$ for $2000$ randomly selected points in parameter space. This verifies our analytical expression derived in Sec. \ref{sec:opt_measurement}, which we have computed numerically exactly using our efficient C code \cite{gitcode}. (Right) We compute the exact expression for the function $T(\underline{\theta})$ as defined in Eq. \ref{tdefinition} using our efficient C code, compare it to the analytical approximation in Eq. \ref{tapproximation}, and obtain the expected $\mathcal{O}(\delta^2)$ error term. The analytical approximation in Eq. \ref{tapproximation} is used to derive the scaling of the measurement cost of the analytic descent approach.
  • Figure 4: (Left) One needs to estimate the energy at shifted parameters $\underline{\theta}_m$ in order to determine the coefficients in Eq. \ref{surfaceapprox}. We determine the corresponding single-shot variances $\mathrm{Var}[E(\underline{\theta}_m)]$ for the 4-qubit spin-ring Hamiltonian in Eq. \ref{hamil-appendix}, assuming that expectation values of Pauli strings are determined individually by sampling from the quantum computer. The single-shot estimation variance is generally upper bounded via Eq. \ref{smax}, but it can be reduced significantly by applying more advanced techniques for simultaneously measuring commuting Pauli strings. Our relative measurement cost depends on the ratio $S$ of minimal and maximal variances via Theorem \ref{relative_cost}. In the present example we can estimate $S\leq4.2$ using $S^2_{\min} \geq \min_{\theta} \mathrm{Var}[E(\underline{\theta})]$. (Right) For all simulations of analytic descent from Fig. 2 in the main text we plot the exact relative measurement cost $N/N_{grad}$ (which is upper bounded via Theorem \ref{relative_cost}) as a function of the classical iterations. Early in the evolution $\delta$ is relatively large and analytic descent is therefore expensive. However, as we approach the optimum, $\delta$ becomes smaller and the measurement overhead decreases; it is guaranteed to vanish asymptotically. Sudden jumps in the plot indicate positions where we abort the classical internal loop and re-determine our classical approximation at the new reference point, which costs exactly the same as determining the gradient vector. Red corresponds to the recompilation problem while black corresponds to the spin-ring Hamiltonian.
  • Figure 5: Approximating the energy surface $E(\underline{\theta})$ and the gradient vector $\underline{g}(\underline{\theta})$ at randomly generated points $\underline{\theta}$ around the ground state of the spin-ring Hamiltonian from Sec. \ref{simulations} using analytic descent and using the Taylor expansion from Eq. \ref{taylor_expansion}. The approximation error of the gradient is computed via the vector distance $\lVert \underline{v}-\underline{g} \rVert_{\infty}$. The red diagonal line corresponds to the case in which the two approaches give the same error. Although both the Taylor expansion and analytic descent have the same asymptotic scaling in $\delta$, analytic descent typically significantly outperforms the Taylor expansion (sometimes by as much as 2 orders of magnitude) for non-vanishing $\delta$, which is the regime relevant in practice.
  • ...and 4 more figures

Theorems & Definitions (3)

  • Theorem 1
  • Theorem 2
  • proof