Large-scale Lindblad learning from time-series data

Ewout van den Berg, Brad Mitchell, Ken Xuan Wei, Moein Malekakhlagh

TL;DR

This work tackles learning a Lindblad master equation describing noisy quantum operations from time-resolved data generated by repeated circuits. It exploits Ehrenfest’s theorem to cast the problem into a linear least-squares form $A x = b$ with $x=[\alpha;\beta]$, where $H=\sum_k \alpha_k P_k$ and $\mathcal{L}(\rho)= -\frac{i}{\hbar}[H,\rho] + \sum_{ij} \beta_{ij}(P_i\rho P_j^\dagger - \frac{1}{2}\{P_j^\dagger P_i, \rho\})$. Observables' time traces $\langle O(t)\rangle$ are fit as sums of exponentially damped sinusoids, $\langle O(t)\rangle = \sum_j a_j e^{b_j t}\cos(\omega_j t + \varphi_j)$, to obtain derivatives and populate $b$. Optimization minimizes $\frac{1}{2}\|A x - b\|^2$ subject to a positive-semidefinite constraint $B(x)\succeq 0$, ensuring a CPTP Lindbladian, and scales linearly with qubits for local connectivity. Experimentally, they demonstrate learning a full layer of gates on a 156-qubit IBM processor, analyze SPAM/readout errors, and show that a fine-tuning step can further improve agreement with data.
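The linearity claimed above can be made explicit with a short derivation. This is a sketch that assumes the Lindbladian form written in the TL;DR and uses only cyclicity of the trace; how rows and columns of $A$ are labelled is an illustrative assumption and may differ from the authors' layout.

```latex
\begin{aligned}
\frac{d}{dt}\langle O\rangle
  &= \mathrm{Tr}\!\left(O\,\mathcal{L}(\rho)\right) \\
  &= \sum_k \alpha_k\,\frac{i}{\hbar}\,\big\langle [P_k, O]\big\rangle
   + \sum_{ij}\beta_{ij}\,\Big\langle P_j^\dagger O P_i
   - \tfrac{1}{2}\{P_j^\dagger P_i, O\}\Big\rangle .
\end{aligned}
```

Under this reading, each combination of observable, initial state, and time contributes one row: the expectation values on the right-hand side supply the entries of $A$, the fitted derivative on the left supplies the entry of $b$, and the unknowns $x=[\alpha;\beta]$ appear only linearly.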

Abstract

In this work, we develop a protocol for learning a time-independent Lindblad model for operations that can be applied repeatedly on a quantum computer. The protocol is highly scalable for models with local interactions and is in principle insensitive to state-preparation errors. At its core, the protocol forms a linear system of equations for the model parameters in terms of a set of observable values and their gradients. The required gradient information is obtained by fitting time-series data with sums of exponentially damped sinusoids and differentiating those curves. We develop a robust curve-fitting procedure that finds the most parsimonious representation of the data up to shot noise. We demonstrate the approach by learning the Lindbladian for a full layer of gates on a 156-qubit superconducting quantum processor, providing the first learning experiment of this kind. We study the effects of state-preparation and measurement errors and limitations on the operations that can be learned. For improved performance under readout errors, we propose an optional fine-tuning strategy that improves the fit between the time-evolved model and the measured data.
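The gradient-extraction step can be illustrated with a minimal sketch. It fits a single damped cosine with SciPy's `curve_fit` and differentiates the fitted curve analytically. This is a stand-in, not the robust, parsimony-seeking procedure the abstract describes: the helper names, the FFT-based starting guess, the Gaussian noise model, and the example parameters (borrowed from the curve in the Figure 4 caption) are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit


def damped_cosine(t, a, b, omega, phi):
    """Single exponentially damped sinusoid a * exp(b t) * cos(omega t + phi)."""
    return a * np.exp(b * t) * np.cos(omega * t + phi)


def damped_cosine_derivative(t, params):
    """Analytic time derivative of the damped cosine with the given parameters."""
    a, b, omega, phi = params
    phase = omega * t + phi
    return a * np.exp(b * t) * (b * np.cos(phase) - omega * np.sin(phase))


def fit_damped_cosine(t, y):
    """Least-squares fit of one damped cosine (hypothetical helper).

    The starting frequency is the peak of a zero-padded FFT, which
    assumes uniformly spaced samples.
    """
    dt = t[1] - t[0]
    n_fft = 1024
    spectrum = np.abs(np.fft.rfft(y, n=n_fft))
    k = np.argmax(spectrum[1:]) + 1  # skip the DC bin
    omega0 = 2 * np.pi * np.fft.rfftfreq(n_fft, d=dt)[k]
    p0 = (np.max(np.abs(y)), -0.01, omega0, 0.0)
    params, _ = curve_fit(damped_cosine, t, y, p0=p0)
    return params


# Illustrative data: depths 0..30, Gaussian noise standing in for shot noise.
rng = np.random.default_rng(0)
depths = np.arange(31.0)
true_params = (0.8, -0.05, 0.7, 0.6)
data = damped_cosine(depths, *true_params) + rng.normal(0.0, 0.01, depths.size)

fitted = fit_damped_cosine(depths, data)
gradient = damped_cosine_derivative(depths, fitted)
error = np.abs(gradient - damped_cosine_derivative(depths, true_params))
print("max gradient error:", error.max())
```

Because the derivative is taken from the fitted closed form rather than from finite differences of the data, its accuracy is limited by the quality of the fit and not by the spacing between circuit depths.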

Paper Structure

This paper contains 23 sections, 32 equations, 9 figures, 3 tables.

Figures (9)

  • Figure 1: Overview of the learning protocol: given (a) an operation that we want to learn: $\Lambda=\exp(\tau\mathcal{L})$ with unit evolution time $\tau$, we (b) prepare circuits that apply the operation for various integer depths $k$ flanked by single-qubit gates to implement appropriate state-preparation and measurement basis changes as well as readout twirling; we then (c) measure observable values for different initial states at the different depths and fit the data by a sum of exponentially damped sinusoids. Using the data points and the gradients of the fitted curves we (d) form a system of equations in the (unknown) model parameters $\alpha$ and $\beta$. Optionally, we (e) fine tune the learned model parameters on local patches of qubits, including parameters for state-preparation and measurement errors, such that the evolved model (solid line) better fits the data (markers), compared to the time-evolved initial model (dashed).
  • Figure 2: Results for the 3$\times$3 circuit in inset (c) simulated without (a--c) and with (d--f) state-preparation and measurement (SPAM) errors. (a) The average model error as a function of misfit multiplier parameter $\mu$ in curve fitting for different optimization algorithms and models with 5 initial states and learning depth 30. The optimization suffix indicates that the fitted curves are locally optimized to better fit the data. The average error is taken as the geometric mean of the median absolute coefficient errors for settings with different shot counts. The median model coefficient error as a function of (b) the number of initial states, for learning depth 10 with semidefinite optimization, and (c) the learning depth for 10 initial states. The dotted lines in (b,c) show fits based on the data for 4--20 initial states and learning depths 10--31, respectively. Similar results are obtained when learning with $20\times$ the nominal state-preparation error but no readout error (dashed lines). Plot (d) shows the median coefficient error as a function of the number of learning depths (the maximum learning depth plus one) in the context of SPAM errors. Readout-error mitigation requires estimation of the confusion matrix $M$, which is done using $10^4$ shots unless otherwise noted. Learning of $M$ is done both with and without state-preparation error (SPE). (e) The median model coefficient error for combinations of learning depths, numbers of initial states, and shot count multipliers such that the total shot count is kept fixed. The curves represent the average performance over all settings with a given learning depth for different base shot counts. (f) Expectation values for several weight-three Pauli observables for the initial state $\vert$- -+1000 - -$\rangle$ time-evolved according to the ideal Lindbladian (thick faint lines) and the learned model based on 20 initial states, a maximum learning depth of 30, and $10^4$ shots both with (dashed) and without SPAM errors (solid).
  • Figure 3: (a) Topology of the 156-qubit superconducting quantum processor ibm_pittsburgh used for the experiments along with the gates used for Lindblad model learning, including two-qubit Rzz and single-qubit Rz gates (blue), Rx gates (red), and Ry gates (green). The grey qubits have a SPAM fidelity below 0.97 and are excluded from learning; all other qubits (indicated by the light blue region) are included in the Lindblad model. (b) Time evolution of the learned model (solid) for the initial state $\vert{1r-}\rangle$ on qubits 93, 94, and 95, along with the error-mitigated data points (markers) and the curve fit (dashed). (c) Time evolution of the fine-tuned model based on unmitigated data (markers). Distribution of the absolute difference between the ideal and the learned Hamiltonian terms for (d) idle qubits and single-qubit rotations, and (e) two-qubit Rzz gates for the global model (orange), the local model fits (blue) and the fine-tuned local models (purple). (f) Crosstalk terms on neighboring qubit pairs that do not share a gate.
  • Figure 4: Plot of (a) $f(t) = 0.8\exp(-0.05t)\cos(0.7t + 0.6)$, (b) absolute difference between $f'(t)$ and the derivative $p'(t)$ of the $d$-th order polynomial fit $p(t)$ from exact data points $f(t)$ at $t=[0,1,\ldots,d]$, evaluated at $t=\alpha d$ for various $\alpha$ values (dots) and connected by lines for reference. (c) The absolute error in the gradient estimate based on fitting a sum of exponentially damped sinusoids for different learning depths and shot counts when sampling the data.
  • Figure 5: (a) 6-qubit and (b) 9-qubit circuits on a hypothetical 3$\times$3 quantum processor. The Hamiltonian corresponding to the gates is scaled such that unit-time evolution is equal to 30% of the overall gate duration for the 6-qubit case and 20% for the 9-qubit case. We assume a uniform gate time of 50ns. (c) ZZ interactions between the qubits, and (d) a summary of the single-qubit properties including Pauli-Z errors, $T_1$ and $T_{2\phi}$ times, and state-preparation and readout errors.
  • ...and 4 more figures
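
The comparison described in the Figure 4(b) caption can be reproduced in outline with the sketch below. It interpolates exact samples of $f(t)$ at $t=[0,1,\ldots,d]$ with a degree-$d$ polynomial and evaluates the derivative error at $t=\alpha d$. The function names and the particular values of $d$ and $\alpha$ are illustrative assumptions; `Polynomial.fit` is used here as one convenient way to build the interpolant and is not claimed to be the authors' choice.

```python
import numpy as np
from numpy.polynomial import Polynomial


def f(t):
    """The example curve from the Figure 4(a) caption."""
    return 0.8 * np.exp(-0.05 * t) * np.cos(0.7 * t + 0.6)


def df(t):
    """Analytic derivative of f."""
    phase = 0.7 * t + 0.6
    return 0.8 * np.exp(-0.05 * t) * (-0.05 * np.cos(phase) - 0.7 * np.sin(phase))


def poly_gradient_error(d, alpha, func=f, dfunc=df):
    """|f'(t) - p'(t)| at t = alpha * d for the degree-d interpolant p."""
    t = np.arange(d + 1, dtype=float)
    # d + 1 exact samples and degree d: the least-squares fit interpolates.
    p = Polynomial.fit(t, func(t), deg=d)
    t_eval = alpha * d
    return abs(p.deriv()(t_eval) - dfunc(t_eval))


# Tabulate the error for a few degrees and evaluation points.
for d in (4, 8, 12):
    errors = [poly_gradient_error(d, alpha) for alpha in (0.0, 0.5, 1.0)]
    print(d, errors)
```

Passing a different `func` and `dfunc` pair makes it straightforward to check the helper on a case with a known answer, such as a low-degree polynomial whose derivative the interpolant should recover to machine precision.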