How much can we learn from quantum random circuit sampling?

Tudor Manole, Daniel K. Mark, Wenjie Gong, Bingtian Ye, Yury Polyanskiy, Soonwon Choi

TL;DR

The paper advances quantum device benchmarking by modeling RCS outputs as a high-dimensional $k$-component mixture $p(z)=\sum_{i=1}^k c_i\,\pi_i(z)$ and by leveraging side information to learn rich error diagnostics in situ. It develops estimators for three regimes of side information—full, partial, and none—including generalized XEB, collision-based, variational EM, and moment-based methods, and establishes information-theoretic sample-complexity limits with phase transitions. The authors demonstrate time-varying and correlated error learning in synthetic data and apply the framework to public Google RCS data, extracting in-situ error rates that qualitatively track known hardware behavior and complement component-wise calibrations. Collectively, the work provides practical benchmarking protocols for current/future quantum processors and identifies fundamental limits on what can be learned from RCS data, informing both experimental design and theoretical sampling limits. The approach holds promise for scalable, insight-rich characterization of large-scale quantum devices and may extend to beyond-classical circuits and cross-platform benchmarking.
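
The mixture model above can be made concrete with a small numerical sketch. It assumes the full-side-information regime, where every component distribution $\pi_i$ is known exactly, and recovers the weights $c_i$ by maximum likelihood using the textbook EM update for mixture weights. The function name `em_mixture_weights`, the Dirichlet-distributed toy components, and the sizes `k`, `d`, `n` are illustrative assumptions, not the authors' implementation or settings.

```python
import numpy as np

def em_mixture_weights(Pi, counts, n_iter=2000):
    """Maximum-likelihood mixture weights c for p(z) = sum_i c_i * pi_i(z).

    Pi     : (k, d) array, row i is the known component distribution pi_i.
    counts : (d,) array, histogram of observed outcomes z.
    This is the standard EM update for mixture weights (an assumption
    made for illustration; the paper's estimators may differ).
    """
    k = Pi.shape[0]
    n = counts.sum()
    seen = counts > 0                      # outcomes with zero counts do not enter the likelihood
    Pi_seen, counts_seen = Pi[:, seen], counts[seen]
    c = np.full(k, 1.0 / k)                # start from uniform weights
    for _ in range(n_iter):
        mix = c @ Pi_seen                  # current model probability of each seen outcome
        # E-step and M-step combined: average posterior responsibility of component i
        c = c * (Pi_seen @ (counts_seen / mix)) / n
    return c

# Toy data: k hypothetical components over d outcomes, drawn from a Dirichlet prior.
rng = np.random.default_rng(0)
k, d, n = 3, 64, 200_000
Pi = rng.dirichlet(np.ones(d), size=k)
c_true = np.array([0.5, 0.3, 0.2])
counts = rng.multinomial(n, c_true @ Pi)   # n samples from the mixture

c_hat = em_mixture_weights(Pi, counts)
print("true weights     :", c_true)
print("estimated weights:", np.round(c_hat, 3))
```

Each update keeps the weights non-negative and summing to one, so no explicit projection onto the simplex is needed. In the partial and no-side-information regimes the rows of `Pi` are not known exactly, and this sketch does not apply as written.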

Abstract

Benchmarking quantum devices is a foundational task for the sustained development of quantum technologies. However, accurate in situ characterization of large-scale quantum devices remains a formidable challenge: such systems experience many different sources of errors, and cannot be simulated on classical computers. Here, we introduce new benchmarking methods based on random circuit sampling (RCS) that substantially extend the scope of conventional approaches. Unlike existing benchmarks that report only a single quantity--the circuit fidelity--our framework extracts rich diagnostic information, including spatiotemporal error profiles, correlated and contextual errors, and biased readout errors, without requiring any modifications of the experiment. Furthermore, we develop techniques that achieve this task without classically intractable simulations of the quantum circuit, by leveraging side information, in the form of bitstring samples obtained from reference quantum devices. Our approach is based on advanced high-dimensional statistical modeling of RCS data. We sharply characterize the information-theoretic limits of error estimation, deriving matching upper and lower bounds on the sample complexity across all regimes of side information. We identify surprising phase transitions in learnability as the amount of side information varies. We demonstrate our methods using publicly available RCS data from a state-of-the-art superconducting processor, obtaining in situ characterizations that are qualitatively consistent yet quantitatively distinct from component-level calibrations. Our results establish both practical benchmarking protocols for current and future quantum computers and fundamental information-theoretic limits on how much can be learned from RCS data.

Paper Structure

This paper contains 77 sections, 40 theorems, 369 equations, 8 figures, 2 tables.

Key Result

Theorem 1

Under conditions assm:pt and assm:sample_size, we have where $\rho = \min\{m/d,1\}$.

Figures (8)

  • Figure 1: Overview of this work. We develop methods to learn many noise parameters from random circuit sampling (RCS) bitstring data. Our protocol makes use of side information, which can be the classically computed bitstring distribution, or samples of these distributions obtained from a reference quantum computer. Our analysis includes the special case where no side information is available: even in this case, error rates can be learned given enough RCS data. Our method allows extracting more information than previously known benchmarking methods: in addition to the state fidelity, we can estimate the error rates for many types of errors, including state preparation errors, correlated multi-qubit errors, contextual errors which depend on previous gates applied, and readout errors. We find a phase diagram that dictates the hardness and the types of information that can be learned from RCS data, as a function of the amount of side information available. The sample complexity in each regime is analyzed.
  • Figure 2: Learning time-dependent error rates. With synthetic data, we demonstrate the use of our protocol to learn about errors that grow over time. (a)(b)(c) We simulate an $N=18$ one-dimensional brickwork random circuit subject to single-site $X,Y,Z$ Pauli errors, whose error rates (per qubit, per layer) grow from $2.5\times 10^{-4}$ in the first layer to $10^{-3}$ in the last layer. In order to simulate utility-scale circuits of different system sizes, we also set circuit depth equal to the system size while reducing the per-layer error rates such that the total fidelity is fixed at $F\approx 0.5$. (a) Upper panel: the ground truth values of the error rates at each spacetime location. Lower panel: estimated error rates with $n=10^7$ RCS samples and perfect side information (i.e., $m=\infty$). (b) By averaging over qubits, only $10^6$ samples are required to learn the increasing rate with high precision. The blue bars indicate the ground truth; red diamonds mark the estimated error rate, and the shaded red line indicates an extracted rate of error growth (a linear fit to the red diamonds). A non-zero linear fit gradient indicates increasing error rates. (c) Model validation between time-dependent and time-independent error models. To ensure that the learned time-dependence is statistically significant, we compare the extracted gradient (vertical dashed line) against the distribution of gradients learned under the null hypothesis of time-independent error rates, obtained via parametric bootstrap (Appendix \ref{appendix:numerical}). The histogram of such gradients provides a confidence interval and $p$-values for time-dependent errors: $10^5$ and $10^6$ samples (indicated in green and blue respectively) are sufficient to learn the error rate growth with statistical significance. (d) System-size dependence of the sample complexity for model validation. As the system size increases, although the Hilbert space dimension increases exponentially (dashed line), the required sample size for model validation grows only polynomially with system size (orange circles). This sample complexity is defined as the number of RCS samples required to discriminate between a fixed rate of error growth and no error growth with $5\sigma$ significance. With increasing system size, classical simulation will not be feasible. In addition, we simulate the case of incomplete side information (purple triangles) where $m=n$, i.e., the number of side-information samples (per error component) is the same as the number of RCS samples. The sample complexity does not differ significantly between the two cases.
  • Figure 3: Reconstructing correlated errors from synthetic data simulated for a $4\times 5$, depth-5 circuit. We consider two types of correlated error: (a) two-qubit $XX$ errors, or (b) multi-qubit $XX\cdots X$ errors along one row or one column to simulate errors induced along a shared control line. In both settings, we also include time-independent single-site Pauli errors at every qubit with a rate $2\times 10^{-3}$, chosen such that the total many-body fidelity is $F\approx 0.5$. (a) We learn the rates (averaged over layers) of two-qubit errors $X_u X_v$ for all pairs of qubits simultaneously and represent them on a 2D plot. Specifically, we plot $c_{u,v}-c_{u}c_{v}$, which subtracts from the two-qubit error rate the rate expected from independent single-qubit errors on qubits $u$ and $v$. We refer to this difference as the correlated error rate. Upper left half: ground truth, in which two qubits, highlighted in the inset, experience correlated $XX$ errors at a rate of $10^{-3}$ per layer. Lower right half: extracted error rates from $10^7$ samples correctly identify the correlated pair. (b) We also learn the rates of correlated errors on all the qubits in the same row or column. Blue bars: ground truth where one row and one column (inset) experience correlated errors. Red diamonds: extracted error rates. Again, $10^7$ samples are sufficient to reliably learn about correlated errors.
  • Figure 4: Analysis of experimental RCS data. We apply our methods to study the publicly available RCS data from Ref. arute2019quantum; results are shown here for $N=18$. Using the MLE, we resolve different types of errors in many spacetime locations. We simulate state-preparation, single-qubit (1q) dephasing errors, two-qubit (2q) gate dephasing and flip-flop errors, and single and double readout errors (full details in App. \ref{app:google}), for a total of $k=461$ errors. (a) We summarize the combined contributions of each error type. Quoted values and error bars indicate the sample mean and its standard error over 10 random circuits. Modeled errors account for 68% of the total weight: the remaining 32% is fitted to the white noise term representing errors outside our model such as multiple errors, consistent with expectations for this fidelity value (App. \ref{app:google}). Note that the rates of 1q dephasing errors we learn here are the rates of errors that can be described as single-qubit $Z_j$ operators: these errors may also arise from two-qubit gates, and hence the proportions of 1q and 2q errors here are comparable, even though we expect them to primarily arise from two-qubit gates. (b) Converting the results of our benchmarking report into a many-body fidelity (App. \ref{app:converting_rates}) yields results in close quantitative agreement with the XEB fidelity. (c) Learned error rates show considerable variation among qubits, in a consistent fashion over random circuit realizations. We plot the total rates of the 2q dephasing and flip-flop errors on nearest neighbors, indicated by the color of the red links. We also plot the single-qubit dephasing error rates, indicated by the size and color of each qubit. Qubits are arranged according to their physical layout on the device (borders and unused qubits for the $N=18$ dataset in gray). The magnitudes of the learned error rates are consistent across system sizes and random circuits, and their sum over error channels is consistent with Ref. arute2019quantum (main text). (d) Our procedure also yields time-resolved error rates, revealing approximately time-independent errors in the middle of the circuit. $1\rightarrow 0$ readout errors were found to be the largest type of error. Above, we depict the positioning of our modeled errors in the circuit: in the ideal circuit, a single "layer" consists of four gates applied to each qubit, and we insert errors at layers in the circuit. Errors inserted near the start and end of the circuit have unusual properties, and we omit errors in the first and last three layers (gray regions) to avoid additional complications (see App. \ref{app:converting_rates}). (e) We explicitly compare our estimated rates (points) of readout errors with those reported in Ref. arute2019quantum (dashed lines). The average rates of readout errors are quantitatively similar, with deviations on certain qubits: these may arise from the fact that only a subset of qubits are simultaneously measured, which may hence experience error rates different from when all qubits are simultaneously read out (Fig. S24 of Ref. arute2019quantum). Error bars indicate standard error over 10 random circuits. (f) Learning correlated readout errors: We estimate the physical error rate $\widehat{\gamma}_{ij}$ of double readout errors on qubits $i$ and $j$, and compare it to the rates of independent errors $\widehat{\gamma}_i,\widehat{\gamma}_j$: the difference $\widehat{\gamma}_{ij} - \widehat{\gamma}_i \widehat{\gamma}_j$ estimates the rates of correlated readout errors. We indicate these correlations with the thickness and colors of lines between all pairs of qubits $i$ and $j$. These correlations can be as large as a $1\%$ rate, although typical values are closer to $0.2\%$. We see correlations between many pairs of qubits, with stronger correlations (surprisingly negative) between nearest neighbors as well as along the diagonals. We summarize these with a polar plot of the root-mean-squared (RMS) correlations averaged along each direction. Note that this is an average over qubit pairs with a given orientation, ignoring their separation, and not simply a sum, which would weight certain directions over others because of the different number of qubit pairs for each orientation.
  • Figure S1: Sample complexity of various estimators in regime A. For each panel, from the lightest to the darkest, $k$ is chosen from $\{46,181,721\}$. For the top (bottom) row, the matrix $\Pi$ is taken from the Dirichlet distribution (simulating random unitary circuits). Inset for XEB-TH: zoom-in of the regime with small sample sizes to highlight the $n^{-1/4}$ scaling. Each data point is obtained by averaging over at least 10 repetitions of simulation.
  • ...and 3 more figures
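
The model-validation step described in the Figure 2(c) caption compares a fitted gradient against gradients obtained under a time-independent null model via parametric bootstrap. A toy version of that logic is sketched below. It makes a strong simplifying assumption: the per-layer error rates are estimated from binomial counts, which stands in for the actual RCS likelihood and is not the authors' procedure. The names `slope` and `bootstrap_pvalue`, the number of trials per layer, and the number of bootstrap replicates are hypothetical.

```python
import numpy as np

def slope(rates):
    """Gradient of a least-squares linear fit of rate versus layer index."""
    t = np.arange(len(rates))
    return np.polyfit(t, rates, 1)[0]

def bootstrap_pvalue(counts, trials, n_boot=2000, seed=0):
    """Two-sided parametric-bootstrap p-value for a non-zero gradient.

    counts : (L,) number of error events observed in each layer (toy stand-in).
    trials : number of trials per layer (assumed equal across layers).
    Null hypothesis: a single time-independent error rate for all layers.
    """
    rng = np.random.default_rng(seed)
    n_layers = len(counts)
    observed = slope(counts / trials)
    null_rate = counts.sum() / (trials * n_layers)      # fit of the null model
    # Simulate datasets from the fitted null model and refit the gradient each time.
    sims = rng.binomial(trials, null_rate, size=(n_boot, n_layers)) / trials
    null_slopes = np.polyfit(np.arange(n_layers), sims.T, 1)[0]
    p = (1 + np.sum(np.abs(null_slopes) >= abs(observed))) / (1 + n_boot)
    return observed, p

# Toy data: rates growing linearly from 2.5e-4 to 1e-3 over 18 layers, as in the caption.
rng = np.random.default_rng(1)
true_rates = np.linspace(2.5e-4, 1e-3, 18)
trials = 10**6
counts = rng.binomial(trials, true_rates)

observed, p = bootstrap_pvalue(counts, trials)
print(f"fitted gradient = {observed:.3e} per layer, bootstrap p-value = {p:.4f}")
```

The add-one correction in the p-value keeps it strictly positive, so the smallest reportable value is set by the number of bootstrap replicates. Resolving a $5\sigma$ threshold as in panel (d) would require either far more replicates or a Gaussian approximation to the null distribution.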

Theorems & Definitions (45)

  • Theorem 1
  • Theorem 2
  • Proposition 1
  • Proposition 2
  • Proposition 3
  • Proposition 4
  • Proposition 5
  • Proposition 6
  • Lemma 1
  • Lemma 2
  • ...and 35 more