Estimating Detector Error Models on Google's Willow

Kregg Elliot Arms, Martin James McHugh, Joseph Edward Nyhan, William Frederick Reus, James Loudon Ulrich

TL;DR

The paper develops decoder-free methods to learn Detector Error Models (DEMs) from quantum-error-correcting syndrome data, formalizing DEMs as binary-incidence structures with matrices $\mathbf{M}$ and rates $\boldsymbol{\theta}$. It introduces two learning tasks, rate estimation and structure learning, implemented via moment-based and parity-based algorithms, and derives analytic syndrome likelihoods to compare DEMs with observed data. Applied to simulated circuits and Google's Willow chips, the work shows that parity-based methods are substantially faster for small hyperedge sizes and that learned DEMs can reveal long-range correlations and anomalies, while RL priors (DEMs trained to optimize logical performance) work better as decoder priors in logical memory experiments. It also demonstrates online noise characterization via time-varying DEMs and identifies practical artifacts, such as correlated measurement errors, high-energy events, and TLS-like events, that challenge simple DEMs but can be exposed through structured learning and data pooling.
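
To make the $\mathbf{M}$, $\boldsymbol{\theta}$ formalism concrete, here is a minimal Python sketch; it is not the paper's algorithms. It treats a DEM as a binary incidence matrix (rows are detectors, columns are independent error mechanisms) with one rate per column, samples syndromes, and checks the standard identity that the expected parity of a detector subset $S$ equals the product of $(1 - 2\theta_j)$ over the mechanisms that flip an odd number of detectors in $S$. The toy matrix, the rates, and every function name are illustrative assumptions.

```python
import math
import random

# Hypothetical toy DEM: 3 detectors (rows) x 4 error mechanisms (columns).
M = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]
theta = [0.02, 0.05, 0.03, 0.04]  # assumed independent error rates


def sample_syndrome(M, theta, rng):
    """Draw each error independently, then return the syndrome M @ e mod 2."""
    errors = [1 if rng.random() < p else 0 for p in theta]
    return [sum(m * e for m, e in zip(row, errors)) % 2 for row in M]


def predicted_parity(M, theta, S):
    """E[(-1)^(sum of detectors in S)] under independent errors."""
    value = 1.0
    for j, p in enumerate(theta):
        # Mechanism j matters only if it flips an odd number of detectors in S.
        if sum(M[i][j] for i in S) % 2 == 1:
            value *= 1.0 - 2.0 * p
    return value


def empirical_parity(syndromes, S):
    """Sample mean of (-1)^(sum of detectors in S)."""
    total = 0
    for s in syndromes:
        total += -1 if sum(s[i] for i in S) % 2 else 1
    return total / len(syndromes)


rng = random.Random(0)
syndromes = [sample_syndrome(M, theta, rng) for _ in range(100_000)]

for S in [(0,), (1,), (0, 1)]:
    print(S, predicted_parity(M, theta, S), empirical_parity(syndromes, S))

# For this toy DEM the parities can be inverted in closed form:
# P(0) * P(0,1) / P(1) = (1 - 2*theta_0)^2.
p0 = empirical_parity(syndromes, (0,))
p1 = empirical_parity(syndromes, (1,))
p01 = empirical_parity(syndromes, (0, 1))
theta0_hat = (1.0 - math.sqrt(p0 * p01 / p1)) / 2.0
print("estimated theta_0:", theta0_hat)
```

The closed-form inversion at the end works only because of the chain-like structure chosen for this toy matrix; a general DEM needs a systematic solver over many detector subsets.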

Abstract

We consolidate recent theoretical advances in Detector Error Model (DEM) estimation and formalize several algorithms to learn DEM parameters and structure from syndromes without using a decoder, demonstrating recovery of known DEMs from simulated syndromes with precision limited only by finite-sample effects. We then apply these algorithms to estimate DEMs from Google's 72- and 105-qubit chips. Using a likelihood function that is tractable for small DEMs, we show that DEMs estimated directly from syndromes agree more closely with unseen syndromes than DEMs trained to optimize logical performance, whereas the latter outperform the former as priors for decoders in logical memory experiments. We use a time series of estimated DEMs to track both global error and specific local errors over the course of a QEC experiment, suggesting applications in online characterization. We employ a sequence of DEM estimation techniques to discover and quantify long-range detector correlations spanning the width of the 105-qubit chip, for which DEM analysis suggests correlated measurement errors rather than high-weight Pauli errors as the most likely explanation. Finally, we present two artifacts in repetition code syndromes that are not well-modeled by a DEM: correlated flipping of pairs of adjacent detectors in many consecutive rounds of QEC, and signatures consistent with radiation events occurring more frequently than previously reported.
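
The comparison of DEMs against unseen syndromes can be illustrated with a brute-force likelihood. The sketch below is an assumption-laden stand-in, not the paper's analytic likelihood: it enumerates all $2^E$ error patterns of a toy DEM to get the exact syndrome distribution, then scores held-out syndromes under two candidate rate vectors. The matrix, both rate vectors, and all function names are hypothetical.

```python
import math
import random
from itertools import product

# Hypothetical toy DEM: 3 detectors (rows) x 4 error mechanisms (columns).
M = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]
theta_true = [0.02, 0.05, 0.03, 0.04]   # rates used to generate the data
theta_wrong = [0.2, 0.2, 0.2, 0.2]      # a deliberately poor candidate


def syndrome_of(M, errors):
    """Syndrome M @ e mod 2 for one binary error pattern."""
    return tuple(sum(m * e for m, e in zip(row, errors)) % 2 for row in M)


def syndrome_distribution(M, theta):
    """Exact P(syndrome) by enumerating all 2^E error patterns (small E only)."""
    dist = {}
    for errors in product((0, 1), repeat=len(theta)):
        prob = 1.0
        for e, p in zip(errors, theta):
            prob *= p if e else 1.0 - p
        s = syndrome_of(M, errors)
        dist[s] = dist.get(s, 0.0) + prob
    return dist


def log_likelihood(M, theta, syndromes):
    """Sum of log P(syndrome) over shots; -inf if a shot is impossible."""
    dist = syndrome_distribution(M, theta)
    total = 0.0
    for s in syndromes:
        p = dist.get(tuple(s), 0.0)
        if p == 0.0:
            return float("-inf")
        total += math.log(p)
    return total


# Held-out syndromes sampled from the "true" toy DEM.
rng = random.Random(1)
held_out = [
    syndrome_of(M, [1 if rng.random() < p else 0 for p in theta_true])
    for _ in range(20_000)
]

ll_true = log_likelihood(M, theta_true, held_out)
ll_wrong = log_likelihood(M, theta_wrong, held_out)
print("log-likelihood (true rates): ", ll_true)
print("log-likelihood (wrong rates):", ll_wrong)
```

The enumeration costs $2^E$ evaluations, so this brute-force approach is usable only for DEMs with a few dozen mechanisms at most.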

Paper Structure

This paper contains 27 sections, 6 theorems, 71 equations, 14 figures, 5 tables, 5 algorithms.

Key Result

Theorem 1

If \ref{eq:theorem-1-hypothesis-sums} holds for all $A \subseteq [n]$, then for any $S \subseteq [n]$ …

Figures (14)

  • Figure 1: Residual errors between fitted and true values. The true parameters correspond to the SI1000 DEM for $d$ rounds from the temporal bulk of a $d=7$ surface code (top) or a $d=29$ repetition code (bottom). The SI1000 noise model was used in stim to produce $10^6$ syndromes. Then the parameters were estimated with the algorithms in the previous section. In all plots, blue corresponds to the moment-based Algorithm \ref{alg:estimate-parameters-moments} and orange to the parity-based Algorithm \ref{alg:estimate-parameters-parities}. Histograms on the left show raw differences between estimated and true DEM parameters. Histograms on the right show normalized differences, wherein each error term has been divided by the approximation of standard error in \ref{eq:moment-std-estimator}. The standard normal density function (black line) is superimposed on the normalized histograms and shows qualitative agreement.
  • Figure 2: Bias (left) and variance (right) of estimated rates vs. number of samples (shots) from the SI1000 DEM for 3 rounds of a $d=3$ surface code. Rates were estimated from sampled syndromes via the moment-based Algorithm \ref{alg:estimate-parameters-moments} (blue) or the parity-based Algorithm \ref{alg:estimate-parameters-parities} (orange). In the left-hand plot, error bars denote the standard error of the mean, $\sqrt{\mathsf{Var}(\hat{\theta} - \theta)/E}$. In the right-hand plot, the function $\langle \theta \rangle / N$ is shown (black, dashed line) as a guide to the eye, where $\langle \theta \rangle$ is the mean of the true DEM rates.
  • Figure 3: The SNR (\ref{eq:depolarization-snr}) as a function of depolarization for different shot counts.
  • Figure 4: Scaling of rate estimation algorithms with DEM size. Solid lines and circles: repetition codes with $d \in \{5, 6, \ldots, 42, 43\}$. Dashed lines and triangles: $XZZX$ surface codes with $d \in \{3, 5, 7\}$. All syndrome datasets had $10^6$ shots. Blue represents the moment-based Algorithm \ref{alg:estimate-parameters-moments}, whereas orange denotes the parity-based Algorithm \ref{alg:estimate-parameters-parities}. Left: Run-time of rate-estimation algorithms as a function of $E$, the number of hyperedges in the DEM. Right: Time required to calculate the necessary statistics (moments or depolarizations) from syndromes vs. $E$.
  • Figure 5: Comparison of methods for estimating variance of DEM hyperedge rates. Shown are the empirical cumulative distribution functions of residuals between estimated and true rates, normalized by the square root of the respective approximations of variance, where estimates of rates and variances are derived from $10^6$ simulated shots comprising 7 rounds of a distance-7 surface code with the SI1000 noise model. Better approximations adhere more closely to the standard normal distribution (gray, dashed line).
  • ...and 9 more figures

Theorems & Definitions (10)

  • Theorem 1
  • Theorem 2
  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Theorem
  • proof
  • Theorem
  • proof