Differentiable Maximum Likelihood Noise Estimation for Quantum Error Correction

Hanyan Cao, Dongyang Feng, Cheng Ye, Feng Pan

TL;DR

This work introduces a differentiable Maximum Likelihood Estimation (dMLE) framework that enables exact, efficient, and fully differentiable computation of syndrome log-likelihoods, allowing circuit-level noise parameters to be optimized directly via gradient descent.

Abstract

Accurate noise estimation is essential for fault-tolerant quantum computing, as decoding performance depends critically on the fidelity of the circuit-level noise parameters. In this work, we introduce a differentiable Maximum Likelihood Estimation (dMLE) framework that enables exact, efficient, and fully differentiable computation of syndrome log-likelihoods, allowing circuit-level noise parameters to be optimized directly via gradient descent. Leveraging the exact Planar solver for repetition codes and a novel, simplified Tensor Network (TN) architecture combined with optimized contraction path finding for surface codes, our method achieves tractable and fully differentiable likelihood evaluation even for distance-5 surface codes with up to 25 rounds. Our method recovers the underlying error probabilities with near-exact precision in simulations. On experimental data from Google's processor, it reduces logical error rates by up to 30.6(3)% for repetition codes and 8.1(2)% for surface codes compared with the previous state-of-the-art methods, correlation analysis and Reinforcement Learning (RL). Our approach yields provably optimal, decoder-independent error priors by directly maximizing the syndrome likelihood, offering a powerful noise estimation and control tool for unlocking the full potential of current and future error-corrected quantum processors.
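
The objective named in the abstract can be written out explicitly under the usual detector-error-model assumption that error mechanisms fire independently with probabilities $\theta_i$. The binary error vector $\mathbf{e}$, the detector check matrix $H$, the sample count $N$, and the step size $\eta$ are notation introduced here for illustration and are not taken from the paper:

$$
p_{\bm{\theta}}(\mathbf{s}) = \sum_{\mathbf{e}\,:\,H\mathbf{e}=\mathbf{s}\ (\mathrm{mod}\ 2)} \ \prod_i \theta_i^{e_i}(1-\theta_i)^{1-e_i},
\qquad
\mathcal{L}(\bm{\theta}) = -\frac{1}{N}\sum_{n=1}^{N}\log p_{\bm{\theta}}\big(\mathbf{s}^{(n)}\big),
\qquad
\bm{\theta} \leftarrow \bm{\theta} - \eta\,\nabla_{\bm{\theta}}\mathcal{L}(\bm{\theta}).
$$

In this reading, the sum over $\mathbf{e}$ is the expensive part. It is the quantity that the abstract says is evaluated exactly with the Planar solver for repetition codes and with tensor-network contraction for surface codes.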

Paper Structure (5 sections, 12 equations, 9 figures, 1 table)

Figures (9)

  • Figure 1: Differentiable Maximum Likelihood Noise Estimation (dMLE) Framework. (1) Initialization: Observed detection events $\mathbf{s}$ and parameters $\bm{\theta}$ are mapped to dual spin models (repetition codes) or Tensor Networks (surface codes). (2) Evaluation: The NLL loss is derived from the partition function $\mathcal{Z}$ or from the TN contraction. (3) Update: Because every step above is differentiable, the parameters are iteratively refined via gradient descent until they converge to the maximum likelihood estimate. (a) DEM of a $d=3$, $r=5$ repetition code memory circuit and its dual spin glass model. The white nodes are detectors (red when triggered) and the black edges are error mechanisms. Each black dashed node is a dual spin. The blue dashed node is the logical spin, whose $\mp1$ configurations represent whether a logical error occurs. The red dashed node is an auxiliary spin, which is added to restore planarity. (b) Tensor Network of a $d=3$, $r=2$ surface code. White nodes are XOR tensors, which enforce detector parity, with red dangling edges imposing detection constraints. Probability tensors (black edges, colored surfaces/bodies) encode error mechanisms, assigning weights $\theta_i$ and $1-\theta_i$ to triggered and non-triggered states, respectively.
  • Figure 2: (Upper) Optimization dynamics for a $d=7$, $r=7$ repetition code. The orange dashed line represents the noisy initialization, while black lines denote the ground truth. Colored dots (blue to red) indicate parameter evolution during training, with the final red dots showing excellent agreement with the true values. Inset: Correlation between the optimization objective (NLL loss) and parameter accuracy (relative error), demonstrating that minimizing the approximate NLL directly improves estimation fidelity. (Lower) Surface code reconstruction ($d=3$, $r=5$, $\epsilon_p=0.001$). Comparison of initial (orange dashed), optimized (red dots), and true (black lines) parameters for a rotated surface code using Tensor Network contraction. For visual clarity, each red dot in the main plot is the mean of the optimized priors that share the same ground-truth value. The method effectively recovers the underlying detector error models.
  • Figure 3: Numerical Results. (Left) Experimental validation on BAQIS superconducting repetition codes. Logical error probability per round $P_L$ versus code distance $d\in\{3,5,7,9\}$. Dashed lines correspond to initial parameters derived from correlation analysis, while solid lines represent performance after our optimization. The method consistently suppresses errors for both MWPM and Planar decoders. Inset: Relative improvement percentages, highlighting a significant gain for the Planar decoder at larger distances. (Right) Optimization on the Google Sycamore surface code ($d=5$) dataset. Logical error rates are compared across three decoders (TN, Tesseract, Belief Matching) using DEMs derived from three distinct sources: our TN-based optimization, RL-based Belief Matching, and correlation analysis. Our TN-optimized DEM demonstrates superior transferability, achieving the lowest logical error rates for high-performance decoders (TN and Tesseract), whereas RL-optimized models show limited generalizability outside their training decoder.
  • Figure S1: Equivalent transformation of Ising model.
  • Figure S2: Benchmark on a $d=3$, $r=5$ repetition code. This model contains $12$ detectors and therefore $2^{12}=4096$ detector configurations. For every configuration $\mathbf{s}$, we employed the Planar algorithm to exactly compute the probabilities $p_{\mathrm{data}}(\mathbf{s})$ and $p_{\bm{\theta}}(\mathbf{s})$, yielding an exact NLL loss. The orange line represents the randomly perturbed initial values. The gray gradient lines illustrate the evolution of the parameters during training. The red dots and the black line represent the converged results and the ground truth values, respectively.
  • ...and 4 more figures
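
The three-step loop in the Figure 1 caption (initialize, evaluate the NLL, update by gradient descent) and the exact-NLL benchmark in the Figure S2 caption can be imitated on a toy model. The sketch below is not the paper's method: it replaces the Planar solver and the tensor-network contraction with brute-force enumeration over all error patterns, which only works for a handful of mechanisms. The model (`MECHANISMS`, two detectors, three error mechanisms), the ground-truth values, the logit parametrization, the learning rate, and all function names are assumptions made for illustration. Because $p_{\bm{\theta}}(\mathbf{s})$ is linear in each $\theta_i$, its exact derivative is available in closed form without an autodiff library.

```python
import itertools
import math

# Hypothetical toy detector error model (not from the paper):
# MECHANISMS[i] lists the detectors flipped when error mechanism i fires.
MECHANISMS = [(0,), (0, 1), (1,)]
NUM_DETECTORS = 2


def syndrome_of(errors):
    """XOR together the detectors flipped by every mechanism that fired."""
    s = [0] * NUM_DETECTORS
    for fired, detectors in zip(errors, MECHANISMS):
        if fired:
            for d in detectors:
                s[d] ^= 1
    return tuple(s)


def probs_and_grads(theta):
    """Return p_theta(s) and d p_theta(s) / d theta_i for every syndrome s.

    Brute force over all 2**len(theta) error patterns (toy sizes only).
    """
    n = len(theta)
    syndromes = list(itertools.product((0, 1), repeat=NUM_DETECTORS))
    p = {s: 0.0 for s in syndromes}
    dp = {s: [0.0] * n for s in syndromes}
    for errors in itertools.product((0, 1), repeat=n):
        s = syndrome_of(errors)
        # Weight theta_i if mechanism i fired, 1 - theta_i otherwise.
        weights = [t if e else 1.0 - t for e, t in zip(errors, theta)]
        p[s] += math.prod(weights)
        for i in range(n):
            rest = math.prod(w for j, w in enumerate(weights) if j != i)
            # d(theta_i)/d(theta_i) = +1 and d(1 - theta_i)/d(theta_i) = -1.
            dp[s][i] += rest if errors[i] else -rest
    return p, dp


def nll(p_data, p_model):
    """Exact negative log-likelihood averaged over the data distribution."""
    return -sum(q * math.log(p_model[s]) for s, q in p_data.items())


def fit(p_data, theta_init, lr=1.0, steps=5000):
    """Plain gradient descent on logits so that every theta_i stays in (0, 1)."""
    phi = [math.log(t / (1.0 - t)) for t in theta_init]
    for _ in range(steps):
        theta = [1.0 / (1.0 + math.exp(-x)) for x in phi]
        p, dp = probs_and_grads(theta)
        for i in range(len(phi)):
            grad_theta = -sum(q / p[s] * dp[s][i] for s, q in p_data.items())
            # Chain rule through the sigmoid: d theta / d phi = theta (1 - theta).
            phi[i] -= lr * grad_theta * theta[i] * (1.0 - theta[i])
    return [1.0 / (1.0 + math.exp(-x)) for x in phi]


# Ground truth and exact data distribution, in the spirit of the S2 benchmark.
theta_true = [0.10, 0.20, 0.15]
p_data, _ = probs_and_grads(theta_true)

theta_init = [0.05, 0.05, 0.05]
theta_fit = fit(p_data, theta_init)
print("fitted:", [round(t, 4) for t in theta_fit])
```

The enumeration cost grows as $2^n$ in the number of error mechanisms, which is why a real code needs an exact solver or a contraction scheme in its place. The toy model is chosen so that the three probabilities are identifiable from the four syndrome frequencies when every $\theta_i < 1/2$.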