Scalable testing of quantum error correction
John Zhuoyang Ye, Jens Palsberg
TL;DR
This paper addresses the scalability problem in benchmarking quantum error correction (QEC) by introducing ScaLER, which combines stratified fault injection with S-curve extrapolation to estimate logical error rates (LERs) beyond the reach of existing tools such as Stim. By concentrating testing on high-weight fault subspaces and fitting a predictive S-curve, ScaLER produces accurate LER estimates at larger code distances (up to distance 17) within a practical two-hour desktop budget, demonstrated on surface, toric, and QLDPC codes. The work provides a formal modeling framework for per-weight logical error rates, a practical sweet-spot concept for balancing accuracy and cost, and an end-to-end algorithm with an open-source implementation. Overall, ScaLER offers a general and scalable way to benchmark QEC implementations under realistic noise models, which could speed up the assessment, comparison, and validation of fault-tolerant schemes and architectures.
Abstract
The standard method for benchmarking quantum error correction is randomized fault-injection testing. The state-of-the-art tool Stim is efficient for error-correction implementations with distances of up to 10, but scales poorly to larger distances at low physical error rates. In this paper, we present a scalable approach that combines stratified fault injection with extrapolation. Our insight is that some of the fault space can be sampled efficiently, after which extrapolation is sufficient to complete the testing task. As a result, our tool scales to distance 17 for a physical error rate of 0.0005 with a two-hour time budget on a desktop. For this case, it estimated a logical error rate of $1.51 \times 10^{-11}$ with high confidence.
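The idea of stratifying by fault weight can be illustrated with a minimal sketch. It assumes a simplified noise model in which each of `n_faults` possible fault locations fails independently with probability `p`, so the number of faults is binomially distributed and the overall logical error rate is the binomial-weighted sum of per-weight logical error rates. The function names and the logistic-shaped `toy_s_curve` (with its parameters `mu` and `alpha`) are hypothetical placeholders for illustration, not the model or API used by ScaLER.

```python
import math


def binom_pmf(n, w, p):
    """Probability of exactly w faults among n independent locations (0 < p < 1)."""
    # Work in log space so large n does not overflow the binomial coefficient.
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(w + 1) - math.lgamma(n - w + 1)
        + w * math.log(p) + (n - w) * math.log1p(-p)
    )
    return math.exp(log_pmf)


def stratified_ler(n_faults, p, per_weight_ler):
    """Combine per-weight logical error rates into one overall estimate.

    per_weight_ler is a callable w -> P(logical error | exactly w faults).
    In practice it would come from sampling at some weights and
    extrapolating to the rest; here it is just a function argument.
    """
    return sum(
        binom_pmf(n_faults, w, p) * per_weight_ler(w)
        for w in range(n_faults + 1)
    )


def toy_s_curve(w, mu=40.0, alpha=4.0):
    """Hypothetical S-shaped per-weight rate (assumption, not the paper's fit)."""
    return 0.5 / (1.0 + math.exp(-(w - mu) / alpha))


if __name__ == "__main__":
    # Illustrative numbers only: 1000 fault locations, p = 0.0005.
    print(stratified_ler(1000, 0.0005, toy_s_curve))
```

The appeal of this decomposition is that the binomial weights are known exactly, so sampling effort only goes into the per-weight rates. Weights where logical errors are frequent are cheap to measure directly, while the rare low-weight cases are the ones left to extrapolation.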
