Impacts of Decoder Latency on Utility-Scale Quantum Computer Architectures
Abdullah Khalid, Allyson Silva, Gebremedhin A. Dagnew, Tom Dvir, Oded Wertheim, Motty Gruda, Xiangzhou Kong, Mia Kramer, Zak Webb, Artur Scherer, Masoud Mohseni, Yonatan Cohen, Pooya Ronagh
TL;DR
This work addresses the reaction-time bottleneck in fault-tolerant quantum computing (FTQC) by linking decoder and communication latencies to the performance of a surface-code-based architecture. It introduces a dual-reaction-time model with parameters γ_LS and γ_mem, and shows that decoding for the correction qubits imposes a memory-latency bottleneck that governs the achievable circuit throughput, even when lattice-surgery decoding can be parallelized. By developing logical-error-rate models for the post-corrected π/8 gadget and fitting lattice-surgery error parameters, the authors translate reaction-time effects into full-system resource estimates. These estimates reveal substantial physical-qubit overheads and runtime penalties for utility-scale circuits unless decoders and communications scale dramatically. The results underscore the need for faster decoders, higher-bandwidth interconnects, and potentially alternative codes or more efficient magic-state distillation to realize practical FTQC. They also carry concrete implications for the required decoder count (on the order of 15k for a 10M-qubit QPU) and for the space-time Pareto frontier of the core processor, magic state factory (MSF), and correction-storage regions.
Abstract
The speed of a fault-tolerant quantum computer is dictated by the reaction time of its classical electronics, that is, the total time required by decoders and controllers to determine the outcome of a logical measurement and execute subsequent conditional logical operations. Despite its importance, the reaction time and its impact on the design of the logical microarchitecture of a quantum computer are not well understood. In this work, we build, for a surface-code-based architecture, a model for the reaction time in which the decoder latency is based on parallel space- and time-window decoding methods, and communication latencies are drawn from our envisioned quantum execution environment comprising a high-speed network of quantum processing units, controllers, decoders, and high-performance computing nodes. We use this model to estimate the increase in the logical error rate of magic state injections as a function of the reaction time. Next, we show how the logical microarchitecture can be optimized with respect to the reaction time, and then present detailed full-system quantum and classical resource estimates for executing utility-scale quantum circuits based on realistic hardware noise parameters and state-of-the-art decoding times. For circuits with $10^{6}$--$10^{11}$ $T$ gates involving 200--2000 logical qubits, under a $\Lambda=9.3$ hardware model representative of a realistic target for superconducting quantum processors operating at a 2.86 MHz stabilization frequency, we show that even decoding at sub-microsecond speed per stabilization round introduces substantial resource overheads: approximately 100k--250k additional physical qubits for correction qubit storage in the magic state factory; 300k--1.75M extra physical qubits in the core processor due to the increase in code distance from $d$ to $d+4$ for extra memory protection; and a runtime that is longer by roughly a factor of 100.
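To give a feel for the scales quoted in the abstract, the following minimal Python sketch works through three back-of-the-envelope quantities: the duration of one stabilization round at 2.86 MHz, the extra physical qubits a single surface code patch needs when its distance grows from $d$ to $d+4$, and the corresponding logical error suppression. It rests on two assumptions that the abstract does not state: that each patch is a rotated surface code patch with $2d^2-1$ physical qubits, and that $\Lambda$ follows the common convention of being the factor by which the logical error rate drops when $d$ increases by 2. The function names and the example distance $d=21$ are hypothetical, and the sketch does not attempt to reproduce the paper's full-system totals, which depend on the layout of the core processor.

```python
def round_time_us(stabilization_freq_mhz: float) -> float:
    """Duration of one stabilization round in microseconds."""
    return 1.0 / stabilization_freq_mhz


def patch_qubits(d: int) -> int:
    """Physical qubits in one patch (ASSUMPTION: rotated surface code,
    d*d data qubits plus d*d - 1 measurement qubits)."""
    return 2 * d * d - 1


def extra_qubits_per_patch(d: int, delta: int = 4) -> int:
    """Additional physical qubits per patch when d grows to d + delta."""
    return patch_qubits(d + delta) - patch_qubits(d)


def suppression_factor(lam: float, delta: int = 4) -> float:
    """Logical error suppression from raising the distance by delta
    (ASSUMPTION: error rate falls by a factor lam per distance step of 2)."""
    return lam ** (delta / 2)


if __name__ == "__main__":
    d = 21  # hypothetical code distance, chosen only for illustration
    print(f"round time: {round_time_us(2.86):.3f} us")
    print(f"extra qubits per patch at d={d}: {extra_qubits_per_patch(d)}")
    print(f"suppression from d to d+4: {suppression_factor(9.3):.2f}x")
```

Under these assumptions the per-patch overhead is $16d+32$ physical qubits, so the cost of the extra memory protection grows linearly with the base distance and with the number of patches in the core processor.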
