Hierarchical Qubit-Merging Transformer for Quantum Error Correction

Seong-Joon Park, Hee-Youl Kwak, Yongjune Kim

TL;DR

The Hierarchical Qubit-Merging Transformer (HQMT) is a novel and general decoding framework that explicitly leverages the structural graph of stabilizer codes to learn error correlations across multiple scales, yielding a scalable and effective decoder for surface codes.

Abstract

For reliable large-scale quantum computation, a quantum error correction (QEC) scheme must effectively resolve physical errors to protect logical information. Leveraging recent advances in deep learning, neural network-based decoders have emerged as a promising approach to enhance the reliability of QEC. We propose the Hierarchical Qubit-Merging Transformer (HQMT), a novel and general decoding framework that explicitly leverages the structural graph of stabilizer codes to learn error correlations across multiple scales. Our architecture first computes attention locally on structurally related groups of stabilizers and then systematically merges these qubit-centric representations to build a global view of the error syndrome. The proposed HQMT achieves substantially lower logical error rates for surface codes by integrating a dedicated qubit-merging layer within the transformer architecture. Across various code distances, HQMT significantly outperforms previous neural network-based QEC decoders as well as a powerful belief propagation with ordered statistics decoding (BP+OSD) baseline. This hierarchical approach provides a scalable and effective framework for surface code decoding, advancing the realization of reliable quantum computing.

Paper Structure

This paper contains 3 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: The architecture of the proposed HQMT. The model consists of two main stages. Stage 1 embeds the raw syndrome into separate Z- and X-type tokens and processes them in parallel to learn fine-grained correlations. A Qubit-Merging layer then integrates these representations into unified, coarse-grained tokens for each qubit. Stage 2 processes this merged sequence to learn non-local correlations between the unified qubit-level representations for all qubits. Finally, a fully connected layer classifies the input syndrome into one of the four logical error classes.
  • Figure 2: Illustration of the qubit-merging layer. For each physical qubit (gray circle), the layer takes the representations of its associated Z-stabilizers (red circles) and X-stabilizers (blue circles) and fuses them into a single token representing the complete local stabilizer context. This is implemented by concatenating the $d_{\text{model}}$-dimensional Z- and X-tokens for each qubit into a $2d_{\text{model}}$-dimensional vector, which is then projected back to $d_{\text{model}}$ dimensions by a fully connected (FC) layer. This process transforms the two fine-grained token sequences into a single coarse-grained sequence for the subsequent hierarchical stage.
  • Figure 3: Logical error rate as a function of the physical error rate for the surface code under a depolarizing noise model. The performance of the proposed HQMT is benchmarked against MWPM, FFNN, and CNN decoders for code distances (a) $d=3$, (b) $d=5$, (c) $d=7$, and (d) $d=9$.
  • Figure 4: Performance comparison of the proposed HQMT decoder against the strong BP+OSD baseline. The logical error rate (LER) is plotted as a function of the physical error rate ($p$) for the surface code under the depolarizing noise model. Results are shown for code distances $d=5$, $7$, $9$, and $11$. HQMT consistently outperforms the BP+OSD decoder across all tested distances.
  • Figure 5: (a) Ablation study of the HQMT architecture for the $d=11$ surface code. The LER of the full HQMT model is compared against two variants: "Stage 1 only," which uses only Stage 1, and "Stage 2 only," which uses only Stage 2. (b) Decoding performance of HQMT with different numbers of transformer blocks ($N$) per hierarchical stage for a distance $d=11$ surface code. Increasing $N$ from 1 to 3 yields a significant improvement, while the gain from $N=3$ to $N=5$ is marginal.
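
The qubit-merging step described in the Figure 2 caption (concatenate each qubit's Z- and X-tokens into a $2d_{\text{model}}$-dimensional vector, then project back to $d_{\text{model}}$ with a fully connected layer) can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' implementation: the function name `qubit_merge`, the toy sizes, and the random untrained weights are all assumptions, and a real model would use a deep learning framework with learned parameters.

```python
import random


def qubit_merge(z_tokens, x_tokens, weight, bias):
    """Fuse per-qubit Z- and X-tokens into one token per qubit.

    z_tokens, x_tokens: one d_model-dimensional list per qubit.
    weight: d_model rows of length 2 * d_model (the FC layer).
    bias: list of length d_model.
    """
    merged = []
    for z_tok, x_tok in zip(z_tokens, x_tokens):
        # Concatenate the two fine-grained tokens: length 2 * d_model.
        concat = list(z_tok) + list(x_tok)
        # Fully connected projection back to d_model dimensions.
        projected = [
            sum(w * c for w, c in zip(row, concat)) + b
            for row, b in zip(weight, bias)
        ]
        merged.append(projected)
    return merged


# Toy sizes, chosen only for illustration (hypothetical values).
rng = random.Random(0)
d_model = 4
n_qubits = 9

z_tokens = [[rng.gauss(0, 1) for _ in range(d_model)] for _ in range(n_qubits)]
x_tokens = [[rng.gauss(0, 1) for _ in range(d_model)] for _ in range(n_qubits)]

# Random, untrained FC parameters standing in for learned weights.
weight = [[rng.gauss(0, 0.1) for _ in range(2 * d_model)] for _ in range(d_model)]
bias = [0.0] * d_model

merged = qubit_merge(z_tokens, x_tokens, weight, bias)
print(len(merged), len(merged[0]))  # prints: 9 4
```

The output keeps one token per qubit at the original width, so Stage 2 receives a single coarse-grained sequence in place of the two parallel fine-grained sequences from Stage 1.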