Simultaneous Discovery of Quantum Error Correction Codes and Encoders with a Noise-Aware Reinforcement Learning Agent
Jan Olle, Remmy Zen, Matteo Puviani, Florian Marquardt
TL;DR
This work tackles hardware-aware quantum error correction by training a noise-aware reinforcement learning agent to discover both stabilizer codes and their encoding circuits automatically. A reward based on the Knill-Laflamme (KL) conditions and a vectorized Clifford simulator scale code discovery up to $[[n,k,d]] = [[20,13,3]]$ and $[[11,1,5]]$, while a noise-aware meta-agent generalizes strategies across asymmetric depolarizing channels characterized by the bias parameter $c_Z$. The approach relies on the stabilizer formalism for fast Clifford simulation and supports simultaneous discovery across multiple noise models; a CSS-focused extension substantially reduces memory requirements and broadens scalability. The results recover known codes, reveal diverse code families, and demonstrate transfer learning between noise settings. The work lays a foundation for scalable, hardware-adapted QEC code search and encoder synthesis on a broad range of platforms, and points to fault-tolerant extensions and larger-scale CSS-based explorations as promising future directions.
Abstract
In the ongoing race towards experimental implementations of quantum error correction (QEC), finding ways to automatically discover codes and encoding strategies tailored to the qubit hardware platform is emerging as a critical problem. Reinforcement learning (RL) has been identified as a promising approach, but so far it has been severely restricted in terms of scalability. In this work, we significantly expand the power of RL approaches to QEC code discovery. Explicitly, we train an RL agent that automatically discovers both QEC codes and their encoding circuits for a given gate set, qubit connectivity, and error model, from scratch. This is enabled by a reward based on the Knill-Laflamme conditions and a vectorized Clifford simulator, allowing us to scale our results to 20 physical qubits and distance-5 codes. Moreover, we introduce the concept of a noise-aware meta-agent, which learns to produce encoding strategies simultaneously for a range of noise models, thus leveraging transfer of insights between different situations. Our approach opens the door towards hardware-adapted accelerated discovery of QEC approaches across the full spectrum of quantum hardware platforms of interest.
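
To illustrate how a reward based on the Knill-Laflamme conditions can be evaluated within the stabilizer formalism, the following minimal sketch counts the Pauli errors up to a given weight that a stabilizer code fails to detect. For a stabilizer code, an error satisfies the conditions if it anticommutes with at least one generator or lies in the stabilizer group itself. The function names (`kl_violations`, `errors_up_to_weight`, and the other helpers) and the unweighted count are illustrative assumptions, not the authors' implementation; in particular, the weighting of errors by a noise model is omitted here.

```python
import itertools
import numpy as np


def pauli_to_symplectic(pauli):
    """Map a Pauli string such as 'XZZXI' to a binary vector (x | z)."""
    x = [1 if p in "XY" else 0 for p in pauli]
    z = [1 if p in "ZY" else 0 for p in pauli]
    return np.array(x + z, dtype=np.uint8)


def syndromes(stabs, errors):
    """Entry (i, j) is 1 iff error i anticommutes with generator j."""
    n = stabs.shape[1] // 2
    # Symplectic product: x_e . z_s + z_e . x_s (mod 2).
    swapped = np.concatenate([stabs[:, n:], stabs[:, :n]], axis=1)
    return (errors.astype(int) @ swapped.T.astype(int)) % 2


def gf2_rank(mat):
    """Rank of a binary matrix over GF(2) by Gaussian elimination."""
    m = mat.copy() % 2
    rows, cols = m.shape
    rank = 0
    for c in range(cols):
        pivot = next((r for r in range(rank, rows) if m[r, c]), None)
        if pivot is None:
            continue
        m[[rank, pivot]] = m[[pivot, rank]]
        for r in range(rows):
            if r != rank and m[r, c]:
                m[r] ^= m[rank]
        rank += 1
        if rank == rows:
            break
    return rank


def errors_up_to_weight(n, max_weight):
    """All non-identity n-qubit Paulis of weight 1..max_weight."""
    out = []
    for weight in range(1, max_weight + 1):
        for qubits in itertools.combinations(range(n), weight):
            for paulis in itertools.product("XYZ", repeat=weight):
                s = ["I"] * n
                for q, p in zip(qubits, paulis):
                    s[q] = p
                out.append(pauli_to_symplectic("".join(s)))
    return np.array(out, dtype=np.uint8)


def kl_violations(stabs, errors):
    """Count errors with zero syndrome that are not stabilizer elements."""
    syn = syndromes(stabs, errors)
    undetected = errors[~syn.any(axis=1)]
    base = gf2_rank(stabs)
    # A zero-syndrome error is harmless only if adding it does not raise the rank.
    return int(sum(gf2_rank(np.vstack([stabs, e])) > base for e in undetected))


# The well-known [[5,1,3]] code, used here only as a test case.
FIVE_QUBIT = np.array(
    [pauli_to_symplectic(s) for s in ["XZZXI", "IXZZX", "XIXZZ", "ZXIXZ"]]
)

if __name__ == "__main__":
    # Distance 3 requires every error of weight <= 2 to be detected.
    errs = errors_up_to_weight(5, 2)
    print("violations:", kl_violations(FIVE_QUBIT, errs))
    # A hypothetical, unweighted reward would be the negative of this count.
```

In this sketch a code of distance $d$ gives zero violations for all errors of weight up to $d-1$, so the negative violation count could serve as a reward signal that is maximal exactly when the target distance is reached. All errors are processed as rows of one binary matrix, which is the kind of batched evaluation a vectorized simulator makes cheap.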
