Discovering autonomous quantum error correction via deep reinforcement learning
Yue Yin, Tailong Xiao, Xiaoyang Deng, Ming He, Jianping Fan, Guihua Zeng
TL;DR
This work addresses the challenge of discovering practical AQEC codes for bosonic systems under realistic loss channels by combining curriculum learning with deep reinforcement learning (DRL). It introduces a semi-analytical master-equation solver to accelerate training and proposes a two-phase curriculum that first identifies encodings exceeding the breakeven fidelity $F_{be}$ and then optimizes their long-term stability. The key finding is the GRL code with logical states $|0_L\rangle=|4\rangle$ and $|1_L\rangle=|7\rangle$, and an engineered Lindblad operator $L_{\text{eng}}\propto|3\rangle\langle 2|+|4\rangle\langle 3|+|6\rangle\langle 5|+|7\rangle\langle 6|$, which maintains high fidelity even with double-photon loss, in agreement with KL-condition analysis for the corresponding error set $\text{E}$. The results show improved robustness to phase and amplitude damping, and feasible experimental implementation with short Hamiltonian distance ($d_g=1$), indicating strong potential for fault-tolerant quantum memories. Overall, curriculum-learning guided DRL offers a powerful framework for discovering adaptable, high-performance AQEC codes in early fault-tolerant quantum systems.
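To make the action of the engineered operator concrete, the following minimal NumPy sketch builds the unnormalized $L_{\text{eng}}$ quoted above in a truncated Fock basis and applies it to the codewords after photon loss. The truncation `DIM = 10` and all helper names (`fock`, `engineered_operator`, `annihilation`, `normalize`) are illustrative assumptions, not taken from the paper, and the sketch ignores the actual dissipative dynamics; it only checks which Fock states the operator connects.

```python
import numpy as np

DIM = 10  # Fock-space truncation (assumption): just large enough to hold |7>


def fock(n, dim=DIM):
    """Return the Fock state |n> as a real vector."""
    v = np.zeros(dim)
    v[n] = 1.0
    return v


def engineered_operator(dim=DIM):
    """Unnormalized L_eng = |3><2| + |4><3| + |6><5| + |7><6|."""
    L = np.zeros((dim, dim))
    for target, source in [(3, 2), (4, 3), (6, 5), (7, 6)]:
        L[target, source] = 1.0
    return L


def annihilation(dim=DIM):
    """Truncated annihilation operator a, with a|n> = sqrt(n)|n-1>."""
    return np.diag(np.sqrt(np.arange(1, dim)), k=1)


def normalize(v):
    """Rescale a nonzero vector to unit norm."""
    return v / np.linalg.norm(v)


L_eng = engineered_operator()
a = annihilation()

# Single-photon loss followed by one application of L_eng.
after_single_0 = L_eng @ normalize(a @ fock(4))
after_single_1 = L_eng @ normalize(a @ fock(7))

# Double-photon loss followed by two applications of L_eng.
after_double_0 = L_eng @ L_eng @ normalize(a @ a @ fock(4))
after_double_1 = L_eng @ L_eng @ normalize(a @ a @ fock(7))

# The codewords themselves are annihilated by L_eng in this sketch.
on_code_0 = L_eng @ fock(4)
on_code_1 = L_eng @ fock(7)
```

In this toy picture, $|2\rangle$ and $|3\rangle$ are driven back to $|4\rangle$, while $|5\rangle$ and $|6\rangle$ are driven back to $|7\rangle$, and the operator leaves no weight on the codewords when applied to them directly.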
Abstract
Quantum error correction is essential for fault-tolerant quantum computing. However, standard methods that rely on active measurements may introduce additional errors. Autonomous quantum error correction (AQEC) circumvents this by using engineered dissipation and drives in bosonic systems, but identifying practical encodings remains challenging because of the stringent Knill-Laflamme conditions. In this work, we use curriculum-learning-enabled deep reinforcement learning to discover bosonic codes, within an approximate AQEC framework, that resist both single-photon and double-photon losses. We present an analytical solution of the master equation under approximation conditions, which significantly accelerates the training of the reinforcement learning agent. The agent first identifies an encoded subspace that surpasses the breakeven point through rapid exploration within a constrained evolution time, then fine-tunes its policy to sustain this performance advantage over longer time horizons. We find that the two-phase trained agent discovers the optimal set of codewords, namely the Fock states $\ket{4}$ and $\ket{7}$, when both single-photon and double-photon losses are taken into account. The discovered code surpasses the breakeven threshold over a longer evolution time and achieves state-of-the-art performance. We also analyze the robustness of the code against phase damping and amplitude damping noise. Our work highlights the potential of curriculum-learning-enabled deep reinforcement learning for discovering optimal quantum error correction codes, especially in early fault-tolerant quantum systems.
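As a rough illustration of how the Knill-Laflamme conditions constrain a candidate code, the sketch below evaluates the matrix elements $\langle i_L|E_a^\dagger E_b|j_L\rangle$ for the codewords $\ket{4}$ and $\ket{7}$. The error set $\{I, a, a^2\}$ used here is an assumption chosen to represent no loss, single-photon loss, and double-photon loss; it is not quoted from the paper, and the truncation `DIM = 12` and the helper names are likewise hypothetical.

```python
import numpy as np

DIM = 12  # Fock-space truncation (assumption)

# Truncated annihilation operator: a|n> = sqrt(n)|n-1>.
a = np.diag(np.sqrt(np.arange(1, DIM)), k=1)

# Assumed error set {I, a, a^2}; not taken from the paper.
errors = [np.eye(DIM), a, a @ a]


def fock(n, dim=DIM):
    """Return the Fock state |n> as a real vector."""
    v = np.zeros(dim)
    v[n] = 1.0
    return v


def kl_matrices(error_ops, c0, c1):
    """Return <c0|Ea^T Eb|c1>, <c0|Ea^T Eb|c0>, <c1|Ea^T Eb|c1> as matrices in (a, b)."""
    n = len(error_ops)
    cross = np.zeros((n, n))
    diag0 = np.zeros((n, n))
    diag1 = np.zeros((n, n))
    for i, Ea in enumerate(error_ops):
        for j, Eb in enumerate(error_ops):
            op = Ea.T @ Eb  # all operators are real, so transpose equals dagger
            cross[i, j] = c0 @ op @ c1
            diag0[i, j] = c0 @ op @ c0
            diag1[i, j] = c1 @ op @ c1
    return cross, diag0, diag1


cross, diag0, diag1 = kl_matrices(errors, fock(4), fock(7))
```

Under these assumptions the cross terms vanish, because losing at most two photons from $\ket{4}$ and from $\ket{7}$ never produces overlapping Fock states. The diagonal terms differ between the two codewords (for example $\langle 4|a^\dagger a|4\rangle = 4$ versus $\langle 7|a^\dagger a|7\rangle = 7$), so for this assumed error set the conditions are satisfied only in part.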
