Extrapolative Quantum Error Mitigation in Continuous-Variable Systems beyond the Training Horizon

Jingpeng Zhang; Shengyong Li; Jie Han; Qianchuan Zhao; Jing Zhang; Zeliang Xiang

TL;DR

This work introduces a framework for extrapolative quantum error mitigation based on a time-conditioned Swin Transformer and establishes extrapolative QEM as a practical route to mitigating noise in CV quantum systems without exhaustive training data.

Abstract

Continuous-variable (CV) quantum systems provide a versatile platform for quantum information processing, in which quantum states can be represented in the quadrature phase space. In realistic implementations, environmental noise, primarily photon loss and dephasing, progressively degrades these states. Machine-learning-based quantum error mitigation (QEM) has recently emerged as a promising approach to suppress such noise; however, existing methods are typically limited to the training horizon and require training data that cover the entire evolution, which is experimentally demanding. Here we introduce a framework for extrapolative quantum error mitigation based on a time-conditioned Swin Transformer. By explicitly embedding the evolution time via adaptive layer normalization, the model learns a correction map that accounts for the continuous accumulation of noise while capturing nonlocal phase-space correlations. Numerical simulations under both Markovian and non-Markovian noise demonstrate accurate state recovery in the long-time regime, where existing approaches deteriorate. Our results establish extrapolative QEM as a practical route to mitigating noise in CV quantum systems without exhaustive training data.
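The time-conditioning idea described in the abstract can be illustrated with a minimal NumPy sketch of adaptive layer normalization. This is not the authors' implementation: the sinusoidal time features, the two-layer MLP, the `tanh` activation, the random initialization, and all names (`time_embedding`, `AdaLN`, `token_dim`, `embed_dim`, `hidden_dim`) are assumptions chosen for illustration. The sketch only shows the general pattern, in which an embedding of the evolution time $\tau$ produces a per-feature scale and shift that modulate the normalized tokens.

```python
import numpy as np


def layer_norm(x, eps=1e-6):
    """Normalize each token (last axis) to zero mean and unit variance."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def time_embedding(tau, dim):
    """Map a scalar time to sinusoidal features (an assumed choice; dim must be even)."""
    half = dim // 2
    freqs = np.exp(-np.log(1000.0) * np.arange(half) / half)
    angles = tau * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)])


class AdaLN:
    """Hypothetical adaptive layer norm: scale and shift are predicted from tau."""

    def __init__(self, token_dim, embed_dim, hidden_dim=32, seed=0):
        rng = np.random.default_rng(seed)
        self.embed_dim = embed_dim
        # Small MLP mapping the time embedding to 2 * token_dim modulation values.
        self.w1 = rng.normal(0.0, 0.1, (embed_dim, hidden_dim))
        self.b1 = np.zeros(hidden_dim)
        self.w2 = rng.normal(0.0, 0.1, (hidden_dim, 2 * token_dim))
        self.b2 = np.zeros(2 * token_dim)

    def __call__(self, tokens, tau):
        e = time_embedding(tau, self.embed_dim)
        h = np.tanh(e @ self.w1 + self.b1)
        scale, shift = np.split(h @ self.w2 + self.b2, 2)
        # Time-dependent affine modulation of the normalized tokens.
        return layer_norm(tokens) * (1.0 + scale) + shift


# Usage: the same tokens are modulated differently at different evolution times.
tokens = np.random.default_rng(1).normal(size=(16, 4))  # 16 tokens, 4 features
adaln = AdaLN(token_dim=4, embed_dim=8)
out_early = adaln(tokens, tau=0.5)
out_late = adaln(tokens, tau=2.0)
```

Because the scale and shift are smooth functions of $\tau$, a layer of this kind can in principle be evaluated at times outside the range seen during training, which is the property the extrapolative setting relies on. Whether it extrapolates well in practice depends on the trained weights and the noise model.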

Paper Structure

This paper contains 14 sections, 9 equations, 4 figures, 1 table, and 1 algorithm.

Introduction
Methods
Problem Formulation
Data Generation via Fiducial Processes
Time-Conditioned Neural Architecture
Training and Inference
Numerical Demonstrations
Overview and Experimental Protocols
Markovian Regime: Multi-channel Snapshot Protocol
Non-Markovian Regime: Iterative Step-wise Protocol
Model Configurations and Generalization Protocol
Markovian Dynamics: Amplitude Calibration in Long-time Evolution
Non-Markovian Dynamics: Memory Effects and Fine Feature Preservation
Conclusion

Figures (4)

Figure 1: (a) Training data generation via fiducial evolution. The system evolves under the Hamiltonian $H$ to produce reference states at discrete times $t_{n}$ within the training horizon. Each reference state is then subjected to a fiducial sequence $\mathcal{U}_{\text{fid}}(\tau)$ in the presence of environmental noise, generating the corresponding noisy state. (b) Neural network architecture. Noisy Wigner function inputs are processed by time-conditioned Swin Transformer blocks. The evolution time $\tau$ is embedded through a multi-layer perceptron (MLP) and injected into each Swin Transformer block via Adaptive Layer Normalization (AdaLN), enabling the network to model time-dependent noise accumulation.
Figure 2: Cosine similarity for Kerr nonlinearity ($K=1.2$). Training uses loss rates $\kappa\in\{0.3,0.4,0.5,0.6,0.7\}$, and testing uses loss rates $\kappa\in\{0.3,0.4,0.5,0.6,0.7\}$. Within the training horizon $T_{\text{train}}=1.0$, both models achieve high similarity. Beyond this horizon, the CNN U-Net (green) degrades rapidly due to amplitude miscalibration, showing numerical overflow and spurious background excitation (red arrows) at $t=2.0$. In contrast, the time-conditioned Swin Transformer (orange) maintains stable reconstruction by dynamically adapting the normalization scale through AdaLN.
Figure 3: Extrapolation for driven squeezing dynamics (training loss rates rescaled by $1/3$, testing at $\kappa\in\{0.1,0.133,0.167,0.2,0.233\}$). The reduced dissipation mitigates amplitude errors; the CNN U-Net (green) achieves higher similarity ($\sim$0.92) than in the Kerr case, because cosine similarity is insensitive to global amplitude deviations when the geometric structure is preserved. The Swin Transformer (orange) maintains superior similarity ($\sim$0.97).
Figure 4: Performance under non-Markovian dynamics (training at loss rates $\kappa\in\{0.3,0.4,0.5,0.6,0.7\}$, testing at $\kappa_{\text{forward}}=0.3$). The dashed line marks the training horizon $t=1.0$. Beyond this horizon ($t>1.0$), the CNN U-Net (green triangles) degrades due to shape distortion and loss of fine structural detail. In contrast, the Swin Transformer (orange squares) better preserves structural correlations through sensitive feature extraction, maintaining $\sim$0.92 similarity with characteristic non-monotonic behavior.
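
The remark in the Figure 3 caption that cosine similarity is insensitive to global amplitude deviations can be checked with a short NumPy sketch. The grid, the vacuum-state Wigner function (in the convention $\hbar=1$, $W(x,p)=e^{-(x^{2}+p^{2})}/\pi$), and the function name `cosine_similarity` are illustrative assumptions; the paper's exact metric definition and discretization are not given in this excerpt.

```python
import numpy as np


def cosine_similarity(w_a, w_b):
    """Cosine of the angle between two Wigner functions flattened to vectors."""
    a = np.ravel(w_a)
    b = np.ravel(w_b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Phase-space grid (assumed extent and resolution).
x = np.linspace(-4.0, 4.0, 81)
X, P = np.meshgrid(x, x)

# Vacuum-state Wigner function as the reference.
w_ref = np.exp(-(X**2 + P**2)) / np.pi

# Same shape, wrong global amplitude: the metric cannot see the error.
w_scaled = 0.5 * w_ref
sim_scaled = cosine_similarity(w_ref, w_scaled)

# Same amplitude, displaced along x by 1: the metric drops noticeably.
w_shifted = np.exp(-((X - 1.0) ** 2 + P**2)) / np.pi
sim_shifted = cosine_similarity(w_ref, w_shifted)

print(f"rescaled amplitude: {sim_scaled:.4f}")
print(f"displaced state:    {sim_shifted:.4f}")
```

For these two Gaussians the continuum value of the displaced-state similarity is $e^{-1/2}\approx 0.61$, while the rescaled state scores 1 up to rounding. A model that preserves the shape of the Wigner function but miscalibrates its overall amplitude can therefore still score highly on this metric.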