QAGT-MLP: An Attention-Based Graph Transformer for Small and Large-Scale Quantum Error Mitigation
Seyed Mohamad Ali Tousi, G. N. DeSouza
TL;DR
Noisy quantum devices demand error mitigation that balances accuracy with low shot and processing overhead. QAGT-MLP introduces a lightweight graph-attention architecture that encodes circuits as graphs and extracts dual-context features (global structural context and local lightcone context). These features are concatenated with circuit descriptors and noisy observables, then passed to an MLP that predicts the mitigated values. On large-scale 100-qubit TFIM circuits, the approach achieves lower mean error and lower error variability than a state-of-the-art RF baseline, and it maintains accuracy comparable to zero-noise extrapolation (ZNE) across varying Trotter steps without the associated overhead. The work offers a scalable, practical pathway to quantum error mitigation (QEM) in modern quantum workloads, combining structure-aware representation learning with efficient inference and deployment.
Abstract
Noisy quantum devices demand error-mitigation techniques that are accurate yet simple and efficient in terms of shot count and processing time. Many established approaches (e.g., extrapolation and quasi-probability cancellation) impose substantial execution or calibration overheads, while existing learning-based methods have difficulty scaling to large and deep circuits. In this work, we introduce QAGT-MLP: an attention-based graph transformer tailored for small- and large-scale quantum error mitigation (QEM). QAGT-MLP encodes each quantum circuit as a graph whose nodes represent gate instances and whose edges capture qubit connectivity and causal adjacency. A dual-path attention module extracts features around measured qubits in two contexts: 1) graph-wide global structural context; and 2) fine-grained local lightcone context. These learned representations are concatenated with circuit-level descriptor features and the circuit's noisy expectation values, then passed to a lightweight MLP that predicts the noise-mitigated values. On large-scale 100-qubit Trotterized 1D transverse-field Ising model (TFIM) circuits, QAGT-MLP outperformed state-of-the-art learning baselines in terms of mean error and error variability under matched shot budgets, demonstrating its applicability to real-world QEM scenarios. By using attention to fuse global structure with local lightcone neighborhoods, QAGT-MLP achieves high mitigation quality without the noise scaling or growing resource demands required by classical QEM pipelines, offering a scalable and practical path to QEM in modern and future quantum workloads.
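To make the graph encoding and the local lightcone context concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes a circuit is given as an ordered list of (gate name, qubits) pairs, treats "causal adjacency" as an edge between consecutive gates that share a qubit, and takes the lightcone of a measured qubit to be every gate that can causally reach that qubit's final gate. The function names `circuit_to_graph` and `backward_lightcone` are hypothetical, and the attention module and MLP are not sketched here.

```python
from collections import deque


def circuit_to_graph(gates):
    """Encode a circuit as a DAG with one node per gate instance.

    gates: list of (name, qubits) tuples in execution order (assumed format).
    Returns (edges, last_gate), where edges is a sorted list of (src, dst)
    node indices and last_gate maps each qubit to its final gate index.
    """
    last_gate = {}
    edges = set()
    for idx, (_name, qubits) in enumerate(gates):
        for q in qubits:
            # Link to the previous gate that acted on the same qubit.
            if q in last_gate:
                edges.add((last_gate[q], idx))
            last_gate[q] = idx
    return sorted(edges), last_gate


def backward_lightcone(edges, last_gate, measured_qubit):
    """Return the set of gate indices that can influence the measured qubit."""
    predecessors = {}
    for src, dst in edges:
        predecessors.setdefault(dst, []).append(src)

    start = last_gate.get(measured_qubit)
    if start is None:
        return set()  # no gate ever touched this qubit

    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for pred in predecessors.get(node, []):
            if pred not in seen:
                seen.add(pred)
                queue.append(pred)
    return seen


# One Trotter step of a 3-qubit 1D TFIM (illustrative gate names).
gates = [
    ("rzz", (0, 1)),  # node 0
    ("rzz", (1, 2)),  # node 1
    ("rx", (0,)),     # node 2
    ("rx", (1,)),     # node 3
    ("rx", (2,)),     # node 4
]
edges, last_gate = circuit_to_graph(gates)
print(edges)                                           # [(0, 1), (0, 2), (1, 3), (1, 4)]
print(sorted(backward_lightcone(edges, last_gate, 0)))  # [0, 2]
print(sorted(backward_lightcone(edges, last_gate, 2)))  # [0, 1, 4]
```

In this toy circuit the lightcone of qubit 0 excludes the second RZZ gate, which acts only after the first RZZ and never feeds back into qubit 0. A local-context path restricted to such a subgraph sees only the gates relevant to the measured observable, while a global path would attend over all five nodes.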
