QAGT-MLP: An Attention-Based Graph Transformer for Small and Large-Scale Quantum Error Mitigation

Seyed Mohamad Ali Tousi, G. N. DeSouza

TL;DR

Noisy quantum devices demand error mitigation that balances accuracy with low shot and processing overhead. QAGT-MLP introduces a lightweight graph-attention architecture that encodes circuits as graphs and extracts dual-context features (a global structural context and a local lightcone context). These features are concatenated with circuit descriptors and noisy observables and passed to an MLP that predicts the mitigated values. On large-scale 100-qubit transverse-field Ising model (TFIM) circuits, the approach achieves lower mean error and better error stability than a state-of-the-art RF baseline, and it maintains accuracy close to zero-noise extrapolation (ZNE) across varying Trotter steps without ZNE's sampling overhead. The work offers a scalable, practical pathway to QEM in modern quantum workloads, combining structure-aware representation learning with efficient inference and deployment.

Abstract

Noisy quantum devices demand error-mitigation techniques that are accurate yet simple and efficient in terms of the number of shots and processing time. Many established approaches (e.g., extrapolation and quasi-probability cancellation) impose substantial execution or calibration overheads, while existing learning-based methods have difficulty scaling to large and deep circuits. In this research, we introduce QAGT-MLP: an attention-based graph transformer tailored for small- and large-scale quantum error mitigation (QEM). QAGT-MLP encodes each quantum circuit as a graph whose nodes represent gate instances and whose edges capture qubit connectivity and causal adjacency. A dual-path attention module extracts features around measured qubits in two contexts: 1) a graph-wide global structural context; and 2) a fine-grained local lightcone context. These learned representations are concatenated with circuit-level descriptor features and the circuit's noisy expected values, and then passed to a lightweight MLP that predicts the noise-mitigated values. On large-scale 100-qubit Trotterized 1D transverse-field Ising model (TFIM) circuits, the proposed QAGT-MLP outperformed state-of-the-art learning baselines in terms of mean error and error variability under matched shot budgets, demonstrating strong validity and applicability in real-world QEM scenarios. By using attention to fuse global structures with local lightcone neighborhoods, QAGT-MLP achieves high mitigation quality without the noise scaling or added resource demands required by classical QEM pipelines, offering a scalable and practical path to QEM in modern and future quantum workloads.
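The graph encoding and lightcone notion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the gate-list format, the function names `circuit_to_graph` and `backward_lightcone`, and the choice to connect only consecutive gates on a shared qubit are all assumptions made for illustration. The attention layers and the MLP are omitted entirely.

```python
from collections import defaultdict


def circuit_to_graph(gates):
    """Build a gate-level directed graph (hypothetical encoding).

    gates: list of (name, qubits) tuples in execution order.
    Returns a sorted list of edges (u, v), where gate v is the next gate
    after gate u on at least one shared qubit (causal adjacency).
    """
    last_on_qubit = {}  # qubit -> index of the most recent gate acting on it
    edges = set()
    for idx, (_name, qubits) in enumerate(gates):
        for q in qubits:
            if q in last_on_qubit:
                edges.add((last_on_qubit[q], idx))
            last_on_qubit[q] = idx
    return sorted(edges)


def backward_lightcone(gates, edges, measured_qubit):
    """Return indices of all gates that can causally affect measured_qubit."""
    preds = defaultdict(list)
    for u, v in edges:
        preds[v].append(u)
    on_qubit = [i for i, (_n, qs) in enumerate(gates) if measured_qubit in qs]
    if not on_qubit:
        return []
    stack, seen = [on_qubit[-1]], set()  # start at the last gate on the qubit
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(preds[node])
    return sorted(seen)


# Toy example (assumed, not from the paper): one Trotter-like layer on 3 qubits.
gates = [("rzz", (0, 1)), ("rzz", (1, 2)), ("rx", (0,)), ("rx", (1,)), ("rx", (2,))]
edges = circuit_to_graph(gates)
cone_q0 = backward_lightcone(gates, edges, 0)
cone_q2 = backward_lightcone(gates, edges, 2)
```

In this toy circuit the lightcone of qubit 0 excludes the second `rzz` gate, because that gate acts on qubit 1 only after qubit 1 has finished interacting with qubit 0. A model in the spirit of QAGT-MLP would attend over the full node set for the global context and over such a lightcone subset for the local context.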

Paper Structure

This paper contains 21 sections, 1 equation, 4 figures, 2 tables.

Figures (4)

  • Figure 1: The proposed QAGT-MLP model. The circuit measurement-based and structural features are concatenated with the local lightcone and global circuit-wide contexts to form an input feature map to the MLP. The MLP produces the final estimation of the corrected expected values.
  • Figure 2: Comparison of the per-qubit mean absolute error (distance) to the ideal values obtained with QAGT-MLP and RF. QAGT-MLP outperforms RF on all qubits in both the mean distance and its standard deviation.
  • Figure 3: Results of the ablation study on the proposed components of QAGT-MLP. The columns represent: Full: the full proposed QAGT-MLP architecture; GCN_Backbone: a graph convolutional network used as the backbone instead of graph transformers; NoGlobal: the global context features are not extracted; and NoLightcone: the causal lightcone features for each measured qubit are not extracted.
  • Figure 4: Mean error of the QAGT-MLP, RF, and unmitigated outputs with respect to the ZNE across Trotter steps. The QAGT-MLP exhibits minimal deviation from ZNE, indicating that it can effectively replace ZNE without additional sampling overhead.
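
The per-qubit statistics named in the Figure 2 caption (mean absolute distance to the ideal values and its standard deviation) can be computed as in the sketch below. The function name `per_qubit_error_stats`, the nested-list data layout, the use of the population standard deviation, and the toy numbers are all assumptions for illustration; none of the values come from the paper.

```python
from statistics import mean, pstdev


def per_qubit_error_stats(predicted, ideal):
    """Mean and standard deviation of |predicted - ideal| for each qubit.

    predicted, ideal: lists with one entry per circuit, each entry being a
    list of per-qubit expected values (assumed layout).
    Returns a list of (mean_abs_error, std_abs_error), one tuple per qubit.
    """
    n_qubits = len(ideal[0])
    stats = []
    for q in range(n_qubits):
        errs = [abs(p[q] - t[q]) for p, t in zip(predicted, ideal)]
        stats.append((mean(errs), pstdev(errs)))
    return stats


# Toy data: 2 circuits, 2 qubits (illustrative values only).
ideal = [[1.0, 0.5], [0.0, -0.5]]
predicted = [[0.75, 0.5], [0.25, -0.25]]
stats = per_qubit_error_stats(predicted, ideal)
```

The same function could be applied with ZNE outputs in place of `ideal` to obtain a deviation of the kind plotted in Figure 4, assuming that figure uses an absolute-error measure.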