QUIET-SR: Quantum Image Enhancement Transformer for Single Image Super-Resolution

Siddhant Dutta, Nouhaila Innan, Khadijeh Najafi, Sadok Ben Yahia, Muhammad Shafique

TL;DR

The paper tackles single-image super-resolution under the constraints of near-term quantum hardware by introducing QUIET-SR, a hybrid quantum-classical transformer that integrates Shifted Quantum Window attention built on variational quantum circuits. It extends the Swin Transformer with quantum attention to capture non-local, high-dimensional feature interactions while keeping circuit depth and qubit requirements suitable for NISQ devices. Empirical results on MNIST, FashionMNIST, and MedMNIST show competitive PSNR/SSIM with a compact model (~1.55 MB) and demonstrate robustness to realistic quantum noise, supported by analyses of long-range dependencies (Distance Correlation and HSIC). The paper also proposes scalable batching for multi-QPU–GPU quantum computing and offers a concrete resource and performance framework that informs future quantum vision research.

Abstract

Recent advancements in Single-Image Super-Resolution (SISR) using deep learning have significantly improved image restoration quality. However, the high computational cost of processing high-resolution images due to the large number of parameters in classical models, along with the scalability challenges of quantum algorithms for image processing, remains a major obstacle. In this paper, we propose the Quantum Image Enhancement Transformer for Super-Resolution (QUIET-SR), a hybrid framework that extends the Swin transformer architecture with a novel shifted quantum window attention mechanism, built upon variational quantum neural networks. QUIET-SR effectively captures complex residual mappings between low-resolution and high-resolution images, leveraging quantum attention mechanisms to enhance feature extraction and image restoration while requiring a minimal number of qubits, making it suitable for the Noisy Intermediate-Scale Quantum (NISQ) era. We evaluate our framework on MNIST (30.24 PSNR, 0.989 SSIM), FashionMNIST (29.76 PSNR, 0.976 SSIM), and the MedMNIST dataset collection, demonstrating that QUIET-SR achieves PSNR and SSIM scores comparable to state-of-the-art methods while using fewer parameters. Our efficient batching strategy directly enables massive parallelization on multiple QPUs, paving the way for practical quantum-enhanced image super-resolution through coordinated QPU-GPU quantum supercomputing.
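
For readers unfamiliar with the reported metric, PSNR is defined as $10 \log_{10}(\mathrm{peak}^2 / \mathrm{MSE})$ between a reference image and its reconstruction. The snippet below is a minimal sketch of that definition, not the authors' evaluation code: the function name `psnr`, the `peak` parameter, and the assumption that pixel values are scaled to $[0, 1]$ are illustrative choices that this excerpt does not specify.

```python
import numpy as np


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    `peak` is the maximum possible pixel value (assumed 1.0 here, i.e.
    images scaled to [0, 1]; use 255 for 8-bit images).
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0.0:
        # Identical images: the ratio is unbounded.
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / mse))


# Example: a 28x28 image with a uniform error of 0.1 per pixel.
hr = np.zeros((28, 28))
sr = np.full((28, 28), 0.1)
score = psnr(hr, sr)
```

With a uniform per-pixel error of 0.1 the MSE is 0.01, so the score is 20 dB; higher values indicate a closer reconstruction.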

Paper Structure

This paper contains 5 sections, 16 equations, 8 figures, 2 tables, 1 algorithm.

Figures (8)

  • Figure 1: This grid compares low-resolution (14×14), high-resolution (28×28), and QUIET-SR super-resolution images across multiple MNIST-like datasets. The high-resolution row represents the ground truth, while the QUIET-SR row demonstrates the model's capability to reconstruct fine details and preserve structural integrity. The super-resolved images generated by QUIET-SR closely approximate the high-resolution ground truth, effectively enhancing image clarity and preserving key features.
  • Figure 2: High-level workflow of the QUIET-SR framework. The architecture processes a $14 \times 14$ Low-Resolution (LR) input through three main stages: (1) Shallow Feature Extraction, where initial convolutional layers capture low-frequency information such as edges, textures, and colors; (2) Deep Feature Extraction, which utilizes Quantum Residual Transformer Blocks (Quantum RTSB) containing the SQWIN (Shifted Quantum Window) attention mechanism to model complex dependencies; and (3) High Quality Image Reconstruction, where features are aggregated via global residual connections, followed by (4-5) upsampling via a Pixel Shuffle operation and final convolutions that synthesize the $28 \times 28$ Super-Resolved output. (6) The network is optimized by minimizing the $\mathcal{L}_1$ loss between the generated image and the High-Resolution ground truth. The QNN architecture uses angle embedding and basic entangler layers to transform features and optimize attention efficiency. The final stage, SR Image Reconstruction, synthesizes the SR output, demonstrating the advantages of quantum-enhanced image restoration.
  • Figure 4: Noise-aware simulation results using Qiskit Aer with Depolarizing, Amplitude Damping, Phase Damping, and BitFlip noise models. Under low Phase Damping noise, QUIET-SR achieves PSNR $\approx 38.5$ dB and SSIM $\approx 0.974$, exceeding the noiseless baseline (PSNR $38.24$ dB, SSIM $0.973$). Performance remains close to baseline under Depolarizing noise (PSNR $\approx 38.15$ dB, SSIM $\approx 0.972$), indicating noise resilience.
  • Figure 5: This comparative analysis illustrates the relationship between embedding dimension (number of qubits in quantum systems) and PSNR. Empirical measurements (solid lines) are shown for both quantum and classical embeddings up to 10 qubits, beyond which projections (dashed lines) are generated using a logarithmic regression model. The shaded region represents the predicted performance advantage of QUIET-SR quantum embeddings over their classical counterparts as quantum hardware capabilities expand. The diverging trajectories suggest that quantum embeddings may offer increasingly significant advantages in image reconstruction quality as larger quantum systems become available, with the performance gap widening in proportion to system size.
  • Figure 6: Detailed architecture of the core QUIET-SR components. (1) The Shifted Quantum Window mechanism utilizes cyclic shifting and masking to partition the input into windows, enabling efficient cross-window interaction on quantum processors, which can be distributed. (2) The Swin Quantum Attention V2 module replaces classical linear projections with variational quantum circuits ($U_\psi^Q, U_\psi^K, U_\psi^V$) and incorporates a Log-CPB (Log-spaced Continuous Position Bias) processed by a Q-MLP to compute scaled cosine attention. (3) The Q-SwinV2 Transformer Layer integrates the quantum attention mechanism with a Quantum MLP (Q-MLP), Layer Normalization (LN), and residual connections. (4) The Quantum Residual Transformer Block stacks multiple transformer layers (Q-S2TL) followed by a convolutional layer. (5) The Execution Backend executes the compiled quantum circuits on the chosen backend: a noiseless simulator, an error-mitigated simulator, or a real quantum processing unit (QPU). (6) The alternating processing strategy between Quantum Window (QW-MSA) and Quantum Shifted Window (QSW-MSA) attention layers facilitates global information flow and feature mixing.
  • ...and 3 more figures
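
The Figure 2 caption states that the QNN uses angle embedding and basic entangler layers. As a rough illustration of what such a variational circuit computes, the sketch below simulates one with a plain NumPy statevector. It is not the authors' implementation: the RX rotations for both the embedding and the trainable layers, the ring of CNOTs, the Pauli-Z readout, and all function names (`vqc`, `apply_1q`, `apply_cnot`, `expval_z`) are assumptions made for illustration, since this excerpt does not give the exact gate choices.

```python
import numpy as np


def rx(theta):
    """Single-qubit RX(theta) rotation matrix."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])


def apply_1q(state, gate, wire):
    """Apply a 2x2 gate to one wire of a state tensor of shape (2,)*n."""
    state = np.tensordot(gate, state, axes=([1], [wire]))
    return np.moveaxis(state, 0, wire)


def apply_cnot(state, control, target):
    """Flip the target wire on the subspace where the control wire is 1."""
    state = state.copy()
    idx = [slice(None)] * state.ndim
    idx[control] = 1
    sub = state[tuple(idx)]  # control axis removed by integer indexing
    t_axis = target if target < control else target - 1
    state[tuple(idx)] = np.flip(sub, axis=t_axis).copy()
    return state


def expval_z(state, wire):
    """Expectation value of Pauli-Z on one wire."""
    probs = np.abs(state) ** 2
    other = tuple(a for a in range(state.ndim) if a != wire)
    p = probs.sum(axis=other)
    return float(p[0] - p[1])


def vqc(features, weights):
    """Angle embedding followed by entangling layers (assumed gate set).

    features: length-n array, one rotation angle per qubit.
    weights:  (n_layers, n) array of trainable rotation angles.
    Returns the n Pauli-Z expectation values, each in [-1, 1].
    """
    n = len(features)
    state = np.zeros((2,) * n, dtype=complex)
    state[(0,) * n] = 1.0  # start in |0...0>
    for w in range(n):  # angle embedding: one RX per feature
        state = apply_1q(state, rx(features[w]), w)
    for layer in weights:
        for w in range(n):  # trainable single-qubit rotations
            state = apply_1q(state, rx(layer[w]), w)
        if n == 2:  # two wires: a single CNOT
            state = apply_cnot(state, 0, 1)
        elif n > 2:  # closed ring of CNOTs
            for w in range(n):
                state = apply_cnot(state, w, (w + 1) % n)
    return np.array([expval_z(state, w) for w in range(n)])


# Example: 4 qubits, 2 layers, random angles (hypothetical sizes).
rng = np.random.default_rng(0)
out = vqc(rng.uniform(0, np.pi, 4), rng.uniform(0, 2 * np.pi, (2, 4)))
```

In a hybrid model of this kind, a circuit like `vqc` would stand in for a classical linear projection: the input features set the embedding angles, the `weights` are trained, and the expectation values form the output vector. The number of qubits equals the feature dimension, which is why the embedding dimension and qubit count are discussed together in the Figure 5 caption.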