Versatile and Efficient Medical Image Super-Resolution Via Frequency-Gated Mamba

Wenfeng Huang, Xiangyun Liao, Wei Cao, Wenjing Jia, Weixin Si

TL;DR

FGMamba tackles medical image SR by unifying global dependency modeling with high-frequency detail restoration in a lightweight framework. It introduces two modules: the Gated Attention-enhanced State-Space Module (GASM), which captures long-range context efficiently, and the Pyramid Frequency Fusion Module (PFFM), which fuses high-frequency details across multiple scales. The parameter count stays under 0.75M. Across five modalities (ultrasound, OCT, MRI, CT, endoscopy), it outperforms CNN-, Transformer-, and prior Mamba-based SR methods in PSNR/SSIM, indicating strong generalization and a footprint small enough for clinical deployment. The results support frequency-aware state-space modeling as a scalable and accurate approach to medical image enhancement, with potential benefit for downstream tasks such as segmentation and diagnosis.

Abstract

Medical image super-resolution (SR) is essential for enhancing diagnostic accuracy while reducing acquisition cost and scanning time. However, modeling both long-range anatomical structures and fine-grained frequency details with low computational overhead remains challenging. We propose FGMamba, a novel frequency-aware gated state-space model that unifies global dependency modeling and fine-detail enhancement into a lightweight architecture. Our method introduces two key innovations: a Gated Attention-enhanced State-Space Module (GASM) that integrates efficient state-space modeling with dual-branch spatial and channel attention, and a Pyramid Frequency Fusion Module (PFFM) that captures high-frequency details across multiple resolutions via FFT-guided fusion. Extensive evaluations across five medical imaging modalities (Ultrasound, OCT, MRI, CT, and Endoscopic) demonstrate that FGMamba achieves superior PSNR/SSIM while maintaining a compact parameter footprint ($<$0.75M), outperforming CNN-based and Transformer-based SOTAs. Our results validate the effectiveness of frequency-aware state-space modeling for scalable and accurate medical image enhancement.
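
The abstract reports results in PSNR/SSIM. For reference, PSNR follows the standard definition 10·log10(MAX² / MSE). The snippet below is a minimal NumPy sketch of that definition, assuming images scaled to [0, 1]. The function name `psnr` and the `data_range` parameter are illustrative choices, and the paper's exact evaluation protocol (channel handling, border cropping) is not given in this excerpt.

```python
import numpy as np


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    `data_range` is the maximum possible pixel value (an assumption here:
    1.0 for images normalized to [0, 1], 255 for 8-bit images).
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0.0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)
```

A uniform error of 0.1 on a [0, 1] image gives an MSE of 0.01 and therefore a PSNR of 20 dB, which is a quick sanity check for any implementation.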

Paper Structure

This paper contains 18 sections, 14 equations, 2 figures, 4 tables.

Figures (2)

  • Figure 1: Overall architecture of the proposed FGMamba. It consists of an initial convolution, several FGBlocks, and a reconstruction module with pixel-shuffle upsampling. Each FGBlock contains multiple GASMs (Gated Attention-enhanced State-Space Modules), a frequency-enhancing PFFM (Pyramid Frequency Fusion Module), and additional Mamba residual connections. The submodules are illustrated below: (1) GASM incorporates VSSM2D with gated spatial/channel attention, (2) GAU enhances feature selection via dual attention gating, and (3) PFFM extracts and fuses high-frequency components across multiple scales via FFT-based filtering and residual learning.
  • Figure 2: Visual comparison across five medical modalities: (a) Ultrasound ($\times$2), (b) CT ($\times$4), (c) OCT ($\times$4), (d) Endoscopic ($\times$4), and (e) MRI ($\times$4). Red boxes highlight diagnostically critical structures (vessels, lesions, tissue textures), where FGMamba achieves sharper, more detailed reconstructions to aid radiologist diagnosis and clinical decision-making.
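
The Figure 1 caption describes PFFM as extracting high-frequency components across multiple scales via FFT-based filtering. The sketch below illustrates that general idea only: a hard radial high-pass mask in the Fourier domain, applied at successively halved resolutions. It is not the authors' module. The function names, the `cutoff` value, the hard mask, and the 2×2 average pooling are all assumptions made for illustration, and the learned fusion and residual path are omitted.

```python
import numpy as np


def high_frequency_component(image, cutoff=0.1):
    """Keep only spatial frequencies above `cutoff` (cycles per pixel).

    Illustrative hard high-pass filter; the mask shape and threshold are
    assumptions, not values taken from the paper.
    """
    image = np.asarray(image, dtype=np.float64)
    h, w = image.shape
    fy = np.fft.fftfreq(h)[:, None]  # vertical frequencies in [-0.5, 0.5)
    fx = np.fft.fftfreq(w)[None, :]  # horizontal frequencies in [-0.5, 0.5)
    mask = np.sqrt(fx ** 2 + fy ** 2) > cutoff
    spectrum = np.fft.fft2(image)
    # The mask is symmetric, so the inverse transform is real up to rounding.
    return np.real(np.fft.ifft2(spectrum * mask))


def high_frequency_pyramid(image, levels=3, cutoff=0.1):
    """High-pass the image at `levels` scales, halving resolution each time.

    Assumes the image is at least 2**(levels - 1) pixels on each side.
    """
    outputs = []
    current = np.asarray(image, dtype=np.float64)
    for _ in range(levels):
        outputs.append(high_frequency_component(current, cutoff))
        h, w = current.shape
        # Crop to even size, then downsample by 2x2 average pooling.
        current = current[: h - h % 2, : w - w % 2]
        current = current.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return outputs
```

A constant image has no high-frequency content, so the filter returns (numerically) zeros, while a pixel-level checkerboard sits at the highest representable frequency and passes through unchanged. A learned module would typically replace the fixed mask and combine the per-scale outputs with trainable layers.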