Global and Local Mamba Network for Multi-Modality Medical Image Super-Resolution

Zexin Ji, Beiji Zou, Xiaoyan Kui, Sebastien Thureau, Su Ruan

TL;DR

The paper addresses the challenge of efficiently achieving high-quality multi-modality medical image super-resolution by balancing global context and local detail. It introduces GLMamba, a two-branch architecture with a global Mamba branch for low-resolution inputs and a local Mamba branch for high-resolution references, augmented by deform blocks and modulators and a dedicated multi-modality feature fusion block, and trained with a contrastive edge loss. Empirical results on BraTS2021 and IXI show improvements in PSNR, SSIM, and downstream segmentation Dice, along with competitive parameter efficiency compared with state-of-the-art methods. The approach holds practical potential for faster, more accurate clinical MR image super-resolution and downstream tasks, while future work includes extending to 3D data and integrating alignment with super-resolution in a joint framework.

Abstract

Convolutional neural networks and Transformers have made significant progress in multi-modality medical image super-resolution. However, these methods either have a fixed receptive field for local learning or impose a significant computational burden for global learning, which limits super-resolution performance. To address this problem, State Space Models, notably Mamba, have been introduced to efficiently model long-range dependencies in images with linear computational complexity. Building on Mamba and on the observation that low-resolution images rely on global information to compensate for missing details, while high-resolution reference images need to provide more local details for accurate super-resolution, we propose a global and local Mamba network (GLMamba) for multi-modality medical image super-resolution. Specifically, GLMamba is a two-branch network equipped with a global Mamba branch and a local Mamba branch. The global Mamba branch captures long-range relationships in low-resolution inputs, and the local Mamba branch focuses more on short-range details in high-resolution reference images. We also use a deform block to adaptively extract features in both branches and enhance their representation ability. A modulator is designed to further enhance the deformable features in both the global and local Mamba blocks. To fully integrate the reference image into low-resolution image super-resolution, we further develop a multi-modality feature fusion block that adaptively fuses features by considering similarities, differences, and complementary aspects between modalities. In addition, a contrastive edge loss (CELoss) is developed to sufficiently enhance edge textures and contrast in medical images.
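To make the "linear computational complexity" claim concrete, the following is a minimal sketch of a discrete linear state space recurrence applied to an image flattened into a 1D sequence. It is not the GLMamba implementation: the function names `ssm_scan` and `flatten_row_major`, the diagonal state transition, and the fixed parameters `a`, `b`, `c` are all assumptions made for illustration. Mamba additionally makes its parameters depend on the input, which this sketch omits. The point is only that one pass over the sequence costs time proportional to its length.

```python
def ssm_scan(x, a, b, c):
    """Run a diagonal linear state space recurrence over a 1D sequence.

    h_t = a * h_{t-1} + b * x_t   (elementwise over the state dimension)
    y_t = sum(c * h_t)

    All parameters are fixed here (an illustrative assumption); Mamba
    uses input-dependent parameters instead.
    """
    state = [0.0] * len(a)
    outputs = []
    for x_t in x:  # single pass: cost grows linearly with len(x)
        state = [a_i * h_i + b_i * x_t for a_i, h_i, b_i in zip(a, state, b)]
        outputs.append(sum(c_i * h_i for c_i, h_i in zip(c, state)))
    return outputs


def flatten_row_major(image):
    """Flatten a 2D grid (list of rows) into one row-major scan order."""
    return [pixel for row in image for pixel in row]


# Hypothetical 2x2 "image" and a two-dimensional hidden state.
image = [[1.0, 2.0], [3.0, 4.0]]
sequence = flatten_row_major(image)
y = ssm_scan(sequence, a=[0.5, 0.9], b=[1.0, 1.0], c=[1.0, 0.0])
print(y)  # [1.0, 2.5, 4.25, 6.125]
```

Each output mixes the current pixel with a decayed summary of all earlier pixels in the scan, so distant positions influence one another without the pairwise comparisons that make self-attention quadratic in sequence length.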

Paper Structure

This paper contains 17 sections, 16 equations, 8 figures, 5 tables.

Figures (8)

  • Figure 1: The overall pipeline of our approach consists of two branches, i.e., a global Mamba branch and a local Mamba branch. The deform block adaptively extracts deformable features from both branches. A modulator is introduced to further refine the deformable features within both the global and local Mamba blocks. The multi-modality feature fusion block is designed to fuse the two branches. The super-resolved image ($SR$) and the reconstructed reference image ($Rec_{Ref}$) are finally obtained through the reconstruction convolution layer.
  • Figure 2: Architecture of the Mamba block (A), deform block (B) and the modulator (C).
  • Figure 3: Architecture of the SS2D for global Mamba (A) and local Mamba (B). $F_{\text{GMamba}}^{lr\uparrow}$ and $F_{\text{LMamba}}^{ref}$ are the output features of SS2D for global and local Mamba, respectively.
  • Figure 4: Architecture of our multi-modality feature fusion block.
  • Figure 5: Qualitative results of super-resolved medical images and error maps with and without the contrastive edge loss on the BraTS2021 dataset under a 4$\times$ upsampling factor.
  • ...and 3 more figures