SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation

Zhaohu Xing, Tian Ye, Yijun Yang, Guang Liu, Lei Zhu

TL;DR

SegMamba addresses the challenge of modeling long-range dependencies in 3D medical image segmentation by leveraging Mamba-based state-space modeling within a U-Net–like architecture. It introduces three novel components—Tri-orientated Mamba (ToM) for multi-directional global context, Gated Spatial Convolution (GSC) for spatial refinement, and Feature-level Uncertainty Estimation (FUE) for robust skip-feature fusion—along with a new CRC-500 dataset. Across BraTS2023, AIIB2023, and CRC-500, SegMamba achieves state-of-the-art Dice and HD95 while maintaining favorable memory and speed, outperforming transformer- and CNN-based baselines. The work demonstrates that Mamba-based global modeling can provide competitive or superior segmentation performance with improved efficiency in 3D medical imaging tasks.
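
For readers unfamiliar with the Dice metric cited above, the following is a minimal sketch of how a Dice score can be computed for two binary masks. It is a generic illustration, not the evaluation code used in the paper; the function name `dice_score`, the flat 0/1 list representation, and the empty-mask convention are assumptions made for this example. HD95 (the 95th-percentile Hausdorff distance) is not shown.

```python
def dice_score(pred, target):
    """Dice = 2|P ∩ T| / (|P| + |T|) for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of voxels")
    # For 0/1 masks, the elementwise product counts voxels set in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention for this sketch: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy masks (hypothetical data): 2 overlapping voxels, 3 foreground voxels in each mask.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
print(dice_score(pred, target))  # 2 * 2 / (3 + 3)
```

A higher Dice score indicates greater overlap between prediction and ground truth, whereas a lower HD95 indicates closer boundaries.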

Abstract

The Transformer architecture has shown a remarkable ability to model global relationships. However, it poses a significant computational challenge when processing high-dimensional medical images, which hinders its development and widespread adoption for this task. Mamba, a State Space Model (SSM), recently emerged as a notable approach to modeling long-range dependencies in sequences, excelling in the natural language processing field with its remarkable memory efficiency and computational speed. Inspired by its success, we introduce SegMamba, a novel 3D medical image Segmentation Mamba model, designed to effectively capture long-range dependencies within whole-volume features at every scale. In contrast to Transformer-based methods, SegMamba excels in whole-volume feature modeling from a state space model standpoint, maintaining superior processing speed even with volume features at a resolution of $64\times 64\times 64$. Comprehensive experiments on the BraTS2023 dataset demonstrate the effectiveness and efficiency of SegMamba. The code for SegMamba is available at: https://github.com/ge-xing/SegMamba
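
To make the computational argument concrete, here is a rough sketch under simplifying assumptions. It counts the tokens in a $64\times 64\times 64$ feature volume (treating each voxel as one token, which is an assumption of this example), compares the number of pairwise interactions that dense self-attention would need, and runs a scalar linear state-space recurrence that touches each token once. The function names and the fixed scalar parameters `a`, `b`, `c` are hypothetical; the actual Mamba layer uses input-dependent parameters and a vector-valued state, so this is not the authors' implementation.

```python
def token_count(depth, height, width):
    """Number of tokens when every voxel of a feature volume becomes one token."""
    return depth * height * width


def attention_pairs(n_tokens):
    """Dense self-attention compares every token with every token: n^2 pairs."""
    return n_tokens * n_tokens


def ssm_scan(xs, a=0.5, b=1.0, c=1.0):
    """Scalar linear recurrence h_t = a*h_{t-1} + b*x_t, y_t = c*h_t.

    A single pass over the sequence, so the work grows linearly with its length.
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys


n = token_count(64, 64, 64)
print(n)                           # 262144 tokens
print(attention_pairs(n))          # 68719476736 token pairs (2**36)
print(ssm_scan([1.0, 0.0, 0.0]))   # [1.0, 0.5, 0.25]: the first input decays over later steps
```

The recurrence carries information from early tokens to later ones through the state `h`, which is how a state space model can capture long-range dependencies without forming the full pairwise interaction matrix.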

Paper Structure (12 sections, 4 equations, 4 figures, 5 tables)

This paper contains 12 sections, 4 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: An overview of the proposed SegMamba. The encoder comprises a stem layer and multiple TSMamba blocks designed to extract multi-scale features. Within each TSMamba block, a gated spatial convolution (GSC) module models the spatial features, and a tri-orientated Mamba (ToM) module represents global information from various directions. Furthermore, we develop a feature-level uncertainty estimation (FUE) module to filter multi-scale features, facilitating more robust feature reuse.
  • Figure 2: (a) The gated spatial convolution. (b) The tri-orientated Mamba.
  • Figure 3: The data visualization for CRC-500 dataset.
  • Figure 4: Visual comparisons of proposed SegMamba and other state-of-the-art methods. Swin denotes SwinUNETR and Swinv2 denotes SwinUNETR-V2.
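
The Figure 1 caption says the tri-orientated Mamba (ToM) module represents global information "from various directions" but does not spell out the orderings. The sketch below illustrates one plausible reading, assumed here for illustration only: a 3D volume is flattened into three sequences (a forward row-major scan, its reverse, and an inter-slice scan in which the depth index varies fastest). The function names `flatten_orders` and `gather` are hypothetical, and the step that would feed each sequence to a sequence model and merge the results is omitted.

```python
def flatten_orders(depth, height, width):
    """Return three visiting orders over voxel coordinates (d, h, w)."""
    # Forward: row-major scan, the width index varies fastest.
    forward = [(d, h, w)
               for d in range(depth)
               for h in range(height)
               for w in range(width)]
    # Reverse: the same path walked backwards.
    reverse = forward[::-1]
    # Inter-slice: the depth index varies fastest, so neighbouring tokens
    # come from the same in-plane position in adjacent slices.
    inter_slice = [(d, h, w)
                   for h in range(height)
                   for w in range(width)
                   for d in range(depth)]
    return forward, reverse, inter_slice


def gather(volume, order):
    """Read voxel values from a nested-list volume in the given order."""
    return [volume[d][h][w] for d, h, w in order]


# Toy 2 x 2 x 2 volume whose values equal their row-major index.
volume = [[[0, 1], [2, 3]], [[4, 5], [6, 7]]]
fwd, rev, inter = flatten_orders(2, 2, 2)
print(gather(volume, fwd))    # [0, 1, 2, 3, 4, 5, 6, 7]
print(gather(volume, rev))    # [7, 6, 5, 4, 3, 2, 1, 0]
print(gather(volume, inter))  # [0, 4, 1, 5, 2, 6, 3, 7]
```

Because a recurrent scan only sees tokens that precede the current one, scanning the same volume in several orders gives each voxel context from different sides; each order visits every voxel exactly once.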