MedSegMamba: 3D CNN-Mamba Hybrid Architecture for Brain Segmentation

Aaron Cao, Zongyu Li, Jordan Jomsky, Andrew F. Laine, Jia Guo

TL;DR

We developed a 3D patch-based hybrid CNN-Mamba model that leverages Mamba's selective scan algorithm to enhance segmentation accuracy and efficiency for 3D inputs, demonstrating potential advantages over existing methods.

Abstract

Widely used traditional pipelines for subcortical brain segmentation are often inefficient and slow, particularly when processing large datasets. Furthermore, deep learning models face challenges due to the high resolution of MRI images and the large number of anatomical classes involved. To address these limitations, we developed a 3D patch-based hybrid CNN-Mamba model that leverages Mamba's selective scan algorithm, thereby enhancing segmentation accuracy and efficiency for 3D inputs. This retrospective study utilized 1784 T1-weighted MRI scans from a diverse, multi-site dataset of healthy individuals. The dataset was divided into training, validation, and testing sets with a 1076/345/363 split. The scans were obtained from 1.5T and 3T MRI machines. Our model's performance was validated against several benchmarks, including other CNN-Mamba, CNN-Transformer, and pure CNN networks, using FreeSurfer-generated ground truths. We employed the Dice Similarity Coefficient (DSC), Volume Similarity (VS), and Average Symmetric Surface Distance (ASSD) as evaluation metrics. Statistical significance was determined using the Wilcoxon signed-rank test with a threshold of P < 0.05. The proposed model achieved the highest overall performance across all metrics (DSC 0.88383; VS 0.97076; ASSD 0.33604), significantly outperforming all non-Mamba-based models (P < 0.001). While the model did not show significant improvement in DSC or VS compared to another Mamba-based model (P-values of 0.114 and 0.425), it demonstrated a significant enhancement in ASSD (P < 0.001) with approximately 20% fewer parameters. In conclusion, our proposed hybrid CNN-Mamba architecture offers an efficient and accurate approach for 3D subcortical brain segmentation, demonstrating potential advantages over existing methods. Code is available at: https://github.com/aaroncao06/MedSegMamba.
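To make two of the evaluation metrics named in the abstract concrete, the following is a minimal sketch of per-label DSC and VS using their commonly cited definitions, DSC = 2|A∩B| / (|A| + |B|) and VS = 1 − ||A| − |B|| / (|A| + |B|). It is not taken from the MedSegMamba repository: the function name `dice_and_volume_similarity`, the flattened-list input format, and the convention of returning 1.0 when a label is absent from both maps are assumptions made for illustration. ASSD is omitted because it requires surface extraction.

```python
from typing import Sequence, Tuple


def dice_and_volume_similarity(
    pred: Sequence[int], truth: Sequence[int], label: int
) -> Tuple[float, float]:
    """Return (DSC, VS) for one label over two flattened label maps.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(pred) != len(truth):
        raise ValueError("label maps must have the same number of voxels")

    # Voxel counts of the label in each map, and in their intersection.
    pred_count = sum(1 for p in pred if p == label)
    truth_count = sum(1 for t in truth if t == label)
    overlap = sum(1 for p, t in zip(pred, truth) if p == label and t == label)

    total = pred_count + truth_count
    if total == 0:
        # Assumption: a label absent from both maps counts as perfect agreement.
        return 1.0, 1.0

    dsc = 2.0 * overlap / total
    vs = 1.0 - abs(pred_count - truth_count) / total
    return dsc, vs


# Toy flattened label maps with three classes (0 = background).
pred = [0, 1, 1, 2, 2, 2, 0, 0]
truth = [0, 1, 1, 1, 2, 2, 0, 0]
dsc, vs = dice_and_volume_similarity(pred, truth, label=1)
print(f"label 1: DSC={dsc:.3f}, VS={vs:.3f}")
```

VS compares only the volumes of the two segmentations, so it can be high even when the regions are misplaced. That is one reason to report it alongside an overlap metric (DSC) and a surface-distance metric (ASSD).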

Paper Structure (17 sections, 8 figures, 2 tables)

Figures (8)

  • Figure 1: (a) The subcortical segmentation pipeline extracts 3D patches from the input scan, feeds them into the model, and reconstructs the output predicted label maps to generate the subcortical segmentation of the input scan. (b) The hippocampal subfield segmentation pipeline uses the subcortical segmentation label map to extract one patch centered on the hippocampus region. This patch is fed into the model and the output is padded to the original shape.
  • Figure 2: (a) The SS3D module encodes 8 sequences from the input volume, processes each with an independent S6 block, and then merges the outputs. (b) VSS3D block layout, consisting of SS3D and MLP residual modules. (c) The MedSegMamba model architecture has an encoder-decoder structure with a hybrid of CNN and Mamba-based blocks. Only 6 VSS3D layers are shown here, but the model contains 9 in total. The last Conv1x1 and Softmax layers are also not shown. (d) The SS3D modules unravel the sequences along this continuous 3D scanning pattern.
  • Figure 3: (a) Age and gender distributions for the subcortical segmentation dataset. (b) Age and gender distributions for the hippocampal subfield segmentation dataset.
  • Figure 4: 2D slices of a sample’s subcortical segmentation by each method.
  • Figure 5: 3D renderings of a sample’s subcortical segmentation by each method.
  • ...and 3 more figures
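
The patch extraction step described in the Figure 1(a) caption can be sketched as a sliding window over the scan. The snippet below only computes where each 3D patch would start, so that the patches cover the whole volume. The volume shape, patch size, and stride are hypothetical values chosen for illustration (the captions above do not state them), and `axis_starts` and `patch_origins` are made-up helper names rather than functions from the MedSegMamba code.

```python
from itertools import product
from typing import List, Sequence, Tuple


def axis_starts(length: int, patch: int, stride: int) -> List[int]:
    """Start indices of patches along one axis, covering the axis fully."""
    if patch > length:
        raise ValueError("patch is larger than the axis")
    if not 0 < stride <= patch:
        raise ValueError("stride must be in (0, patch] to avoid gaps")

    starts = list(range(0, length - patch + 1, stride))
    # If the regular stride stops short of the far edge, add one last
    # patch that sits flush with it.
    if starts[-1] != length - patch:
        starts.append(length - patch)
    return starts


def patch_origins(
    shape: Sequence[int], patch: Sequence[int], stride: Sequence[int]
) -> List[Tuple[int, ...]]:
    """All (z, y, x) patch origins for a volume of the given shape."""
    per_axis = [axis_starts(n, p, s) for n, p, s in zip(shape, patch, stride)]
    return list(product(*per_axis))


# Hypothetical sizes: a 256^3 volume, 96^3 patches, stride 80 per axis.
origins = patch_origins((256, 256, 256), (96, 96, 96), (80, 80, 80))
print(len(origins), origins[0], origins[-1])
```

With a stride smaller than the patch size, neighboring patches overlap, so reconstructing the full label map requires a rule for merging predictions in the overlapping voxels (for example, averaging class probabilities before taking the argmax). Which rule the authors use is not stated in the captions above.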