CAMS: Convolution and Attention-Free Mamba-based Cardiac Image Segmentation

Abbas Khan, Muhammad Asad, Martin Benning, Caroline Roney, Gregory Slabaugh

TL;DR

CAMS-Net, a convolution- and self-attention-free Mamba-based segmentation network, outperforms existing state-of-the-art CNN, self-attention, and Mamba-based methods on the CMR and M&Ms-2 cardiac segmentation datasets while achieving linear complexity and using fewer parameters, which may inspire further research beyond the CNN and Transformer paradigms.

Abstract

Convolutional Neural Networks (CNNs) and Transformer-based self-attention models have become the standard for medical image segmentation. This paper demonstrates that convolution and self-attention, while widely used, are not the only effective methods for segmentation. Breaking with convention, we present a Convolution and self-Attention-free Mamba-based semantic Segmentation Network named CAMS-Net. Specifically, we design a Mamba-based Channel Aggregator and a Mamba-based Spatial Aggregator, which are applied independently in each encoder-decoder stage. The Channel Aggregator extracts information across different channels, and the Spatial Aggregator learns features across different spatial locations. We also propose a Linearly Interconnected Factorized Mamba (LIFM) block to reduce the computational complexity of a Mamba block and to enhance its decision function by introducing a non-linearity between two factorized Mamba blocks. Our model outperforms the existing state-of-the-art CNN, self-attention, and Mamba-based methods on the CMR and M&Ms-2 cardiac segmentation datasets, achieving linear complexity and reducing the number of parameters, and showing how this convolution- and self-attention-free method can inspire further research beyond the CNN and Transformer paradigms. Source code and pre-trained models are available at: https://github.com/kabbas570/CAMS-Net.
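
To make the distinction between channel-wise and spatial aggregation concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy, non-selective scalar state-space recurrence (h_t = a*h_{t-1} + b*x_t, y_t = c*h_t) as a stand-in for a Mamba block. The function names `ssm_scan`, `spatial_mix`, and `channel_mix` and the constants `a`, `b`, `c` are hypothetical and chosen only for illustration. The point it shows is that the same linear-time scan can run over spatial positions or over channels, depending on which axis is treated as the sequence.

```python
import numpy as np

def ssm_scan(x, a=0.9, b=1.0, c=1.0):
    """Toy linear state-space scan along axis 0 (an assumption, not real Mamba).

    h_t = a * h_{t-1} + b * x_t,  y_t = c * h_t
    The sequence is visited once, so the cost is linear in its length.
    """
    h = np.zeros_like(x[0], dtype=float)   # hidden state, one per feature column
    y = np.empty(x.shape, dtype=float)
    for t in range(x.shape[0]):
        h = a * h + b * x[t]
        y[t] = c * h
    return y

def spatial_mix(feat):
    """Scan over the H*W spatial positions; each channel is processed independently."""
    C, H, W = feat.shape
    tokens = feat.reshape(C, H * W).T      # shape (H*W, C): sequence axis = positions
    return ssm_scan(tokens).T.reshape(C, H, W)

def channel_mix(feat):
    """Scan over the C channels; each spatial position is processed independently."""
    C, H, W = feat.shape
    tokens = feat.reshape(C, H * W)        # shape (C, H*W): sequence axis = channels
    return ssm_scan(tokens).reshape(C, H, W)

# Hypothetical feature map: 2 channels, 1 x 3 spatial grid, all ones.
feat = np.ones((2, 1, 3))
spatial_out = spatial_mix(feat)
channel_out = channel_mix(feat)
```

In this sketch the two outputs differ only in the axis along which information accumulates. How CAMS-Net fuses the two branches and how the LIFM block factorizes Mamba are described in the paper and are not modeled here.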

Paper Structure (15 sections, 5 equations, 4 figures, 3 tables)

Figures (4)

  • Figure 1: (a) Overall architecture of the proposed CAMS-Net, (b) the Non-Convolutional (NC) Mamba Block without local convolution, (c) comparison of the proposed LIFM block with the original Mamba, (e) Mamba Channel Aggregator (MCA), (f) Mamba Spatial Aggregator (MSA), and (g) the Channel-Spatial Information Fusion (CS-IF) Module.
  • Figure 2: Qualitative comparison of CAMS-Net and other networks on the CMRxSegmentation dataset, highlighting CAMS-Net's enhanced boundary separation and its preservation of spatial integrity across different regions. Please zoom in for details.
  • Figure 3: Qualitative comparison of CAMS-Net and other networks on the M&Ms2 dataset. Please zoom in for details.
  • Figure 4: Visual comparison of results from different ablation studies on the CMR-segmentation dataset. Please zoom in for details.