MambaU-Lite: A Lightweight Model based on Mamba and Integrated Channel-Spatial Attention for Skin Lesion Segmentation

Thi-Nhu-Quynh Nguyen, Quang-Huy Ho, Duy-Thai Nguyen, Hoang-Minh-Quang Le, Van-Truong Pham, Thi-Thao Tran

TL;DR

The paper addresses the challenge of accurate skin lesion segmentation under tight resource constraints by introducing MambaU-Lite, a lightweight hybrid model that combines Mamba-based visual state-space concepts with CNN elements. It presents a U-shaped encoder–bottleneck–decoder architecture featuring the P-Mamba block to capture multiscale features, CBAM-based skip connections, and an Integrated Channel-Spatial Attention (ICSA) bottleneck for high-level feature fusion. Evaluations on ISIC 2018 and PH2 show state-of-the-art or competitive segmentation accuracy with a tiny parameter footprint (~0.42M parameters), low memory usage (~1.67 MB), and a modest computational cost (~1.25 GFLOPs). The results indicate strong potential for real-time, low-resource deployment in dermoscopy devices and medical workflows, with future work aimed at broader generalization and deployment considerations.
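As a rough sanity check on how the parameter count and the memory figure relate, the sketch below converts a parameter count into weight storage. It assumes 32-bit floating-point weights and decimal megabytes; `param_memory_mb` and `approx_params` are illustrative names, and the rounded count of 420,000 is taken from the "~0.42M" figure above rather than from an exact number in the paper.

```python
def param_memory_mb(num_params: int, bytes_per_param: int = 4) -> float:
    """Approximate weight storage in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bytes_per_param / 1e6


# Assumption: "~0.42M" rounded to 420,000; the exact count is not stated here.
approx_params = 420_000

# float32 weights (4 bytes each)
print(f"float32: {param_memory_mb(approx_params):.2f} MB")
# float16 weights (2 bytes each), shown only for comparison
print(f"float16: {param_memory_mb(approx_params, 2):.2f} MB")
```

With these assumptions the float32 estimate comes out at about 1.68 MB, which is in line with the reported ~1.67 MB if the true parameter count is slightly below 420,000.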

Abstract

Early detection of skin abnormalities plays a crucial role in diagnosing and treating skin cancer. Segmentation of affected skin regions using AI-powered devices is relatively common and supports the diagnostic process. However, achieving high performance remains a significant challenge due to the need for high-resolution images and the often unclear boundaries of individual lesions. At the same time, medical devices require segmentation models to have a small memory footprint and low computational cost. Based on these requirements, we introduce a novel lightweight model called MambaU-Lite, which combines the strengths of Mamba and CNN architectures, featuring just over 400K parameters and a computational cost of slightly more than 1G FLOPs. To enhance both global context and local feature extraction, we propose the P-Mamba block, a novel component that incorporates VSS blocks alongside multiple pooling layers, enabling the model to effectively learn multiscale features and enhance segmentation performance. We evaluate the model's performance on two skin datasets, ISIC2018 and PH2, yielding promising results. Our source code will be made publicly available at: https://github.com/nqnguyen812/MambaU-Lite.
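To illustrate the general idea of using several pooling layers to obtain multiscale features, here is a minimal pure-Python sketch. It is not the authors' P-Mamba block: the choice of average pooling, the window sizes (1, 2, 4), and the helper names `avg_pool2d` and `multiscale_features` are all assumptions made for illustration, and the VSS branch is omitted entirely.

```python
from typing import Dict, List, Sequence

Grid = List[List[float]]


def avg_pool2d(x: Grid, k: int) -> Grid:
    """Non-overlapping k x k average pooling (stride k).

    Assumes the height and width of x are divisible by k.
    """
    h, w = len(x), len(x[0])
    return [
        [
            sum(x[i + di][j + dj] for di in range(k) for dj in range(k)) / (k * k)
            for j in range(0, w, k)
        ]
        for i in range(0, h, k)
    ]


def multiscale_features(x: Grid, kernel_sizes: Sequence[int] = (1, 2, 4)) -> Dict[int, Grid]:
    """Pool the same feature map with several window sizes (hypothetical choice)."""
    return {k: avg_pool2d(x, k) for k in kernel_sizes}


# Toy 4 x 4 single-channel feature map holding the values 0..15.
feature_map = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
pyramid = multiscale_features(feature_map)

for k, pooled in pyramid.items():
    print(f"window {k}: {len(pooled)} x {len(pooled[0])} -> {pooled}")
```

Larger windows summarize wider regions of the input, so each pooled map describes the same image at a coarser scale. How the actual model combines such maps with the VSS outputs is described in the paper and is not reproduced here.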

Paper Structure

This paper contains 12 sections, 3 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: The architecture of the proposed MambaU-Lite model
  • Figure 2: Architectures of the main components of the proposed MambaU-Lite model
  • Figure 3: Representative segmentation results on the ISIC2018 and PH2 datasets. The ground truths are shown in blue, and the predictions are displayed in green.