Table of Contents
Fetching ...

When Mamba Meets xLSTM: An Efficient and Precise Method with the xLSTM-VMUNet Model for Skin lesion Segmentation

Zhuoyi Fang, Jiajia Liu, Kexuan Shi, Qiang Han

Abstract

Automatic melanoma segmentation is essential for early skin cancer detection, yet challenges arise from the heterogeneity of melanoma, as well as interfering factors like blurred boundaries, low contrast, and imaging artifacts. While numerous algorithms have been developed to address these issues, previous approaches have often overlooked the need to jointly capture spatial and sequential features within dermatological images. This limitation hampers segmentation accuracy, especially in cases with indistinct borders or structurally similar lesions. Additionally, previous models lacked both a global receptive field and high computational efficiency. In this work, we present the xLSTM-VMUNet Model, which jointly capture spatial and sequential features within dermatological images successfully. xLSTM-VMUNet can not only specialize in extracting spatial features from images, focusing on the structural characteristics of skin lesions, but also enhance contextual understanding, allowing more effective handling of complex medical image structures. Experiment results on the ISIC2018 dataset demonstrate that xLSTM-VMUNet outperforms VMUNet by 4.85% on DSC and 6.41% on IoU on the ISIC2017 dataset, by 1.25% on DSC and 2.07% on IoU on the ISIC2018 dataset, with faster convergence and consistently high segmentation performance. Our code is available at https://github.com/FangZhuoyi/XLSTM-VMUNet.

When Mamba Meets xLSTM: An Efficient and Precise Method with the xLSTM-VMUNet Model for Skin lesion Segmentation

Abstract

Automatic melanoma segmentation is essential for early skin cancer detection, yet challenges arise from the heterogeneity of melanoma, as well as interfering factors like blurred boundaries, low contrast, and imaging artifacts. While numerous algorithms have been developed to address these issues, previous approaches have often overlooked the need to jointly capture spatial and sequential features within dermatological images. This limitation hampers segmentation accuracy, especially in cases with indistinct borders or structurally similar lesions. Additionally, previous models lacked both a global receptive field and high computational efficiency. In this work, we present the xLSTM-VMUNet Model, which jointly capture spatial and sequential features within dermatological images successfully. xLSTM-VMUNet can not only specialize in extracting spatial features from images, focusing on the structural characteristics of skin lesions, but also enhance contextual understanding, allowing more effective handling of complex medical image structures. Experiment results on the ISIC2018 dataset demonstrate that xLSTM-VMUNet outperforms VMUNet by 4.85% on DSC and 6.41% on IoU on the ISIC2017 dataset, by 1.25% on DSC and 2.07% on IoU on the ISIC2018 dataset, with faster convergence and consistently high segmentation performance. Our code is available at https://github.com/FangZhuoyi/XLSTM-VMUNet.

Paper Structure

This paper contains 22 sections, 18 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: The Architecture Overview of XLSTM-VMUNet.
  • Figure 2: (a) The scan expanding operation in SS2D. (b) The scan merging operation in SS2D.
  • Figure 3: Comparison of Dice values across different models on the ISIC2017 and ISIC2018 datasets.
  • Figure 4: Visualized segmentation results of the proposed xLSTM-VMUNet against other state-of-the-arts over the ISIC2018 dataset.