Serp-Mamba: Advancing High-Resolution Retinal Vessel Segmentation with Selective State-Space Model

Hongqiu Wang, Yixian Chen, Wu Chen, Huihui Xu, Haoyu Zhao, Bin Sheng, Huazhu Fu, Guang Yang, Lei Zhu

TL;DR

The Serpentine Mamba (Serp-Mamba) network is proposed to address the category imbalance problem intensified by high-resolution UWF images; it delineates pixels with two learnable thresholds and refines ambiguous pixels through a dual-driven strategy.

Abstract

Ultra-Wide-Field Scanning Laser Ophthalmoscopy (UWF-SLO) images capture high-resolution views of the retina, typically spanning 200 degrees. Accurate segmentation of vessels in UWF-SLO images is essential for detecting and diagnosing fundus disease. Recent studies have revealed that the selective State Space Model (SSM) in Mamba performs well in modeling long-range dependencies, which is crucial for capturing the continuity of elongated vessel structures. Inspired by this, we propose the first Serpentine Mamba (Serp-Mamba) network to address this challenging task. Specifically, we recognize the intricate, varied, and delicate nature of the tubular structure of vessels. Furthermore, the high resolution of UWF-SLO images exacerbates the imbalance between the vessel and background categories. Based on these observations, we first devise a Serpentine Interwoven Adaptive (SIA) scan mechanism, which scans UWF-SLO images along curved vessel structures in a snake-like crawling manner. This approach, consistent with vascular texture transformations, ensures the effective and continuous capture of curved vascular structure features. Second, we propose an Ambiguity-Driven Dual Recalibration (ADDR) module to address the category imbalance problem intensified by high-resolution images. Our ADDR module delineates pixels with two learnable thresholds and refines ambiguous pixels through a dual-driven strategy, thereby accurately distinguishing vessel and background regions. Experimental results on three datasets demonstrate the superior performance of Serp-Mamba on high-resolution vessel segmentation. We also conduct a series of ablation studies to verify the impact of our designs. Our code will be released upon publication of this work.
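
To make the two-threshold idea concrete, here is a minimal sketch of how a pair of thresholds could split pixels into background, ambiguous, and vessel groups. It is an illustration under stated assumptions, not the authors' implementation: the function name `partition_by_thresholds`, the use of per-pixel probabilities, the inclusive comparisons, and the fixed values of `t_ba` and `t_ve` are all hypothetical, whereas in the paper the thresholds are learned during training.

```python
# Hypothetical sketch: fixed thresholds stand in for the learnable ones.
BACKGROUND, AMBIGUOUS, VESSEL = 0, 1, 2


def partition_by_thresholds(probs, t_ba, t_ve):
    """Label each pixel probability as background, ambiguous, or vessel.

    probs: 2D list of vessel probabilities in [0, 1] (assumed input form).
    t_ba:  values at or below this are treated as background.
    t_ve:  values at or above this are treated as vessel.
    Anything strictly between the two thresholds is left as ambiguous.
    """
    if not t_ba < t_ve:
        raise ValueError("expected t_ba < t_ve")
    labels = []
    for row in probs:
        labels.append([
            BACKGROUND if p <= t_ba else VESSEL if p >= t_ve else AMBIGUOUS
            for p in row
        ])
    return labels


probs = [[0.05, 0.40, 0.92],
         [0.10, 0.55, 0.81]]
labels = partition_by_thresholds(probs, t_ba=0.2, t_ve=0.8)
```

Only the pixels labelled `AMBIGUOUS` would then need further refinement, which is where the dual-driven strategy described in the abstract applies.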

Paper Structure (14 sections, 11 equations, 8 figures, 3 tables, 1 algorithm)

Figures (8)

  • Figure 1: (a) Comparison of FOV between ultra-widefield photograph and narrow field photograph; (b) Comparison of fixed scan path and our learnable serpentine scan path for Mamba Block.
  • Figure 2: Overview of the Serp-Mamba. The top of the figure shows the whole model. Our proposed method adopts a classical U-shaped structure which includes the novel Serpentine Interwoven Adaptive (SIA) scan mechanism and the Ambiguity-Driven Dual Recalibration (ADDR) module. As shown in the lower-left corner of the figure, in SIA, the input features are scanned by the Serpentine Mamba scan and the ordinary Mamba scan in the $\mathcal{X}$ and $\mathcal{Y}$ directions respectively, and then combined. For details of the SIA mechanism, please refer to Fig. 3. In the ADDR module in the lower-right corner, the pixels of the input feature are first distinguished by two learnable thresholds, and then the ambiguous pixels are refined through continuity perception and dual driving of blood vessels and background pixels. For details of the ADDR module, please refer to Fig. 4.
  • Figure 3: Illustration of the proposed serpentine scan mechanism along the $\mathcal{X}$ and $\mathcal{Y}$ axes. The upper left and lower left of the figure show two patches of $F_{sia}$. i) Taking the $\mathcal{X}$ direction (the first row) as an example, we set the serpentine scan path length $l$ ($l=7$ here) and sample pixels at fixed intervals along the $\mathcal{X}$ direction as center points (purple boxes). ii) The learnable serpentine scan deforms the paths around these center points (yellow boxes) through deformation learning. iii) All serpentine scan paths are mapped to straight lines and concatenated into a horizontal sequence $S_{x}$, which is then fed into the Mamba block to learn deep contextual information. iv) Finally, the output sequence is reshaped back to the original input feature map size to obtain the final output. The same process applies in the $\mathcal{Y}$ direction (the second row).
  • Figure 4: Our Ambiguity-Driven Dual Recalibration module consists of four steps: i) The two learnable thresholds $T_{ba}$ (for background) and $T_{ve}$ (for vessels) used in the scanning process are first determined and then continuously updated during training. ii) The ambiguous pixels are then processed through the Continuity Perception strategy. iii) Next, three vectors $V_{ve}, V_{am}, V_{ba}$ are obtained by applying the thresholds to distinguish vessels, ambiguous regions, and background, after which the Vessel-Driven and Background-Driven steps are performed. iv) Finally, the weight $\mathcal{W}_{dri}$ is used to enhance the results, and the Mamba block is employed to complete the scanning process.
  • Figure 5: Results of the ablation experiments with and without ADDR. Red represents the correct segmentation results, green represents the ground truth, and the arrows point to noteworthy locations; please zoom in to check the details. Left: Results of the model without the ADDR module. Right: Results of the model with the ADDR module.
  • ...and 3 more figures
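
The serpentine scan described in the Figure 3 caption (center points, deformed paths of length $l$, flattening into a sequence $S_{x}$) can be sketched roughly as follows. This is a simplified illustration, not the paper's method: the function name `serpentine_sequence` is hypothetical, the per-step integer row offsets stand in for the learned deformation, border clamping is an assumption, and the Mamba block and the final reshape are omitted.

```python
def serpentine_sequence(feature, centers, offsets):
    """Gather pixels along deformed horizontal paths and concatenate them.

    feature: 2D list (H x W) of values, standing in for one feature channel.
    centers: list of (row, col) center points sampled along the X direction.
    offsets: for each center, a list of l integer row offsets, one per step
             of the path (hypothetical stand-in for learned deformation).
    Returns the flattened sequence, analogous to S_x in the caption.
    """
    height, width = len(feature), len(feature[0])
    sequence = []
    for (r, c), dy in zip(centers, offsets):
        half = len(dy) // 2  # the path is centered on column c
        for k, d in enumerate(dy):
            # Clamp to the image border (an assumption of this sketch).
            col = min(max(c + k - half, 0), width - 1)
            row = min(max(r + d, 0), height - 1)
            sequence.append(feature[row][col])
    return sequence


# Toy 3 x 7 feature map whose value encodes its position: row * 10 + col.
feature = [[r * 10 + c for c in range(7)] for r in range(3)]
# One center point and a wavy path of length l = 7 around it.
s_x = serpentine_sequence(feature, centers=[(1, 3)],
                          offsets=[[0, 1, 1, 0, -1, -1, 0]])
# s_x == [10, 21, 22, 13, 4, 5, 16]
```

With all offsets set to zero the path degenerates to an ordinary straight row scan, which is one way to see the serpentine scan as a deformable generalization of the fixed scan path. A scan along the $\mathcal{Y}$ direction would apply column offsets to vertical paths in the same way.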