
VFGS-Net: Frequency-Guided State-Space Learning for Topology-Preserving Retinal Vessel Segmentation

Ruiqi Song, Lei Liu, Ya-Nan Zhang, Chao Wang, Xiaoning Li, Nan Mu

TL;DR

Retinal vessel segmentation is challenged by the elongated shape, multi-scale widths, and low contrast of vessels. VFGS-Net integrates three components to jointly preserve fine vessels and vascular topology: Dual-Path Feature Convolution (DFC) for local and contextual features, Vessel-aware Frequency-domain Channel Attention (VFCA) to emphasize vessel-relevant spectral components, and Bidirectional Asymmetric Mamba2 (BA-Mamba2) for efficient global spatial modeling. The method achieves superior Dice scores and reduced boundary errors across DRIVE, HRF, CHASE_DB1, and STARE, with ablation confirming complementary gains from each module. Improved vessel continuity and cross-scale coherence make it a robust, end-to-end solution with strong potential for clinical deployment in vascular disease screening and morphology analysis. Its key mathematical components are the segmentation loss $\mathcal{L}_{\mathrm{seg}} = \mathcal{L}_{\mathrm{BCE}} + \mathcal{L}_{\mathrm{Dice}}$ and the spectral attention mechanism applied in the frequency domain, which together reflect the fusion of spatial and spectral reasoning in retinal image analysis.
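
To make the loss $\mathcal{L}_{\mathrm{seg}} = \mathcal{L}_{\mathrm{BCE}} + \mathcal{L}_{\mathrm{Dice}}$ concrete, here is a minimal pure-Python sketch. It is an illustration, not the authors' implementation. The summary only states that the two terms are summed, so several details are assumptions: the soft-Dice form, the `smooth` constant, the `eps` clamp, and the function names `bce_loss`, `dice_loss`, and `seg_loss`. Inputs are flattened per-pixel vessel probabilities and binary labels.

```python
import math


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over flattened pixels."""
    total = 0.0
    for p, t in zip(probs, targets):
        # Clamp probabilities so log() never sees 0 (eps is an assumption).
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss; the smoothing term is an assumed, common choice."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def seg_loss(probs, targets):
    """L_seg = L_BCE + L_Dice, with equal weighting as stated in the summary."""
    return bce_loss(probs, targets) + dice_loss(probs, targets)


if __name__ == "__main__":
    labels = [1.0, 0.0, 1.0, 0.0]
    print(seg_loss([0.9, 0.1, 0.8, 0.2], labels))  # confident, mostly correct
    print(seg_loss([0.5, 0.5, 0.5, 0.5], labels))  # uninformative prediction
```

The two terms are complementary. BCE penalizes each pixel independently, while Dice measures overlap over the whole mask, which helps when vessel pixels are a small fraction of the image.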

Abstract

Accurate retinal vessel segmentation is a critical prerequisite for quantitative analysis of retinal images and computer-aided diagnosis of vascular diseases such as diabetic retinopathy. However, the elongated morphology, wide scale variation, and low contrast of retinal vessels pose significant challenges for existing methods, making it difficult to simultaneously preserve fine capillaries and maintain global topological continuity. To address these challenges, we propose the Vessel-aware Frequency-domain and Global Spatial modeling Network (VFGS-Net), an end-to-end segmentation framework that seamlessly integrates frequency-aware feature enhancement, dual-path convolutional representation learning, and bidirectional asymmetric spatial state-space modeling within a unified architecture. Specifically, VFGS-Net employs a dual-path feature convolution module to jointly capture fine-grained local textures and multi-scale contextual semantics. A novel vessel-aware frequency-domain channel attention mechanism is introduced to adaptively reweight spectral components, thereby enhancing vessel-relevant responses in high-level features. Furthermore, at the network bottleneck, we propose a bidirectional asymmetric Mamba2-based spatial modeling block to efficiently capture long-range spatial dependencies and strengthen the global continuity of vascular structures. Extensive experiments on four publicly available retinal vessel datasets demonstrate that VFGS-Net achieves competitive or superior performance compared to state-of-the-art methods. Notably, our model consistently improves segmentation accuracy for fine vessels, complex branching patterns, and low-contrast regions, highlighting its robustness and clinical potential.

Paper Structure (19 sections, 19 equations, 9 figures, 2 tables)


Figures (9)

  • Figure 1: Representative failure cases of baseline retinal vessel segmentation models. (a) Input images; (b) Ground truth; (c-d) Results of TransUNet [chen2021transunet] and VM-UNet [ruan2024vm], showing missed or fragmented fine vessels (highlighted); (e) Result of the proposed VFGS-Net with improved vessel continuity.
  • Figure 2: A schematic overview of the proposed VFGS-Net. The architecture consists of an encoder-decoder backbone with embedded modules (i.e., DFC, VFCA, and BA-Mamba2) designed to improve semantic representation and structural integrity.
  • Figure 3: Architecture of the proposed VFCA module, which performs channel-wise attention in the frequency domain via FFT-based global magnitude pooling and adaptive frequency reweighting.
  • Figure 4: Comparative Grad-CAM [selvaraju2017grad] visualizations of standard skip connections and the proposed VFCA module. (a) Input image. (b-c) Heatmaps generated by standard skip connections and the VFCA module, respectively. Red indicates strong activation and blue indicates weak activation.
  • Figure 5: Architecture of the Mamba2 module.
  • ...and 4 more figures
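
The Figure 3 caption describes VFCA as channel-wise attention computed from FFT-based global magnitude pooling. The NumPy sketch below shows one way that idea could look. It is hypothetical rather than the paper's module: the two-layer gating with weights `w1` and `w2`, the ReLU and sigmoid choices, and the function name `frequency_channel_attention` are all assumptions, and the paper's adaptive frequency reweighting step is not modeled here.

```python
import numpy as np


def frequency_channel_attention(features, w1, w2):
    """Reweight channels using a descriptor pooled from FFT magnitudes.

    features: array of shape (C, H, W)
    w1:       array of shape (C_hidden, C)  -- assumed bottleneck projection
    w2:       array of shape (C, C_hidden)  -- assumed expansion projection
    """
    # 2D FFT over the spatial axes of every channel.
    spectrum = np.fft.fft2(features, axes=(-2, -1))
    # Global magnitude pooling: one scalar per channel.
    descriptor = np.abs(spectrum).mean(axis=(-2, -1))
    # Small gating network (assumed): ReLU bottleneck, then sigmoid.
    hidden = np.maximum(w1 @ descriptor, 0.0)
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))
    # Broadcast the per-channel weights over the spatial dimensions.
    return features * weights[:, None, None], weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((8, 16, 16))
    w1 = rng.standard_normal((2, 8)) * 0.1
    w2 = rng.standard_normal((8, 2)) * 0.1
    out, attn = frequency_channel_attention(x, w1, w2)
    print(out.shape, attn.shape)
```

Pooling the spectral magnitude instead of the spatial mean makes the channel descriptor sensitive to how much oscillatory structure, such as thin and high-frequency vessel edges, a channel carries, not only to its average activation.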