VFGS-Net: Frequency-Guided State-Space Learning for Topology-Preserving Retinal Vessel Segmentation
Ruiqi Song, Lei Liu, Ya-Nan Zhang, Chao Wang, Xiaoning Li, Nan Mu
TL;DR
Retinal vessel segmentation is difficult because vessels are elongated, span many scales, and often have low contrast. VFGS-Net integrates three components to jointly preserve fine vessels and vascular topology: Dual-Path Feature Convolution for local and contextual features, Vessel-aware Frequency-domain Channel Attention to emphasize vessel-relevant spectral components, and Bidirectional Asymmetric Mamba2 for efficient global spatial modeling. The method reports competitive or superior Dice scores and reduced boundary errors across DRIVE, HRF, CHASE_DB1, and STARE, and ablation studies confirm complementary gains from each module. Because it improves vessel continuity and cross-scale coherence, this end-to-end approach has potential for clinical use in vascular disease screening and morphology analysis. The segmentation loss is $\mathcal{L}_{\mathrm{seg}} = \mathcal{L}_{\mathrm{BCE}} + \mathcal{L}_{\mathrm{Dice}}$, and the channel attention operates in the frequency domain, so the network combines spatial and spectral reasoning.
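To make the loss $\mathcal{L}_{\mathrm{seg}} = \mathcal{L}_{\mathrm{BCE}} + \mathcal{L}_{\mathrm{Dice}}$ concrete, the following is a minimal sketch in plain Python over flattened per-pixel vessel probabilities. The clamping constant `eps`, the Dice smoothing term `smooth`, the equal weighting of the two terms, and the function names are illustrative assumptions, not details taken from the paper.

```python
import math


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over flattened pixel probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0); eps is an assumption
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2*sum(p*g) + s) / (sum(p) + sum(g) + s)."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    # smooth keeps the ratio defined on empty masks; its value is an assumption
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def seg_loss(probs, targets):
    """L_seg = L_BCE + L_Dice, with both terms weighted equally."""
    return bce_loss(probs, targets) + dice_loss(probs, targets)


if __name__ == "__main__":
    ground_truth = [1, 0, 1, 0]          # 1 = vessel pixel, 0 = background
    good_prediction = [0.9, 0.1, 0.8, 0.2]
    poor_prediction = [0.4, 0.6, 0.3, 0.7]
    print(seg_loss(good_prediction, ground_truth))
    print(seg_loss(poor_prediction, ground_truth))
```

The two terms are complementary: cross-entropy scores every pixel independently, while the Dice term measures overlap with the vessel mask, so it is not swamped by the many background pixels in a retinal image.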
Abstract
Accurate retinal vessel segmentation is a critical prerequisite for quantitative analysis of retinal images and computer-aided diagnosis of vascular diseases such as diabetic retinopathy. However, the elongated morphology, wide scale variation, and low contrast of retinal vessels pose significant challenges for existing methods, making it difficult to simultaneously preserve fine capillaries and maintain global topological continuity. To address these challenges, we propose the Vessel-aware Frequency-domain and Global Spatial modeling Network (VFGS-Net), an end-to-end segmentation framework that seamlessly integrates frequency-aware feature enhancement, dual-path convolutional representation learning, and bidirectional asymmetric spatial state-space modeling within a unified architecture. Specifically, VFGS-Net employs a dual-path feature convolution module to jointly capture fine-grained local textures and multi-scale contextual semantics. A novel vessel-aware frequency-domain channel attention mechanism is introduced to adaptively reweight spectral components, thereby enhancing vessel-relevant responses in high-level features. Furthermore, at the network bottleneck, we propose a bidirectional asymmetric Mamba2-based spatial modeling block to efficiently capture long-range spatial dependencies and strengthen the global continuity of vascular structures. Extensive experiments on four publicly available retinal vessel datasets demonstrate that VFGS-Net achieves competitive or superior performance compared to state-of-the-art methods. Notably, our model consistently improves segmentation accuracy for fine vessels, complex branching patterns, and low-contrast regions, highlighting its robustness and clinical potential.
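The abstract does not specify how the frequency-domain channel attention reweights spectral components, so the sketch below shows only one plausible reading of the general idea. Each channel is projected onto a few 2D DCT basis functions, the coefficient magnitudes are combined with per-frequency weights, and a sigmoid gate rescales the channel. The choice of DCT, the selected frequencies `freqs`, the weights `freq_weights` (which would be learned in a real model), and all function names are assumptions for illustration, not the authors' module.

```python
import math


def dct_basis(h, w, u, v):
    """2D DCT-II basis function of frequency (u, v) on an h x w grid."""
    return [[math.cos(math.pi * u * (i + 0.5) / h)
             * math.cos(math.pi * v * (j + 0.5) / w)
             for j in range(w)] for i in range(h)]


def frequency_channel_attention(feat, freqs, freq_weights):
    """Gate each channel by a weighted sum of its DCT coefficient magnitudes.

    feat: list of C channels, each an h x w list of lists.
    freqs: list of (u, v) frequency indices (hypothetical selection).
    freq_weights: one weight per frequency (fixed here, learned in practice).
    Returns (reweighted feature map, per-channel gates).
    """
    h, w = len(feat[0]), len(feat[0][0])
    bases = [dct_basis(h, w, u, v) for u, v in freqs]
    gates = []
    for channel in feat:
        descriptor = 0.0
        for basis, weight in zip(bases, freq_weights):
            coeff = sum(channel[i][j] * basis[i][j]
                        for i in range(h) for j in range(w)) / (h * w)
            descriptor += weight * abs(coeff)
        gates.append(1.0 / (1.0 + math.exp(-descriptor)))  # sigmoid gate
    out = [[[g * x for x in row] for row in channel]
           for channel, g in zip(feat, gates)]
    return out, gates


if __name__ == "__main__":
    flat = [[1.0] * 4 for _ in range(4)]                   # no fine structure
    stripes = [[1.0, -1.0, 1.0, -1.0] for _ in range(4)]   # thin vertical lines
    # Weight only the high horizontal frequency (0, 3); ignore the DC term.
    _, gates = frequency_channel_attention(
        [flat, stripes], freqs=[(0, 0), (0, 3)], freq_weights=[0.0, 4.0])
    print(gates)
```

In this toy setup the striped channel, which contains thin line-like structure, receives a larger gate than the flat channel, which is the kind of selective emphasis a vessel-aware spectral attention is meant to provide.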
