HRDecoder: High-Resolution Decoder Network for Fundus Image Lesion Segmentation
Ziyuan Ding, Yixiong Liang, Shichao Kan, Qing Liu
TL;DR
The paper tackles precise segmentation of tiny fundus lesions, a task that calls for high-resolution inputs, which typically incur high memory usage and slow inference. It introduces HRDecoder, a simple high-resolution (HR) decoder framework with two components: an HR representation learning module that crops and learns from high-resolution detail, and an HR fusion module that combines global low-resolution (LR) predictions with detailed HR predictions in a parameter-free way. Training combines a segmentation loss with an HR-specific Dice loss, L = L_Seg + λ L_HR, with λ set to 0.1, so that the model learns fine-grained local features while preserving context. Evaluations on IDRiD and DDR show state-of-the-art or competitive performance with favorable memory and speed characteristics, suggesting practical applicability for clinical fundus image analysis.
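The combined objective L = L_Seg + λ L_HR can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `dice_loss` and `total_loss`, the smoothing constant `eps`, and the use of flat Python lists in place of tensors are all assumptions made for illustration. The form of L_Seg is not specified in this summary, so it is passed in as a plain number.

```python
def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss for flat sequences of probabilities and binary labels.

    `eps` is a hypothetical smoothing term that avoids division by zero.
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def total_loss(seg_loss, hr_loss, lam=0.1):
    """Combine the losses as L = L_Seg + lam * L_HR, with lam = 0.1 as stated."""
    return seg_loss + lam * hr_loss


# Toy example: the HR prediction partially overlaps the HR ground truth.
hr_pred = [0.9, 0.1, 0.8, 0.2]
hr_target = [1, 0, 1, 0]
l_hr = dice_loss(hr_pred, hr_target)

# The value 0.35 stands in for whatever segmentation loss L_Seg is used.
loss = total_loss(seg_loss=0.35, hr_loss=l_hr)
```

With λ = 0.1 the HR term acts as a light auxiliary signal: it contributes a tenth of its value to the total, so the main segmentation loss still dominates the gradient.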
Abstract
High resolution is crucial for precise segmentation in fundus images, yet handling high-resolution inputs incurs considerable GPU memory costs, with diminishing performance gains as overhead increases. To address this issue while tackling the challenge of segmenting tiny objects, recent studies have explored local-global fusion methods. These methods preserve fine details using local regions and capture long-range contextual information from downscaled global images. However, they require multiple forward passes, which incurs significant computational overhead and slows inference. In this paper, we propose HRDecoder, a simple High-Resolution Decoder network for fundus lesion segmentation. It integrates a high-resolution representation learning module to capture fine-grained local features and a high-resolution fusion module to fuse multi-scale predictions. Our method effectively improves the overall segmentation accuracy of fundus lesions while keeping memory and computational overhead reasonable and maintaining satisfactory inference speed. Experimental results on the IDRiD and DDR datasets demonstrate the effectiveness of our method. Code is available at https://github.com/CVIU-CSU/HRDecoder.
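To make the idea of parameter-free fusion of multi-scale predictions concrete, the sketch below shows one possible scheme. The abstract does not specify the fusion rule, so everything here is an assumption: nearest-neighbour upsampling of the LR map, simple averaging inside the crop window, and the hypothetical names `upsample_nearest` and `fuse`. The official repository should be consulted for the actual method.

```python
def upsample_nearest(grid, factor):
    """Nearest-neighbour upsampling of a 2D list by an integer factor."""
    height, width = len(grid), len(grid[0])
    return [
        [grid[r // factor][c // factor] for c in range(width * factor)]
        for r in range(height * factor)
    ]


def fuse(lr_pred, hr_crop, top, left, factor):
    """Fuse a global LR prediction with one HR crop prediction.

    Assumed rule (not taken from the paper): upsample the LR map to full
    resolution, then average it with the HR prediction inside the crop
    window. No learnable parameters are involved.
    """
    fused = upsample_nearest(lr_pred, factor)
    for i, row in enumerate(hr_crop):
        for j, value in enumerate(row):
            fused[top + i][left + j] = 0.5 * (fused[top + i][left + j] + value)
    return fused


# Toy example: a 2x2 LR probability map and a 2x2 HR crop at the top-left.
lr_pred = [[0.2, 0.8],
           [0.4, 0.6]]
hr_crop = [[1.0, 0.0],
           [0.0, 1.0]]
fused = fuse(lr_pred, hr_crop, top=0, left=0, factor=2)
```

Pixels outside the crop window keep the upsampled LR value, so global context is preserved wherever no HR detail is available.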
