HRDecoder: High-Resolution Decoder Network for Fundus Image Lesion Segmentation

Ziyuan Ding, Yixiong Liang, Shichao Kan, Qing Liu

TL;DR

The paper tackles precise segmentation of tiny fundus lesions, a task that calls for high-resolution inputs, which typically incur high memory usage and slow inference. It introduces HRDecoder, a simple high-resolution decoder framework with two components: an HR representation learning module that crops and learns from high-resolution detail, and an HR fusion module that combines global LR predictions with detailed HR predictions in a parameter-free way. Training combines a segmentation loss with an HR-specific Dice loss, $L = L_{Seg} + \lambda L_{HR}$, with $\lambda$ set to 0.1, so that the model learns fine-grained local features while preserving context. Evaluations on IDRiD and DDR demonstrate state-of-the-art or competitive performance with favorable memory and speed characteristics, suggesting practical applicability for clinical fundus image analysis.
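
To make the loss combination concrete, here is a minimal sketch in plain Python. It assumes a standard soft Dice formulation over flattened probabilities; the summary does not spell out the exact Dice variant or the form of the segmentation loss, so `soft_dice_loss`, `total_loss`, and the `eps` smoothing term are illustrative names and choices, not the authors' implementation. Only the weighting (λ = 0.1) comes from the summary above.

```python
def soft_dice_loss(probs, targets, eps=1e-6):
    """Soft Dice loss over flat lists of predicted probabilities and 0/1 labels.

    Assumed formulation: 1 - (2 * |P ∩ T| + eps) / (|P| + |T| + eps).
    """
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


def total_loss(l_seg, l_hr, lam=0.1):
    """Combine the two terms as L = L_Seg + lambda * L_HR (lambda = 0.1 in the summary)."""
    return l_seg + lam * l_hr


# Hypothetical values: l_seg stands in for whatever segmentation loss is used.
hr_probs = [0.9, 0.1, 0.8, 0.2]
hr_labels = [1, 0, 1, 0]
l_hr = soft_dice_loss(hr_probs, hr_labels)
loss = total_loss(l_seg=0.5, l_hr=l_hr)
```

With a small weight such as 0.1, the HR term acts as an auxiliary signal on the high-resolution crops rather than dominating the main segmentation objective.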

Abstract

High resolution is crucial for precise segmentation in fundus images, yet handling high-resolution inputs incurs considerable GPU memory costs, with diminishing performance gains as overhead increases. To address this issue while tackling the challenge of segmenting tiny objects, recent studies have explored local-global fusion methods. These methods preserve fine details using local regions and capture long-range context information from downscaled global images. However, the necessity of multiple forward passes inevitably incurs significant computational overhead, adversely affecting inference speed. In this paper, we propose HRDecoder, a simple High-Resolution Decoder network for fundus lesion segmentation. It integrates a high-resolution representation learning module to capture fine-grained local features and a high-resolution fusion module to fuse multi-scale predictions. Our method effectively improves the overall segmentation accuracy of fundus lesions while consuming reasonable memory and computational overhead, and maintaining satisfying inference speed. Experimental results on the IDRiD and DDR datasets demonstrate the effectiveness of our method. Code is available at https://github.com/CVIU-CSU/HRDecoder.
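
The idea of fusing multi-scale predictions without learned parameters can be illustrated with a small sketch. The abstract does not state the fusion rule, so everything here is an assumption: the low-resolution prediction is upsampled by nearest-neighbor interpolation, and each HR crop prediction is averaged into the region it covers. The names `upsample_nearest` and `fuse_predictions`, the integer `scale`, and the `(top, left, crop)` tuple layout are hypothetical.

```python
def upsample_nearest(pred, scale):
    """Nearest-neighbor upsampling of a 2D list by an integer factor."""
    height, width = len(pred), len(pred[0])
    return [
        [pred[y // scale][x // scale] for x in range(width * scale)]
        for y in range(height * scale)
    ]


def fuse_predictions(lr_pred, hr_crops, scale):
    """Fuse a global LR prediction with local HR crop predictions.

    Assumed rule (no learned weights): upsample the LR map, then average
    each HR crop with the upsampled values at its location.
    hr_crops is a list of (top, left, crop) with crop given as a 2D list.
    """
    fused = upsample_nearest(lr_pred, scale)
    for top, left, crop in hr_crops:
        for dy, row in enumerate(crop):
            for dx, value in enumerate(row):
                y, x = top + dy, left + dx
                fused[y][x] = 0.5 * (fused[y][x] + value)
    return fused


# Toy example: a 2x2 LR map upsampled to 4x4, with one 2x2 HR crop at the top-left.
lr_pred = [[0.0, 1.0], [1.0, 0.0]]
hr_crops = [(0, 0, [[1.0, 1.0], [1.0, 1.0]])]
fused = fuse_predictions(lr_pred, hr_crops, scale=2)
```

Pixels outside any HR crop keep the upsampled global prediction, while covered pixels blend local detail with global context. In this sketch overlapping crops would be averaged sequentially, which a fuller implementation would likely handle with per-pixel counts.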

Paper Structure

This paper contains 15 sections, 6 equations, 3 figures, 4 tables.

Figures (3)

  • Figure 1: (a) Our method reaches SOTA performance and is memory efficient; the numbers represent input resolutions. (b) Example of tiny lesions in fundus images. (c) Performance gains on each category.
  • Figure 2: Overview of HRDecoder. (a) Training and testing pipeline. (b) The HR representation learning module learns local detailed features from simulated HR feature maps. (c) The HR fusion module aggregates multi-scale predictions.
  • Figure 3: Ablation on HR representation learning module. (a) Influence of the number of HR crops $M$. (b) Impact of crop factor $\delta$. (c) Impact of HR loss weight $\lambda$.