Cross-Dataset Generalization For Retinal Lesions Segmentation
Clément Playout, Farida Cheriet
TL;DR
This work tackles cross-dataset generalization for retinal lesion segmentation by characterizing multiple public fundus datasets and assessing how annotation style (coarse, fine, or mixed) affects generalization. It trains a strong UNet-based segmentation model and systematically evaluates 31 dataset-combination configurations, finding that adding coarsely labeled data to the training set can boost performance on finely labeled test sets, whereas training on coarse labels alone can hurt accuracy. The study also evaluates three generalization techniques: ensembles, Stochastic Weight Averaging (SWA), and model soups. Ensembles are the most consistently helpful but are computationally expensive, while SWA and model soups offer limited, inconsistent gains for segmentation. Overall, the work highlights practical strategies for leveraging heterogeneous datasets to improve retinal lesion segmentation and outlines avenues for future improvement via noisy-label learning and domain adaptation.
Abstract
Identifying lesions in fundus images is an important milestone toward automated and interpretable diagnosis of retinal diseases. To support research in this direction, multiple datasets have been released, each providing ground-truth maps for different lesions. However, important discrepancies exist between their annotations, which raises the question of generalization across datasets. This study characterizes several known datasets and compares different techniques that have been proposed to enhance the generalization performance of a model, such as stochastic weight averaging, model soups and ensembles. Our results provide insights into how to combine coarsely labelled data with a finely labelled dataset in order to improve lesion segmentation.
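To make the distinction between the techniques named above concrete, the following is a minimal sketch of how an ensemble differs from a uniform model soup. It is illustrative only and is not the authors' implementation: the toy per-pixel linear model, the functions `predict`, `ensemble_predict` and `uniform_soup`, and the random weights are all assumptions introduced here. An ensemble averages the predictions of several models (one forward pass per member), whereas a soup averages their weights once and then runs a single forward pass. Stochastic weight averaging follows the same weight-averaging idea, but applies it to checkpoints collected along a single training run.

```python
import numpy as np


def predict(weights, image):
    """Toy stand-in for a segmentation network: a per-pixel linear layer + sigmoid."""
    logits = image @ weights["w"] + weights["b"]  # (H, W, C) @ (C,) -> (H, W)
    return 1.0 / (1.0 + np.exp(-logits))


def ensemble_predict(all_weights, image):
    """Ensemble: run every model, then average the probability maps."""
    return np.mean([predict(w, image) for w in all_weights], axis=0)


def uniform_soup(all_weights):
    """Uniform model soup: average the parameters, yielding a single model."""
    return {
        name: np.mean([w[name] for w in all_weights], axis=0)
        for name in all_weights[0]
    }


# Hypothetical data: one 4x4 RGB image and three independently initialised models.
rng = np.random.default_rng(0)
image = rng.random((4, 4, 3))
models = [{"w": rng.normal(size=3), "b": rng.normal()} for _ in range(3)]

ensemble_map = ensemble_predict(models, image)  # three forward passes
soup_weights = uniform_soup(models)
soup_map = predict(soup_weights, image)         # one forward pass
```

The inference cost is what separates the two: the ensemble's cost grows linearly with the number of members, while the soup costs the same as a single model, which is the trade-off behind the computational expense of ensembles.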
