Image-level Regression for Uncertainty-aware Retinal Image Segmentation
Trung Dang, Huy Hoang Nguyen, Aleksei Tiulpin
TL;DR
This work addresses retinal vessel segmentation under annotator uncertainty by reframing the task as image-level regression. It introduces SAUNA, a transform that converts hard ground-truth masks into signed soft labels based on each pixel's proximity to the annotation boundary and on vessel thickness, enabling uncertainty-aware learning without multiple annotations. The method combines a generalized Jaccard Metric Loss operating on a [-1,1]^D domain with a stable Focal-L1 loss for pixel-level regression, yielding strong performance with low-resolution (LR) inputs at a much lower computational cost than high-resolution (HR) baselines. Across five retinal datasets, the approach improves IoU, Dice, and accuracy, and generalizes better to external datasets, especially when paired with UNet-like architectures. The results suggest that uncertainty-aware, regression-based segmentation can match or exceed the performance of heavier baselines at much greater throughput.
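To make the idea of signed soft labels concrete, the following is a minimal sketch and not the SAUNA formula itself, which is not given here. It assumes that foreground pixels receive positive values that grow with distance from the annotation boundary, normalized by the largest inside distance as a crude global proxy for vessel thickness, and that background pixels receive negative values that saturate at -1. The function names and the `max_outside_dist` parameter are hypothetical. The brute-force distance computation only suits small masks; a library distance transform such as `scipy.ndimage.distance_transform_edt` would be the practical choice.

```python
import numpy as np


def nearest_distance(points, targets):
    """Euclidean distance from each row of `points` to its nearest row of `targets`."""
    diff = points[:, None, :] - targets[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)


def signed_soft_labels(mask, max_outside_dist=3.0):
    """Map a binary mask to signed soft labels in [-1, 1].

    Illustrative assumption, not the paper's definition:
      * foreground: distance to the nearest background pixel, divided by the
        largest such distance (a global stand-in for vessel thickness);
      * background: negative distance to the nearest foreground pixel,
        divided by `max_outside_dist` and clipped at -1.
    """
    mask = mask.astype(bool)
    soft = np.zeros(mask.shape, dtype=np.float64)
    fg = np.argwhere(mask).astype(np.float64)
    bg = np.argwhere(~mask).astype(np.float64)

    # Degenerate masks: no boundary exists, so labels are fully confident.
    if len(fg) == 0:
        soft[:] = -1.0
        return soft
    if len(bg) == 0:
        soft[:] = 1.0
        return soft

    d_in = nearest_distance(fg, bg)   # >= 1 for every foreground pixel
    d_out = nearest_distance(bg, fg)  # >= 1 for every background pixel

    # np.argwhere and boolean indexing both use row-major order,
    # so the computed values line up with the masked positions.
    soft[mask] = d_in / d_in.max()
    soft[~mask] = -np.minimum(d_out / max_outside_dist, 1.0)
    return soft


# Toy example: a vertical "vessel" three pixels wide.
mask = np.zeros((5, 7), dtype=np.uint8)
mask[:, 2:5] = 1
soft = signed_soft_labels(mask)
```

In this toy example the vessel centerline gets the most confident positive label, pixels on the vessel edge get a weaker positive label, and background pixels become more negative as they move away from the vessel.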
Abstract
Accurate retinal vessel (RV) segmentation is a crucial step in the quantitative assessment of retinal vasculature, which is needed for the early detection of retinal diseases and other conditions. Numerous studies have tackled automatic vessel segmentation using a pixel-wise classification approach. The common practice for creating ground-truth labels is to categorize pixels as either foreground or background. This approach is, however, biased, and it ignores the uncertainty of a human annotator when annotating, for example, thin vessels. In this work, we propose a simple and effective method that casts the RV segmentation task as an image-level regression. For this purpose, we first introduce a novel Segmentation Annotation Uncertainty-Aware (SAUNA) transform, which adds pixel uncertainty to the ground truth using the pixel's closeness to the annotation boundary and vessel thickness. To train our model with soft labels, we generalize the earlier proposed Jaccard metric loss to arbitrary hypercubes for soft Jaccard index (Intersection-over-Union) optimization. Additionally, we employ a stable version of the Focal-L1 loss for pixel-wise regression. We conduct thorough experiments and compare our method to a diverse set of baselines across five retinal image datasets. Our empirical results indicate that the integration of the SAUNA transform and these segmentation losses leads to significant performance boosts for different segmentation models. In particular, our methodology enables UNet-like architectures to substantially outperform computationally intensive baselines. Our implementation is available at https://github.com/Oulu-IMEDS/SAUNA.
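As a rough illustration of optimizing a soft Jaccard index on signed labels, the sketch below uses one simple route: it rescales values from an interval [low, high] to [0, 1] and then applies an L1-based soft Jaccard. This affine rescaling is an assumption made for illustration and is not necessarily the generalization proposed in the paper. The function name and the `low`, `high`, and `eps` parameters are hypothetical.

```python
import numpy as np


def soft_jaccard_loss(pred, target, low=-1.0, high=1.0, eps=1e-7):
    """1 - soft IoU for labels living in the hypercube [low, high]^D.

    Illustrative assumption: inputs are clipped and affinely mapped to
    [0, 1], then an L1-based soft Jaccard is applied:
        IoU = (|p|_1 + |t|_1 - |p - t|_1) / (|p|_1 + |t|_1 + |p - t|_1)
    For hard {0, 1} labels this reduces to the usual set-based IoU.
    """
    p = (np.clip(pred, low, high) - low) / (high - low)
    t = (np.clip(target, low, high) - low) / (high - low)
    total = p.sum() + t.sum()
    diff = np.abs(p - t).sum()
    # eps keeps the ratio defined when both inputs are entirely background.
    return 1.0 - (total - diff + eps) / (total + diff + eps)


# Signed hard labels: -1 is background, +1 is vessel.
target = np.array([-1.0, 1.0, 1.0, -1.0])
perfect = soft_jaccard_loss(target, target)
partial = soft_jaccard_loss(np.array([1.0, 1.0, -1.0, -1.0]), target)
```

Here `perfect` is zero up to floating-point error because the prediction equals the target, while `partial` corresponds to an intersection of one pixel over a union of three. For training, the same arithmetic would be written with a differentiable tensor library rather than NumPy.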
