Uncertainty-guided annotation enhances segmentation with the human-in-the-loop

Nadieh Khalili; Joey Spronck; Francesco Ciompi; Jeroen van der Laak; Geert Litjens

TL;DR

This work tackles domain shift and opacity in deep learning for pathology segmentation by introducing Uncertainty-Guided Annotation (UGA), a human-in-the-loop framework. UGA uses ensembles of pathology-adapted nnU-Net to produce per-pixel uncertainty maps and selects high-uncertainty patches for clinician correction, retraining the model with combined central and local data. On Camelyon16/17 datasets, UGA improves Dice coefficient from a baseline of 0.66 to 0.76 with 5 patches and 0.84 with 10 patches, outperforming random sampling. The approach supports privacy-preserving collaboration and aligns with federated learning, offering a scalable path to robust, clinician-guided continual learning in distributed clinical environments.
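The selection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the summary does not state which uncertainty measure or patch-level aggregation UGA uses, so the binary entropy of the ensemble-mean tumor probability and the per-patch mean are assumptions, and the function names `pixel_uncertainty` and `rank_patches` are hypothetical.

```python
import numpy as np


def pixel_uncertainty(member_probs):
    """Per-pixel binary entropy (in bits) of the ensemble-mean tumor probability.

    member_probs: array of shape (n_members, H, W) with values in [0, 1],
    one probability map per ensemble member.
    Entropy is an assumed uncertainty measure, not necessarily the paper's.
    """
    # Average the members, then clip to avoid log(0).
    p = np.clip(np.mean(member_probs, axis=0), 1e-7, 1 - 1e-7)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))


def rank_patches(patch_probs, k=5):
    """Return the indices of the k most uncertain patches and all patch scores.

    patch_probs: array of shape (n_patches, n_members, H, W).
    The patch score is the mean pixel uncertainty (an assumed aggregation).
    """
    scores = np.array([pixel_uncertainty(p).mean() for p in patch_probs])
    order = np.argsort(-scores)  # descending: most uncertain first
    return order[:k].tolist(), scores
```

In a workflow like the one described, the returned indices would identify the patches sent to a clinician for correction; the corrected patches would then be added to the central training data before retraining.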

Abstract

Deep learning algorithms, often criticized for their 'black box' nature, have traditionally fallen short of the transparency needed for trusted clinical use. This challenge is particularly evident when such models are deployed in local hospitals, where they encounter out-of-domain distributions due to varying imaging techniques and patient-specific pathologies. Yet this limitation also offers an opportunity for continual learning. The Uncertainty-Guided Annotation (UGA) framework introduces a human-in-the-loop approach that enables AI to convey its uncertainties to clinicians, effectively acting as an automated quality control mechanism. UGA eases this interaction by quantifying uncertainty at the pixel level, thereby revealing the model's limitations and opening the door to clinician-guided corrections. We evaluated UGA on the Camelyon dataset for lymph node metastasis segmentation: UGA improved the Dice coefficient (DC) from 0.66 to 0.76 by adding 5 patches, and further to 0.84 with 10 patches. To foster broader application and community contribution, we have made our code accessible at

Paper Structure (5 sections, 4 figures)

Introduction
Data
Uncertainty-guided annotation sampling
Experiments and Results
Conclusion

Figures (4)

Figure 1: Illustration of the collaborative segmentation method using UGA. The segmentation model trained on the central dataset (RUMC) is applied to five centers: RUMC, UMCU, LabPON, CWZ, and RST. Ensembles of nnU-Net quantify the uncertainty per patch, and the patches are sorted in descending order from most to least uncertain. In the human-in-the-loop step, a reviewer inspects the cases with the highest uncertainty (only 5 patches) and corrects their segmentation. A new version of the model is then trained on a combination of central and local data.
Figure 2: Overview of the color variation observed among different centers in the RUMC, UMCU, RST, CWZ, and LabPON datasets.
Figure 3: The model trained solely on RUMC data is applied to datasets from five different centers. The graph shows both aggregated patch-level uncertainty values and the corresponding DC.
Figure 4: Segmentation performance, quantified using DC, compared across three training strategies: the UGA approach, random sampling, and a baseline model trained exclusively on data from RUMC. (a) Segmentation performance of the baseline model and after continued training on 5 and 10 additional patches. (b) Segmentation performance across the RST, RUMC, UMCU, CWZ, and LabPON centers. (c) Segmentation performance for different cancer types, including negative samples, isolated tumor cells (ITC), micro-metastases, and macro-metastases.
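
The DC reported in these figures is the Dice coefficient, which for a predicted mask P and a reference mask G is 2|P ∩ G| / (|P| + |G|). A minimal sketch of that computation on binary masks is given below. The function name `dice_coefficient` is hypothetical, and returning 1.0 when both masks are empty is an assumed convention (relevant for negative samples), not something the source specifies.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice coefficient between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    if denom == 0:
        # Both masks are empty: treated as perfect agreement (assumption).
        return 1.0
    return float(2.0 * intersection / denom)
```

For example, a prediction of two tumor pixels against a reference of one, with one pixel shared, gives 2 * 1 / (2 + 1), or about 0.67.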
