Bayesian uncertainty-weighted loss for improved generalisability on polyp segmentation task

Rebecca S. Stone, Pedro E. Chavarrias-Solano, Andrew J. Bulpitt, David C. Hogg, Sharib Ali

TL;DR

The paper addresses generalisability and fairness in polyp segmentation across multi-center colonoscopy datasets. It extends a Bayesian bias mitigation framework to semantic segmentation by training a DeepLabV3+ model with a posterior over weights learned via SG-MCMC, computing predictive mean $\mu_i$ and predictive uncertainty $\sigma_i$, and optimizing an uncertainty-weighted loss $\hat{L}(\hat{y}_i, y_i) = L_{CE}(\hat{y}_i, y_i) \cdot (1.0 + \sigma_{i,y_i})^{\kappa}$ to emphasize uncertain regions. The authors adapt this approach to PolypGen and demonstrate that it matches or exceeds state-of-the-art performance while substantially reducing generalisation gaps across unseen centers and modalities, with particular gains on sequence data ($\approx$3% Dice improvement on C6-SEQ). Additionally, uncertainty maps produced during inference offer a potential tool for clinicians to identify challenging cases, supporting fairer and more reliable deployment in practice.
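The weighting in the loss above can be illustrated for a single pixel. The following is a minimal sketch, not the authors' implementation: it assumes that $\mu_i$ and $\sigma_i$ are the per-class mean and standard deviation of softmax outputs across a handful of posterior weight samples, and the helper names `predictive_moments` and `uncertainty_weighted_ce` are hypothetical. In a real segmentation pipeline the same arithmetic would be applied to every pixel of a prediction map.

```python
import math
from statistics import fmean, pstdev


def predictive_moments(samples):
    """Per-class mean and std over M posterior samples for one pixel.

    samples: list of M class-probability vectors, one per posterior
    weight sample. Using the population std as sigma is an assumption.
    """
    n_classes = len(samples[0])
    mu = [fmean(s[c] for s in samples) for c in range(n_classes)]
    sigma = [pstdev([s[c] for s in samples]) for c in range(n_classes)]
    return mu, sigma


def uncertainty_weighted_ce(probs, label, sigma, kappa=1.0, eps=1e-12):
    """Cross entropy for one pixel, scaled by (1 + sigma_{i, y_i}) ** kappa."""
    ce = -math.log(max(probs[label], eps))  # standard CE on the true class
    return ce * (1.0 + sigma[label]) ** kappa


# Two hypothetical posterior samples for one pixel with two classes.
samples = [[0.9, 0.1], [0.5, 0.5]]
mu, sigma = predictive_moments(samples)

# Using the predictive mean as the prediction is an illustrative choice.
loss = uncertainty_weighted_ce(mu, label=0, sigma=sigma, kappa=2.0)
plain = uncertainty_weighted_ce(mu, label=0, sigma=sigma, kappa=0.0)
```

With $\kappa = 0$ the weight is 1 and the expression reduces to ordinary cross entropy; larger $\kappa$ increases the contribution of pixels whose true-class probability varies more across posterior samples.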

Abstract

While several previous studies have devised methods for segmentation of polyps, most of these methods are not rigorously assessed on multi-center datasets. Variability due to appearance of polyps from one center to another, difference in endoscopic instrument grades, and acquisition quality result in methods with good performance on in-distribution test data, and poor performance on out-of-distribution or underrepresented samples. Unfair models have serious implications and pose a critical challenge to clinical applications. We adapt an implicit bias mitigation method which leverages Bayesian predictive uncertainties during training to encourage the model to focus on underrepresented sample regions. We demonstrate the potential of this approach to improve generalisability without sacrificing state-of-the-art performance on a challenging multi-center polyp segmentation dataset (PolypGen) with different centers and image modalities.

Paper Structure (7 sections, 4 equations, 3 figures, 1 table)

Figures (3)

  • Figure 1: Pixel-wise weighting of cross entropy (CE) loss contribution based on predictive uncertainty maps for each training sample; the model is encouraged to focus on regions for which it is more uncertain.
  • Figure 2: Samples from the EndoCV2021 dataset; from (top) C1-5 single frames and (bottom) C1-5-SEQ; (top) highlights the data distribution of each center (C1-C5), which consists of curated frames with well-defined polyps; (bottom) demonstrates the variability of sequential data due to the presence of artifacts, occlusions, and polyps with different morphology.
  • Figure 3: Performance gaps of the three models (state-of-the-art deterministic DeepLabV3+, BayDeepLabV3+, and BayDeepLabV3+Unc) between the three different test sets; (top) comparing performance on single vs. sequence frames from out-of-distribution test set C6 (C6-SIN vs. C6-SEQ), and (bottom) sequence frames from C1-C5 vs. unseen C6 (C1-5-SEQ vs. C6-SEQ). The subtext above bars indicates the percent decrease in performance gap compared to SOTA; a larger percent decrease and shorter vertical bar length indicate better generalisability.
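
The percent decrease in performance gap described for Figure 3 can be sketched as simple arithmetic. This is an illustrative reading only: it assumes the gap is the absolute difference in a metric such as Dice between two test sets, and that the percent decrease is measured relative to the baseline (SOTA) gap. The function names are hypothetical and the scores below are made-up placeholders, not results from the paper.

```python
def performance_gap(score_a, score_b):
    """Absolute difference in a metric (e.g. Dice) between two test sets."""
    return abs(score_a - score_b)


def percent_gap_decrease(gap_baseline, gap_model):
    """Percent reduction of a model's gap relative to the baseline gap."""
    if gap_baseline == 0:
        raise ValueError("baseline gap is zero; percent decrease is undefined")
    return 100.0 * (gap_baseline - gap_model) / gap_baseline


# Placeholder scores for illustration only (not taken from the paper).
baseline_gap = performance_gap(0.80, 0.70)  # hypothetical baseline on two test sets
model_gap = performance_gap(0.80, 0.75)     # hypothetical model on the same sets
decrease = percent_gap_decrease(baseline_gap, model_gap)
```

Under this reading, a model that halves the baseline gap scores a 50% decrease, and a negative value would mean the gap grew.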