Controlling False Positives in Image Segmentation via Conformal Prediction

Luca Mossina, Corentin Friedrich

TL;DR

This work tackles the lack of statistical guarantees in semantic segmentation by introducing a post-hoc conformal prediction framework that produces a confidence mask ${I}_{\lambda}(X)$ by shrinkage of a pretrained prediction $\hat{Y}$, either through sigmoid-score thresholding or morphological erosion. Using a labeled calibration set, it selects a single shrink parameter via inductive conformal prediction to ensure that the accepted false-positive proportion in the confidence mask is at most a user-specified level $\tau$ with probability at least $1-\alpha$ for new images that are exchangeable with the calibration data. The approach is model-agnostic and requires no retraining, providing finite-sample, distribution-free guarantees at the image level, and yields a clear uncertainty region ${U}_{\lambda}(X)=\hat{Y}\setminus {I}_{\lambda}(X)$. Experiments on a polyp segmentation benchmark show that the conformalized methods achieve empirical validity close to the nominal target across $\tau$ values, while offering a transparent trade-off between mask contraction and FP control. This framework enables practical, risk-aware segmentation in clinical settings where over-segmentation can have significant consequences and supports evaluation of third-party predictors under rigorous guarantees.
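
The calibration step described above can be illustrated with a minimal Python sketch for the score-thresholding variant. This is not the authors' implementation, and several details are assumptions made for illustration: the function names are hypothetical, the false-positive proportion is taken as the share of confidence-mask pixels lying outside the ground truth, an empty mask is assigned a proportion of 0, the threshold grid starts at the base threshold that defines $\hat{Y}$ (0.5 here), and each image's score is the smallest grid value from which the proportion stays at or below $\tau$ for all larger thresholds. The shrink parameter is then the $\lceil (n+1)(1-\alpha) \rceil$-th smallest calibration score, as in standard split conformal prediction.

```python
import math


def fp_proportion(scores, truth, lam):
    """Share of pixels kept at threshold lam that lie outside the ground truth."""
    kept = [t for s, t in zip(scores, truth) if s >= lam]
    if not kept:
        return 0.0  # assumption: an empty confidence mask has no false positives
    return sum(1 for t in kept if t == 0) / len(kept)


def image_score(scores, truth, grid, tau):
    """Smallest grid threshold such that the FP proportion is <= tau there and
    at every larger grid threshold (assumed score definition; math.inf means
    only the empty mask is safe for this image)."""
    best = math.inf
    for lam in sorted(grid, reverse=True):
        if fp_proportion(scores, truth, lam) > tau:
            break
        best = lam
    return best


def calibrate(cal_set, grid, tau, alpha):
    """Pick the shrink parameter as a conformal quantile of per-image scores."""
    scores = sorted(image_score(s, t, grid, tau) for s, t in cal_set)
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration images: fall back to the empty mask
    return scores[k - 1]


def confidence_and_uncertainty(scores, lam, base_threshold=0.5):
    """Split the prediction into a confidence mask and an uncertainty region."""
    pred = [s >= base_threshold for s in scores]
    inner = [p and s >= lam for p, s in zip(pred, scores)]
    uncertain = [p and not i for p, i in zip(pred, inner)]
    return inner, uncertain
```

Here each image is a flat list of sigmoid scores paired with a flat list of 0/1 ground-truth labels. With three toy calibration images whose scores are 0.5, 0.7 and `math.inf`, calling `calibrate` with `alpha=0.5` returns the second-smallest score, 0.7; pixels of $\hat{Y}$ scoring below 0.7 then fall into the uncertainty region.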

Abstract

Reliable semantic segmentation is essential for clinical decision making, yet deep models rarely provide explicit statistical guarantees on their errors. We introduce a simple post-hoc framework that constructs confidence masks with distribution-free, image-level control of false-positive predictions. Given any pretrained segmentation model, we define a nested family of shrunken masks obtained either by increasing the score threshold or by applying morphological erosion. A labeled calibration set is used to select a single shrink parameter via conformal prediction, ensuring that, for new images that are exchangeable with the calibration data, the proportion of false positives retained in the confidence mask stays below a user-specified tolerance with high probability. The method is model-agnostic, requires no retraining, and provides finite-sample guarantees regardless of the underlying predictor. Experiments on a polyp-segmentation benchmark demonstrate target-level empirical validity. Our framework enables practical, risk-aware segmentation in settings where over-segmentation can have clinical consequences. Code at https://github.com/deel-ai-papers/conseco.
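
The nested family of eroded masks mentioned above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the code from the repository: the helper names are hypothetical, the structuring element is assumed to be a 4-neighbour cross, pixels outside the image are treated as background, and the shrink parameter $\lambda$ is taken to be the number of erosion steps.

```python
def erode_once(mask):
    """One binary erosion step with a 4-neighbour (cross) structuring element.
    Pixels outside the image are treated as background (assumption)."""
    h, w = len(mask), len(mask[0])

    def on(r, c):
        return 0 <= r < h and 0 <= c < w and bool(mask[r][c])

    return [
        [
            on(r, c) and on(r - 1, c) and on(r + 1, c) and on(r, c - 1) and on(r, c + 1)
            for c in range(w)
        ]
        for r in range(h)
    ]


def eroded_family(pred_mask, max_steps):
    """Masks indexed by lam = 0..max_steps, where lam counts erosion steps.
    Each mask is contained in the previous one, so the family is nested."""
    family = [[[bool(v) for v in row] for row in pred_mask]]
    for _ in range(max_steps):
        family.append(erode_once(family[-1]))
    return family


def uncertainty_region(pred_mask, inner_mask):
    """Predicted pixels that were removed by shrinking."""
    return [
        [bool(p) and not i for p, i in zip(p_row, i_row)]
        for p_row, i_row in zip(pred_mask, inner_mask)
    ]


def area(mask):
    """Number of foreground pixels in a mask."""
    return sum(sum(1 for v in row if v) for row in mask)
```

For a 3×3 block of predicted pixels centred in a 5×5 image, one erosion step keeps only the centre pixel and a second step leaves the mask empty, so the areas along the family are 9, 1 and 0. Because the family is nested, a single integer $\lambda$ selected on the calibration set orders all candidate confidence masks from least to most conservative.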

Paper Structure

This paper contains 9 sections, 9 equations, 2 figures, 1 table.

Figures (2)

  • Figure 1: Example with erosion inner mask ${I}_{\lambda}^\varepsilon(X)$ at $\tau=0.01$ and $1-\alpha=0.9$. From the left: (i) ground-truth mask $Y$ overlaid on input image $X$; (ii) true positives and false positives in $\hat{Y}$, and $Y$ pixels missed (false negatives); (iii) inner "confidence" mask ${I}_{\lambda}^\varepsilon(X)$ (dark grey) and uncertainty "rejection" mask ${U}_{\lambda}(X)$ (light grey); (iv) $\hat{Y}$ is shrunk to ${I}_{\lambda}^\varepsilon(X)$. ${U}_{\lambda}(X)$ rejects most FPs but also some TPs, i.e. ground-truth pixels that were well segmented in $\hat{Y}$. Colors mark the true mask $Y$, false positives (FP), true positives (TP), and rejected FPs.
  • Figure 2: Examples with thresholding inner mask ${I}_{\lambda}^\sigma(X)$ at $\tau=0.01$ and $1-\alpha=0.9$. Top: large FP removal with moderate TP loss. Middle: when FPs are already negligible, shrinkage removes TPs. Bottom: failure case in which the residual inner mask is FP-only. Colors mark the true mask $Y$, false positives, true positives, and rejected false positives.