Conformal Semantic Image Segmentation: Post-hoc Quantification of Predictive Uncertainty

Luca Mossina; Joseba Dalmau; Léo Andéol

TL;DR

This paper tackles uncertainty quantification for semantic image segmentation with a post-hoc, model-agnostic conformal prediction framework. It constructs multi-labeled, pixel-wise prediction sets parameterized by $\lambda$ and selects an optimal $\hat{\lambda}$ using Conformal Risk Control so that the expected loss satisfies $\mathbb{E}[\ell(\mathcal{C}_{\hat{\lambda}}(X),Y)] \le \alpha$, which gives a finite-sample guarantee on ground-truth coverage. Uncertainty is visualized with varisco heatmaps, which depict per-pixel label inclusion; the lightweight, scalable approach is evaluated on Cityscapes, ADE20K, and LoveDA. The resulting uncertainty diagnostics are practical and interpretable, and they are compatible with any segmentation predictor that outputs per-pixel softmax scores. This supports safer deployment and leaves room for extensions to panoptic segmentation and real-time data streams.

Abstract

We propose a post-hoc, computationally lightweight method to quantify predictive uncertainty in semantic image segmentation. Our approach uses conformal prediction to generate statistically valid prediction sets that are guaranteed to include the ground-truth segmentation mask at a predefined confidence level. We introduce a novel visualization technique of conformalized predictions based on heatmaps, and provide metrics to assess their empirical validity. We demonstrate the effectiveness of our approach on well-known benchmark datasets and image segmentation prediction models, and conclude with practical insights.

Paper Structure

This paper contains 23 sections, 1 theorem, 16 equations, 5 figures, 2 tables, 2 algorithms.

Introduction
Contributions
Background
Semantic Image Segmentation.
Conformal Prediction.
Conformal Risk Control.
Related works
Uncertainty quantification methods for semantic image segmentation
Calibration of image segmentation
Applications of CP to segmentation
Conformal Semantic Segmentation
Multi-labeled masks.
Nested multi-labeled masks.
Conformal Risk Control for multi-labeled mask
Computing the optimal $\hat{\lambda}$.
...and 8 more sections

Key Result

Theorem 4.1

Assume that the $L_i(\lambda)$ are non-increasing, right-continuous and bounded by $B<+\infty$. Assume that there exists ${\lambda_{\text{max}}} \in [0,1]$ such that $L_i({\lambda_{\text{max}}}) \leq \alpha$. Assume further that $L_1(\lambda),\dots,L_{n+1}(\lambda)$ form an exchangeable sequence. Le

Figures (5)

Figure 1: Top: A predicted semantic segmentation mask, overlaid on the input image, for the Cityscapes dataset (Cordts_2016_Cityscapes). Bottom: A varisco uncertainty heatmap, for a user-defined risk $\alpha = 0.01$ and a minimum coverage ratio $\tau$ of $99\%$; it is defined in \ref{eq:prediction-set-lac} and statistically valid in the sense of the CRC guarantee \ref{eq:crc-exp-value-guarantee}: every pixel is a prediction set that contains the highest scoring label (top-1) but potentially also the second, third, etc., highest scoring labels.
Figure 2: For three (arbitrary) values $\lambda \in \{0.99, 0.999, 0.9999\}$, we apply \ref{eq:cp-lac-threshold} to every pixel and obtain varisco heatmaps, for the Cityscapes dataset (Cordts_2016_Cityscapes). The CRC algorithm described in \ref{sec:crc-algo} searches for the optimal $\lambda$ such that, for a given conformalization loss and a risk level $\alpha$, the guarantee in \ref{eq:crc-exp-value-guarantee} is attainable.
Figure 3: For the same risk level $\alpha=0.01$, different losses yield different heatmaps: (left) binary loss $\ell_{\text{bin}}$, (center) binary loss with threshold $\ell_{\tau}$, (right) miscoverage loss $\ell$. If the notion of risk is too restrictive, the prediction set will be theoretically valid but not very informative. In this example, most pixels in the left figure (binary loss, $\tau = 1.0$) are red, indicating that $K$ (out of $K$) classes are in the prediction set. Dataset: Cityscapes (Cordts_2016_Cityscapes).
Figure 4: Visualization of a varisco heatmap (miscoverage loss, $\alpha = 0.01$) for the ADE20K dataset (Zhou_2017_Scene; Zhou_2019_Semantic): (left) input image, (center) predicted segmentation mask, (right) varisco heatmap.
Figure 5: Visualization of a varisco heatmap (miscoverage loss, $\alpha = 0.01$) for the LoveDA dataset (Wang_2021_LoveDA; Wang_2021_LoveDA_dataset): (left) input image, (center) predicted segmentation mask, (right) varisco heatmap.
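
The captions above describe each pixel as a prediction set that always contains the top-1 label and may contain further labels depending on $\lambda$. A minimal sketch of that construction follows. It assumes a threshold rule of the form "include label $k$ when its softmax score is at least $1 - \lambda$", which is consistent with the values of $\lambda$ close to 1 used in Figure 2 but is not spelled out on this page. The function names `prediction_sets` and `varisco_heatmap` are hypothetical, and the heatmap is taken to be the per-pixel set size, as suggested by Figure 3.

```python
import numpy as np


def prediction_sets(softmax: np.ndarray, lam: float) -> np.ndarray:
    """Return a boolean (H, W, K) mask of the labels kept at each pixel.

    softmax: array of shape (H, W, K) with per-pixel class scores.
    lam:     threshold parameter in [0, 1]; larger values keep more labels.

    Assumed rule (not confirmed by the source): keep label k when
    softmax[i, j, k] >= 1 - lam, and always keep the top-1 label.
    """
    mask = softmax >= 1.0 - lam
    # Force the highest scoring label into every pixel's set.
    top1 = softmax.argmax(axis=-1)
    np.put_along_axis(mask, top1[..., None], True, axis=-1)
    return mask


def varisco_heatmap(softmax: np.ndarray, lam: float) -> np.ndarray:
    """Return an (H, W) integer map: number of labels in each pixel's set."""
    return prediction_sets(softmax, lam).sum(axis=-1)
```

Under this rule the sets are nested: raising `lam` can only add labels to a pixel's set, so the heatmap values are non-decreasing in `lam`. A value of 1 means the set is just the top-1 label, and a value of `K` means every class was kept.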

Theorems & Definitions (1)

Theorem 4.1: Theorem 1 in Angelopoulos_2024_CRC.
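
To make the role of $\hat{\lambda}$ concrete, here is a hedged sketch of how a Conformal Risk Control threshold could be selected from calibration losses. It uses the selection rule from Theorem 1 of the cited CRC paper, $\hat{\lambda} = \inf\{\lambda : \tfrac{n}{n+1}\hat{R}_n(\lambda) + \tfrac{B}{n+1} \le \alpha\}$, approximated on a finite grid. The grid search, the function names `miscoverage_loss` and `crc_lambda`, and the $1 - \lambda$ threshold rule for the prediction sets are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def miscoverage_loss(softmax: np.ndarray, labels: np.ndarray, lam: float) -> float:
    """Fraction of pixels whose prediction set misses the true label.

    softmax: (H, W, K) per-pixel class scores.
    labels:  (H, W) integer ground-truth mask.
    Assumed set rule: {k : score_k >= 1 - lam} plus the top-1 label.
    """
    true_score = np.take_along_axis(softmax, labels[..., None], axis=-1)[..., 0]
    covered = (true_score >= 1.0 - lam) | (softmax.argmax(axis=-1) == labels)
    return float(1.0 - covered.mean())


def crc_lambda(losses: np.ndarray, lambdas: np.ndarray, alpha: float,
               bound: float = 1.0) -> float:
    """Pick the smallest grid value of lambda that satisfies the CRC condition.

    losses:  (n, m) array, losses[i, j] = L_i(lambdas[j]) for calibration
             image i; assumed non-increasing along j and bounded by `bound`.
    lambdas: (m,) grid sorted in ascending order.
    alpha:   target risk level.
    """
    n = losses.shape[0]
    # Finite-sample adjusted empirical risk for every grid value.
    adjusted = (n / (n + 1)) * losses.mean(axis=0) + bound / (n + 1)
    feasible = np.flatnonzero(adjusted <= alpha)
    if feasible.size == 0:
        raise ValueError("no lambda on the grid satisfies the risk level")
    return float(lambdas[feasible[0]])
```

In use, one would fill `losses` by evaluating `miscoverage_loss` on each calibration image for each grid value and then call `crc_lambda(losses, lambdas, alpha=0.01)`. Because the losses are non-increasing in $\lambda$, the grid value returned is never smaller than the exact infimum, so the choice errs on the conservative side. The condition can only be met when $\alpha \ge B/(n+1)$, which is why very small risk levels need a sufficiently large calibration set.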
