Multi-Scale Foreground-Background Confidence for Out-of-Distribution Segmentation

Samuel Marschall, Kira Maag

TL;DR

This work addresses open-world OOD segmentation by exploiting confidence from foreground-background segmentation and aggregating it across multiple patch scales. It introduces a two-branch architecture that optionally leverages semantic segmentation uncertainty, combining per-pixel foreground confidence with entropy and road-probability heatmaps via $\hat{\theta}(x) = \sum_{i=1}^d \alpha^i \theta^i(x)$ and $\hat{\theta}(x) * D(x)$. Empirical results on LostAndFound, RoadAnomaly21, and RoadObstacle21 show that multi-scale confidence fusion outperforms uncertainty-based baselines, with notable gains in AuPRC and FPR$_{95}$ across datasets. The method does not require retraining or external OOD data, making it particularly suitable for safety-critical applications like automated driving, where robust detection of unknown objects across scales is essential.

Abstract

Deep neural networks have shown outstanding performance in computer vision tasks such as semantic segmentation and have defined the state of the art. However, these segmentation models are trained on a closed, predefined set of semantic classes, which leads to significant prediction failures on unknown objects in open-world scenarios. Because this behavior prevents their use in safety-critical applications such as automated driving, detecting and segmenting objects from outside the predefined semantic space (out-of-distribution (OOD) objects) is of the utmost importance. In this work, we present a multi-scale OOD segmentation method that exploits the confidence information of a foreground-background segmentation model. While semantic segmentation models are trained on specific classes, this restriction does not apply to foreground-background methods, making them suitable for OOD segmentation. We consider the per-pixel confidence score of the model prediction, which is close to 1 for a pixel in a foreground object. By aggregating these confidence values over patches of different sizes, objects of various sizes can be identified in a single image. Our experiments show that our method improves OOD segmentation performance over comparable baselines on the SegmentMeIfYouCan benchmark.
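
The multi-scale aggregation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`patch_confidence`, `aggregate`, `toy_fg_model`), the non-overlapping patch grid, the equal weights, and the random stand-in for the uncertainty heatmap $D(x)$ are all assumptions made for illustration. The sketch also assumes that the superscript in $\alpha^i$ is a scale index rather than a power, and that the combination with $D(x)$ is an element-wise product.

```python
import numpy as np


def patch_confidence(image, patch_size, fg_model):
    """Run fg_model on non-overlapping patches and stitch a full-size heatmap.

    The non-overlapping grid is an assumption; the paper compares several
    patch schemes that are not reproduced here.
    """
    h, w = image.shape[:2]
    heatmap = np.zeros((h, w), dtype=np.float64)
    for top in range(0, h, patch_size):
        for left in range(0, w, patch_size):
            patch = image[top:top + patch_size, left:left + patch_size]
            heatmap[top:top + patch_size, left:left + patch_size] = fg_model(patch)
    return heatmap


def aggregate(heatmaps, alphas):
    """Weighted sum over scales: theta_hat(x) = sum_i alpha^i * theta^i(x)."""
    stacked = np.stack(heatmaps)  # shape (d, H, W)
    weights = np.asarray(alphas, dtype=np.float64).reshape(-1, 1, 1)
    return (weights * stacked).sum(axis=0)


def toy_fg_model(patch):
    """Hypothetical stand-in for a foreground-background network.

    Marks pixels brighter than the patch mean as foreground (confidence 1.0),
    all others as background (confidence 0.0).
    """
    gray = patch.mean(axis=-1)
    return (gray > gray.mean()).astype(np.float64)


rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))      # placeholder RGB image
patch_sizes = [64, 32, 16]           # assumed scales, d = 3
alphas = [1 / 3, 1 / 3, 1 / 3]       # assumed equal weights

heatmaps = [patch_confidence(image, s, toy_fg_model) for s in patch_sizes]
theta_hat = aggregate(heatmaps, alphas)

# Placeholder for the semantic-segmentation uncertainty heatmap D(x).
D = rng.random((64, 64))
ood_score = theta_hat * D            # element-wise combination (assumed)
```

In practice, `toy_fg_model` would be replaced by a pretrained foreground-background network and `D` by a heatmap derived from a semantic segmentation model (for example entropy); the weights and patch sizes would need to be chosen per dataset.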

Paper Structure

This paper contains 17 sections, 4 equations, 4 figures, 10 tables.

Figures (4)

  • Figure 1: Top: Semantic segmentation predicted by a DNN. Bottom: Confidence heatmap obtained by our method.
  • Figure 2: Schematic illustration of our multi-scale OOD segmentation method. In one branch, the input image is divided into slices of different sizes, each slice is processed by the foreground-background model, and the resulting confidence heatmaps are aggregated into a single output map. In the other branch, the image is fed into a semantic segmentation network that outputs an uncertainty heatmap, which is then combined with the confidence map of the foreground-background model to obtain the final OOD segmentation.
  • Figure 3: Top: RGB images from the LostAndFound, RoadAnomaly21 and RoadObstacle21 datasets. Bottom: The corresponding OOD segmentation heatmaps obtained by our method.
  • Figure 4: Three different patch schemes applied to the LostAndFound dataset.