
Segment Every Out-of-Distribution Object

Wenjie Zhao, Jia Li, Xin Dong, Yu Xiang, Yunhui Guo

TL;DR

S2M (anomaly Score To segmentation Mask) is a simple and effective framework for OoD detection in semantic segmentation. It transforms anomaly scores into prompts for a promptable segmentation model, which eliminates the need for threshold selection.

Abstract

Semantic segmentation models are effective for in-distribution categories but face challenges in real-world deployment because they encounter out-of-distribution (OoD) objects. Detecting these OoD objects is crucial for safety-critical applications. Existing methods rely on anomaly scores, but choosing a suitable threshold for turning those scores into masks is difficult and can lead to fragmented and inaccurate masks. This paper introduces a method to convert anomaly Score To segmentation Mask, called S2M, a simple and effective framework for OoD detection in semantic segmentation. Rather than assigning anomaly scores to pixels, S2M directly segments the entire OoD object. By transforming anomaly scores into prompts for a promptable segmentation model, S2M eliminates the need for threshold selection. Extensive experiments demonstrate that S2M outperforms the state-of-the-art by approximately 20% in IoU and 40% in mean F1 score, on average, across various benchmarks including the Fishyscapes, Segment-Me-If-You-Can, and RoadAnomaly datasets.
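To make the threshold problem concrete, the following is a minimal sketch, not code from the paper. It thresholds a hand-made 4x4 anomaly score map at three values and computes the IoU of each resulting mask against a toy ground-truth object. The score values, the thresholds, and the helper names `iou` and `threshold_mask` are all illustrative assumptions.

```python
def iou(pred, target):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    pairs = [(p, t) for pr, tr in zip(pred, target) for p, t in zip(pr, tr)]
    inter = sum(p & t for p, t in pairs)
    union = sum(p | t for p, t in pairs)
    return inter / union if union else 1.0


def threshold_mask(scores, thr):
    """Binary mask of the pixels whose anomaly score exceeds thr."""
    return [[int(s > thr) for s in row] for row in scores]


# Toy anomaly scores (assumed values): the object is the centre 2x2 block,
# with one weakly scored object pixel (0.4) and one noisy background pixel (0.7).
scores = [
    [0.1, 0.2, 0.1, 0.7],
    [0.1, 0.9, 0.8, 0.1],
    [0.2, 0.4, 0.9, 0.1],
    [0.1, 0.1, 0.2, 0.1],
]
target = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

# IoU of the thresholded mask for each candidate threshold.
iou_by_threshold = {
    thr: iou(threshold_mask(scores, thr), target) for thr in (0.3, 0.5, 0.75)
}
for thr, value in iou_by_threshold.items():
    print(f"threshold={thr}: IoU={value:.2f}")
```

In this toy case the low threshold keeps the noisy background pixel, the high threshold drops the weakly scored object pixel, and the middle one does both, so none of the three recovers the object exactly.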

Paper Structure (26 sections, 10 equations, 14 figures, 9 tables)

Figures (14)

  • Figure 1: Compared with the state-of-the-art Out-of-Distribution (OoD) detection methods in semantic segmentation, our method excels in producing high-quality masks for OoD objects. The top row displays several real-world images, highlighting anomalous objects with blue bounding boxes. Subsequent rows present masks generated by different methods for the OoD objects, including PEBAL, RPL, and our method S2M. For PEBAL and RPL, the masks are derived from anomaly scores using the optimal threshold specific to each dataset. Unlike other methods that frequently generate noise outside of OoD objects and exhibit fragmented masks, S2M delivers precise masks for the OoD object.
  • Figure 2: Existing anomaly score-based OoD detection methods such as RPL are sensitive to the choice of threshold, while S2M eliminates the need for threshold selection, which makes it more practical. S2M also produces a more precise mask.
  • Figure 3: Overview of the training pipeline. We freeze the OoD detector and train only the prompt generator.
  • Figure 4: Box prompts can lead to more accurate segmentation for the OoD objects compared to point prompts. We derive point prompts from the locations corresponding to the extreme values of the anomaly scores. Visual analysis indicates that box prompts substantially improve the model's tolerance to noise. We employ all generated box prompts to obtain a comprehensive mask of the OoD object, which ensures that the entire object is covered.
  • Figure 5: Overview of the inference pipeline.
  • ...and 9 more figures
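
The two prompt styles described in the Figure 4 caption can be sketched roughly as follows. This is a hedged illustration and not the paper's implementation. `peak_point` picks the location of the highest anomaly score as a point prompt, and `union_masks` merges the masks obtained from several box prompts. `fill_box` is a hypothetical stand-in for a promptable segmentation model, and the boxes are hand-picked here rather than produced by a trained prompt generator.

```python
def peak_point(scores):
    """Return (row, col) of the highest anomaly score, a candidate point prompt."""
    coords = [(r, c) for r, row in enumerate(scores) for c in range(len(row))]
    return max(coords, key=lambda rc: scores[rc[0]][rc[1]])


def fill_box(shape, box):
    """Hypothetical stand-in for a promptable segmenter.

    Marks every pixel inside box = (r0, c0, r1, c1), bounds inclusive.
    A real model would return the object mask prompted by the box.
    """
    height, width = shape
    r0, c0, r1, c1 = box
    return [
        [int(r0 <= r <= r1 and c0 <= c <= c1) for c in range(width)]
        for r in range(height)
    ]


def union_masks(masks):
    """Pixel-wise OR of same-sized binary masks (nested lists of 0/1)."""
    return [[int(any(px)) for px in zip(*rows)] for rows in zip(*masks)]


# Toy anomaly scores (assumed values): the object is the centre 2x2 block,
# but the single highest score is a noisy background pixel at (0, 3).
scores = [
    [0.1, 0.2, 0.1, 0.95],
    [0.1, 0.9, 0.8, 0.1],
    [0.2, 0.4, 0.9, 0.1],
    [0.1, 0.1, 0.2, 0.1],
]

point = peak_point(scores)           # lands on the noisy pixel, off the object
boxes = [(1, 1, 1, 2), (2, 1, 2, 2)]  # hand-picked boxes, each covering half the object
combined = union_masks([fill_box((4, 4), b) for b in boxes])
```

In this toy example the point prompt lands on the noise pixel rather than on the object, while the union over both box masks covers the whole 2x2 object, which mirrors the caption's argument for using all generated box prompts.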