Uncertainty-Gated Region-Level Retrieval for Robust Semantic Segmentation

Shreshth Rajan, Raymond Liu

TL;DR

The paper addresses robust semantic segmentation under domain shift in outdoor street scenes. It introduces a region-level, uncertainty-gated retrieval framework that uses a memory bank of similar regions and DINOv2 embeddings to refine uncertain regions without retraining. A two-stage gating mechanism first filters regions by moderate epistemic uncertainty, measured by mutual information, and then ranks them by semantic similarity; it achieves substantial IoU gains while sharply cutting retrieval cost. The work shows that selective, GT-free retrieval improves both accuracy and efficiency, which matters for real-time, cross-domain segmentation.
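
To make the two-stage gate concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes epistemic uncertainty is estimated as the mutual information between predictions and model parameters over several stochastic forward passes, that "moderate" means a band `[mi_low, mi_high]`, and that surviving regions are ranked by their best cosine similarity to memory-bank embeddings. The function names, the thresholds, and the toy two-dimensional embeddings (standing in for DINOv2 features) are all hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete probability vector."""
    return -sum(x * math.log(x) for x in p if x > 0.0)


def mutual_information(samples):
    """Epistemic uncertainty for one region.

    samples: list of T class-probability vectors, one per stochastic pass.
    MI = H(mean prediction) - mean(H(individual predictions)).
    """
    t = len(samples)
    n_classes = len(samples[0])
    mean_p = [sum(s[c] for s in samples) / t for c in range(n_classes)]
    return entropy(mean_p) - sum(entropy(s) for s in samples) / t


def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def gate_regions(regions, memory_bank, mi_low, mi_high, top_k):
    """Return ids of regions selected for retrieval (assumed gating rule)."""
    # Stage 1: keep only regions whose MI falls in the moderate band.
    candidates = [
        r for r in regions
        if mi_low <= mutual_information(r["samples"]) <= mi_high
    ]
    # Stage 2: rank survivors by best similarity to any memory-bank entry.
    scored = [
        (max((cosine(r["embedding"], m) for m in memory_bank), default=0.0),
         r["id"])
        for r in candidates
    ]
    scored.sort(reverse=True)
    return [region_id for _, region_id in scored[:top_k]]


# Toy data: two classes, two stochastic passes per region.
regions = [
    {"id": "A", "samples": [[0.99, 0.01], [0.99, 0.01]], "embedding": [1.0, 0.0]},
    {"id": "B", "samples": [[0.9, 0.1], [0.1, 0.9]], "embedding": [1.0, 0.0]},
    {"id": "C", "samples": [[0.7, 0.3], [0.5, 0.5]], "embedding": [0.0, 1.0]},
    {"id": "D", "samples": [[0.8, 0.2], [0.5, 0.5]], "embedding": [1.0, 0.1]},
]
memory_bank = [[1.0, 0.0]]

selected = gate_regions(regions, memory_bank, mi_low=0.01, mi_high=0.3, top_k=1)
print(selected)
```

In this toy setup, region A is dropped because its passes agree (MI near zero) and region B because they disagree too strongly, so only C and D reach the similarity ranking. Retrieval would then run only for the selected ids, which is how a gate of this kind cuts cost relative to an always-on baseline.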

Abstract

Semantic segmentation of outdoor street scenes plays a key role in applications such as autonomous driving, mobile robotics, and assistive technology for visually impaired pedestrians. For these applications, accurately distinguishing between key surfaces and objects such as roads, sidewalks, vehicles, and pedestrians is essential for maintaining safety and minimizing risks. Semantic segmentation must be robust to different environments, lighting and weather conditions, and sensor noise, while running in real time. We propose a region-level, uncertainty-gated retrieval mechanism that improves segmentation accuracy and calibration under domain shift. Our best method achieves an 11.3% increase in mean intersection-over-union while reducing retrieval cost by 87.5%, retrieving for only 12.5% of regions compared to 100% for the always-on baseline.

Paper Structure

This paper contains 10 sections and 5 figures.

Figures (5)

  • Figure 1: Retrieval is used on regions of high uncertainty to improve the robustness of outputs from a lightweight segmentation model.
  • Figure 2: Uncertainty increases with severity of certain data augmentations.
  • Figure 3: (A) Combining uncertainty and performance outperforms individual signals. (B,C) Gating on the top 25% of regions based on the combined metric achieves the best cost-performance tradeoff.
  • Figure 4: Overview of tested gating methods.
  • Figure 5: Final results for refined uncertainty and similarity gating.