Uncertainty-Gated Region-Level Retrieval for Robust Semantic Segmentation
Shreshth Rajan, Raymond Liu
TL;DR
The paper addresses robust semantic segmentation under domain shift in outdoor street scenes. It introduces a region-level, uncertainty-gated retrieval framework that uses a memory bank of similar regions and DINOv2 embeddings to refine uncertain regions without retraining. A two-stage gating mechanism first keeps only regions with moderate epistemic uncertainty, measured by mutual information, and then ranks memory-bank candidates by semantic similarity. The best configuration reports an 11.3% increase in mean IoU while cutting retrieval cost by 87.5%. The work shows that selective retrieval, which needs no ground truth at inference time, improves both accuracy and efficiency, with practical benefits for real-time, cross-domain segmentation.
Abstract
Semantic segmentation of outdoor street scenes plays a key role in applications such as autonomous driving, mobile robotics, and assistive technology for visually impaired pedestrians. In these applications, accurately distinguishing key surfaces and objects such as roads, sidewalks, vehicles, and pedestrians is essential for maintaining safety and minimizing risk. Segmentation must be robust to different environments, lighting and weather conditions, and sensor noise, while running in real time. We propose a region-level, uncertainty-gated retrieval mechanism that improves segmentation accuracy and calibration under domain shift. Our best method achieves an 11.3% increase in mean intersection-over-union while reducing retrieval cost by 87.5%: it retrieves for only 12.5% of regions, compared to 100% for an always-on baseline.
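The two-stage gate summarized above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that epistemic uncertainty is estimated from several stochastic forward passes (for example MC dropout or an ensemble), that mutual information takes the standard form "entropy of the mean prediction minus mean entropy of the predictions", and that semantic similarity is cosine similarity between region embeddings and memory-bank embeddings. The function names, the thresholds `mi_low` and `mi_high`, and the array layouts are all hypothetical.

```python
import numpy as np


def mutual_information(probs, eps=1e-12):
    """Per-region mutual information from stochastic predictions.

    probs: array of shape (T, R, C) holding class probabilities from
           T stochastic passes for R regions and C classes (assumed layout).
    Returns an array of shape (R,): H[mean_t p] - mean_t H[p].
    """
    mean_p = probs.mean(axis=0)
    entropy_of_mean = -(mean_p * np.log(mean_p + eps)).sum(axis=-1)
    mean_entropy = -(probs * np.log(probs + eps)).sum(axis=-1).mean(axis=0)
    return entropy_of_mean - mean_entropy


def gate_and_rank(probs, region_emb, bank_emb, mi_low, mi_high, k=1):
    """Stage 1: keep regions whose mutual information is in [mi_low, mi_high].
    Stage 2: for each kept region, rank memory-bank entries by cosine
    similarity and return the indices of the top k.

    region_emb: (R, D) region embeddings (e.g. pooled DINOv2 features).
    bank_emb:   (M, D) memory-bank embeddings.
    Returns (gated, top): region indices (G,) and bank indices (G, k).
    """
    mi = mutual_information(probs)
    gated = np.flatnonzero((mi >= mi_low) & (mi <= mi_high))

    # L2-normalize so the dot product equals cosine similarity.
    q = region_emb[gated]
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    b = bank_emb / np.linalg.norm(bank_emb, axis=1, keepdims=True)

    sims = q @ b.T                           # (G, M)
    top = np.argsort(-sims, axis=1)[:, :k]   # most similar first
    return gated, top
```

Only the regions returned in `gated` trigger a memory-bank lookup, so the retrieval cost scales with the gated fraction. Under this cost model, retrieving for 12.5% of regions corresponds to the stated 87.5% reduction relative to an always-on baseline (1 - 0.125 = 0.875).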
