Generalize or Detect? Towards Robust Semantic Segmentation Under Multiple Distribution Shifts

Zhitong Gao, Bingnan Li, Mathieu Salzmann, Xuming He

TL;DR

This work designs a generative augmentation method that produces coherent images incorporating both anomaly classes and various covariate shifts at the image and object levels. It also introduces a training strategy that recalibrates uncertainty specifically for semantic shifts and enhances the feature extractor to align features associated with domain shifts.

Abstract

In open-world scenarios, where both novel classes and domains may exist, an ideal segmentation model should detect anomaly classes for safety and generalize to new domains. However, existing methods often struggle to distinguish between domain-level and semantic-level distribution shifts, leading to poor out-of-distribution (OOD) detection or domain generalization performance. In this work, we aim to equip the model to generalize effectively to covariate-shift regions while precisely identifying semantic-shift regions. To achieve this, we design a novel generative augmentation method to produce coherent images that incorporate both anomaly (or novel) objects and various covariate shifts at both image and object levels. Furthermore, we introduce a training strategy that recalibrates uncertainty specifically for semantic shifts and enhances the feature extractor to align features associated with domain shifts. We validate the effectiveness of our method across benchmarks featuring both semantic and domain shifts. Our method achieves state-of-the-art performance across all benchmarks for both OOD detection and domain generalization. Code is available at https://github.com/gaozhitong/MultiShiftSeg.

Paper Structure

This paper contains 50 sections, 6 equations, 6 figures, 12 tables.

Figures (6)

  • Figure 1: We study semantic segmentation with both semantic-shift and covariate-shift regions. (a) Training for out-of-distribution (OOD) detection alone (RPL) yields high uncertainty for both types of shifts, whereas training for domain generalization (DG) alone (RobustNet) tends to produce low uncertainty for both. Our method effectively differentiates between the two, generating high uncertainty only for semantic-shift regions. (b) We achieve strong performance in both OOD detection and domain-generalized semantic segmentation. (c) This is achieved by coherently augmenting original images (first row) with both covariate and semantic shifts (second row).
  • Figure 2: Method Overview. (a) A novel generative-based data augmentation strategy that supplements training data with both covariate and semantic shifts in a coherent manner. (b) A semantic-exclusive uncertainty function with two-stage noise-aware training to encourage invariant feature learning for covariate-shift regions while maintaining high uncertainty for semantic-shift regions.
  • Figure 3: Comparison of Uncertainty Maps. Our method robustly detects anomalies under covariate shifts across five datasets (first five columns) and generated data (last column). The previous method, RPL, fails to distinguish domain shifts from semantic shifts, producing high uncertainty in both cases.
  • Figure 4: (a) Visualization of Our Selection Maps. Our selection strategy effectively identifies and removes generation errors (highlighted with boxes). (b) Analysis of Our Two-Stage Training. The first stage of training the uncertainty function boosts baseline performance, and second-stage fine-tuning further improves performance, achieving better results than single-stage training.
  • Figure 5: (a) Impact of Sample Selection Ratio. We report both anomaly segmentation performance (AP$\uparrow$, FPR$\downarrow$ on SMIYC-RA Val) and known-class segmentation performance (mIoU$\uparrow$ on MUAD). Experiments are conducted with the DeepLab v3+ architecture. (b) Impact of Generated Data Size. Under the same evaluation with the Mask2Former architecture, performance improves as the amount of generated data increases.
  • ...and 1 more figures
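
The Figure 5 caption reports mIoU for known-class segmentation and FPR for anomaly segmentation. The sketch below shows how these two metrics are commonly computed from pixel-wise predictions. It is an illustration only, not the authors' evaluation code: the function names, the `ignore_index=255` convention, and the choice of measuring FPR at a 95% true positive rate are assumptions, since the caption states only "FPR".

```python
import numpy as np

def mean_iou(pred, label, num_classes, ignore_index=255):
    """Mean IoU over the classes that occur in the prediction or the label."""
    valid = label != ignore_index            # assumed "void" label convention
    pred = pred[valid].astype(np.int64)
    label = label[valid].astype(np.int64)
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(label * num_classes + pred, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)
    inter = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - inter
    present = union > 0                      # skip classes absent from both maps
    return float((inter[present] / union[present]).mean())

def fpr_at_tpr(scores, is_anomaly, tpr_level=0.95):
    """FPR at the threshold where at least `tpr_level` of anomaly pixels are detected.

    `scores` are per-pixel uncertainty values (higher means more anomalous);
    `is_anomaly` is a boolean mask of the same shape.
    """
    pos = np.sort(scores[is_anomaly])
    neg = scores[~is_anomaly]
    # Allow at most (1 - tpr_level) of the anomaly pixels to fall below the threshold.
    k = int(np.floor((1 - tpr_level) * len(pos)))
    threshold = pos[k]
    return float((neg >= threshold).mean())
```

For example, `mean_iou(pred, label, num_classes=19)` would apply to a 19-class label map, and `fpr_at_tpr(uncertainty, anomaly_mask)` to an uncertainty map paired with a boolean anomaly mask.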