Instance-Aware Test-Time Segmentation for Continual Domain Shifts

Seunghwan Lee, Inyoung Jung, Hojoon Lee, Eunil Park, Sungeun Hong

TL;DR

This work addresses semantic segmentation under continual domain shifts by introducing CoTICA, a framework that performs instance- and class-aware test-time adaptation. ICAT provides per-instance, per-class adaptive thresholds for pseudo labeling, while ICWL uses temporally smoothed class weights to focus learning on consistently hard classes. Together, they yield robust, long-term adaptation across diverse CTTA and TTA benchmarks, outperforming prior methods and reducing error accumulation. The approach advances practical continual adaptation in real-world, non-stationary environments, where pixel-level predictions must remain reliable over time.

Abstract

Continual Test-Time Adaptation (CTTA) enables pre-trained models to adapt to continuously evolving domains. Existing methods have improved robustness but typically rely on fixed or batch-level thresholds, which cannot account for varying difficulty across classes and instances. This limitation is especially problematic in semantic segmentation, where each image requires dense, multi-class predictions. We propose an approach that adaptively adjusts pseudo labels to reflect the confidence distribution within each image and dynamically balances learning toward classes most affected by domain shifts. This fine-grained, class- and instance-aware adaptation produces more reliable supervision and mitigates error accumulation throughout continual adaptation. Extensive experiments across eight CTTA and TTA scenarios, including synthetic-to-real and long-term shifts, show that our method consistently outperforms state-of-the-art techniques, setting a new standard for semantic segmentation under evolving conditions.
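
To make the contrast with a fixed threshold concrete, here is a minimal sketch of per-image, per-class pseudo-label filtering. It is an illustration only, not the paper's ICAT rule: the abstract does not give the threshold formula, so the choice below (a class's threshold is the mean top-1 confidence of the pixels predicted as that class in the same image) is an assumption, and the names `per_image_class_thresholds`, `pseudo_labels`, and `IGNORE` are hypothetical.

```python
from collections import defaultdict

IGNORE = -1  # hypothetical marker for pixels excluded from the loss


def top1(p):
    """Return the index of the most probable class for one pixel."""
    return max(range(len(p)), key=p.__getitem__)


def per_image_class_thresholds(probs):
    """probs: per-pixel class-probability vectors for ONE image (flattened).

    Assumed rule (not from the paper): the threshold of class c is the mean
    top-1 confidence over the pixels of this image predicted as c.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for p in probs:
        c = top1(p)
        sums[c] += p[c]
        counts[c] += 1
    return {c: sums[c] / counts[c] for c in sums}


def pseudo_labels(probs):
    """Keep a pixel's predicted class only if it clears that class's threshold."""
    thr = per_image_class_thresholds(probs)
    labels = []
    for p in probs:
        c = top1(p)
        labels.append(c if p[c] >= thr[c] else IGNORE)
    return labels


# Four pixels, two classes; class 1 is the "hard" class with low confidence.
probs = [[0.9, 0.1], [0.6, 0.4], [0.4, 0.6], [0.45, 0.55]]
labels = pseudo_labels(probs)
fixed = [top1(p) if max(p) >= 0.7 else IGNORE for p in probs]
```

In this toy image, a fixed threshold of 0.7 discards every pixel of the low-confidence class, while the per-image, per-class thresholds keep that class's most confident pixel. This is the kind of class- and instance-level variation the abstract argues fixed or batch-level thresholds cannot capture.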

Paper Structure

This paper contains 29 sections, 9 equations, 9 figures, 6 tables, 1 algorithm.

Figures (9)

  • Figure 1: Examples of semantic segmentation challenges under varying domain shifts, highlighting class-specific variations. (a) Input images captured under nighttime and snowy conditions. (b) Model predictions reveal distinct class-specific challenges: under streetlights at night, the sky is misclassified as trees, while in snowy conditions, snow-covered roads are confused with sidewalks. (c) Ground truth annotations.
  • Figure 2: Comparison of CTTA thresholding. Same colors indicate the same threshold. (a) Fixed threshold in classification: a single value applied to the whole sample. (b) Fixed threshold in segmentation: all pixels share the same threshold, causing pseudo-label errors. (c) Batch-level adaptive threshold: class-wise but shared across instances. (d) Ours: class thresholds are dynamically adjusted per instance for better segmentation adaptation.
  • Figure 3: Class-wise mIoU (%) comparison across different models under environmental changes on the ACDC dataset. The results highlight the need for adaptive strategies tailored to each class to improve segmentation performance.
  • Figure 4: Overview of the proposed method. 1) A student-teacher model adapts to evolving domains using EMA updates. Conditional augmentation enhances pseudo-label robustness. 2) ICAT adjusts pseudo-label thresholds per instance and class using the teacher model’s confidence distribution. The process refines thresholds dynamically to improve adaptation. 3) ICWL computes class-wise importance weights from multiple augmented samples. These weights stabilize adaptation through moving-average updates. 4) Domain-wise Class Thresholds across different domains.
  • Figure 5: Qualitative comparison. Yellow bounding boxes highlight regions that were challenging for prior methods and where the proposed method improves the segmentation.
  • ...and 4 more figures
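
The Figure 4 caption mentions two moving-average mechanisms: EMA updates of the teacher from the student, and moving-average smoothing of the ICWL class weights. Both can be sketched with the same exponential moving average. The snippet below is a hedged illustration: the function name `ema`, the momentum values, and the example numbers are assumptions, and how ICWL derives its per-step class weights from augmented samples is not specified in this summary, so those weights are treated as a given input.

```python
def ema(previous, current, momentum):
    """Elementwise exponential moving average:
    new = momentum * previous + (1 - momentum) * current.
    """
    return [momentum * p + (1.0 - momentum) * c for p, c in zip(previous, current)]


# 1) Teacher update: teacher parameters slowly track the student's.
#    Parameters are flattened to plain lists for illustration; 0.5 is a toy
#    momentum (values close to 1 are typical for teacher models).
teacher_params = [1.0, 0.0]
student_params = [0.0, 1.0]
teacher_params = ema(teacher_params, student_params, momentum=0.5)

# 2) Class-weight smoothing: per-class importance weights for the current
#    step (hypothetical values) are blended with the running weights so that
#    a single noisy step cannot swing the loss weighting.
running_weights = [1.0, 1.0, 1.0]
step_weights = [1.0, 3.0, 1.0]  # assumed: class 1 looked hard at this step
running_weights = ema(running_weights, step_weights, momentum=0.5)
```

With a higher momentum the running weights change more slowly, which matches the caption's description of stabilizing adaptation through moving-average updates.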