Mitigating the ID-OOD Tradeoff in Open-Set Test-Time Adaptation

Wenjie Zhao, Jia Li, Xin Dong, Yapeng Tian, Yu Xiang, Yunhui Guo

Abstract

Open-set test-time adaptation (OSTTA) addresses the challenge of adapting models to new environments where out-of-distribution (OOD) samples coexist with in-distribution (ID) samples affected by distribution shifts. In such settings, covariate shift (for example, changes in weather conditions such as snow) can alter ID samples, reducing model reliability. Consequently, models must not only correctly classify covariate-shifted ID (csID) samples but also effectively reject covariate-shifted OOD (csOOD) samples. Entropy minimization is a common strategy in test-time adaptation to maintain ID performance under distribution shifts, while entropy maximization is widely applied to enhance OOD detection. Several studies have sought to combine these objectives to tackle the challenges of OSTTA. However, the intrinsic conflict between entropy minimization and maximization inevitably leads to a trade-off between csID classification and csOOD detection. In this paper, we first analyze the limitations of entropy maximization in OSTTA and then introduce an angular loss to regulate feature norm magnitudes, along with a feature-norm loss to suppress csOOD logits, thereby improving OOD detection. These objectives form ROSETTA, a $\underline{r}$obust $\underline{o}$pen-$\underline{se}$t $\underline{t}$est-$\underline{t}$ime $\underline{a}$daptation method. Our method achieves strong OOD detection while maintaining high ID classification performance on CIFAR-10-C, CIFAR-100-C, Tiny-ImageNet-C, and ImageNet-C. Furthermore, experiments on Cityscapes validate the method's effectiveness in real-world semantic segmentation, and results on the HAC dataset demonstrate its applicability across different open-set TTA setups.
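
To make the conflict between the two entropy objectives concrete, the following is a minimal NumPy sketch of the baseline strategy the abstract describes: minimize prediction entropy on samples presumed to be csID and maximize it on samples presumed to be csOOD. It is not the ROSETTA objective. The function names, the entropy threshold used to split samples, and the toy logits are all assumptions made for illustration.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def entropy(logits):
    # Shannon entropy (in nats) of the softmax prediction for each sample.
    p = softmax(logits)
    return -(p * np.log(p + 1e-12)).sum(axis=-1)

def combined_entropy_objective(logits, threshold):
    # Hypothetical combined objective (an assumption, not the paper's loss):
    # samples with entropy below `threshold` are presumed csID and contribute
    # +H (to be minimized); the rest are presumed csOOD and contribute -H
    # (so that minimizing the loss maximizes their entropy).
    h = entropy(logits)
    presumed_id = h < threshold
    signs = np.where(presumed_id, 1.0, -1.0)
    return (signs * h).mean(), presumed_id

# Toy batch: one confident prediction and one uniform prediction over 4 classes.
confident = np.array([8.0, 0.0, 0.0, 0.0])
uniform = np.zeros(4)
batch = np.stack([confident, uniform])

loss, presumed_id = combined_entropy_objective(batch, threshold=0.5 * np.log(4))
```

Because the sign of each sample's term depends on a hard entropy threshold, a csID sample whose entropy is raised by corruption falls on the wrong side of the split and is pushed toward higher entropy, which is one way the trade-off described above can arise.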

Paper Structure

This paper contains 23 sections, 9 equations, 8 figures, 10 tables.

Figures (8)

  • Figure 1: (a) Examples of csID and csOOD samples. Open-set TTA adapts the model to handle both simultaneously. (b) Comparison with recent state-of-the-art methods. Our method achieves strong performance on both tasks.
  • Figure 2: Examples of misclassification under the optimal entropy threshold. Green boxes represent correctly detected samples and red boxes represent misclassified ones, where csID images are often mistaken for csOOD.
  • Figure 3: $l_2$-norm of feature vectors on CIFAR-10-C under various corruption types. The x-axis corresponds to corruption types, and the y-axis indicates the $l_2$-norm of features.
  • Figure 4: Overview of the ROSETTA pipeline.
  • Figure 5: Mean sorted logits under different adaptation methods.
  • ...and 3 more figures