Mitigating the Bias in the Model for Continual Test-Time Adaptation

Inseop Chung; Kyomin Hwang; Jayeon Yoo; Nojun Kwak

TL;DR

Continual Test-Time Adaptation (CTA) faces non-stationary target distributions, and predictions become biased and overconfident as the model adapts online. The paper introduces two key components: (i) an EMA Target Domain Prototypical Loss, which maintains class-wise target prototypes P^t, updates them with reliable target samples, and uses them to cluster target features by class, and (ii) Source Distribution Alignment via Prototype Matching, which minimizes the distance between target features and precomputed source prototypes P^s to constrain drift away from the source. The overall objective adds λ_{ema} L_{ema} and λ_{src} L_{src} to an unsupervised CTA loss, so the method can be plugged into existing CTA methods. Empirically, the approach yields notable improvements on ImageNet-C and CIFAR100-C with minimal adaptation-time overhead, while also reducing prediction bias and improving calibration.

Abstract

Continual Test-Time Adaptation (CTA) is a challenging task that aims to adapt a source pre-trained model to continually changing target domains. In the CTA setting, a model does not know when the target domain changes, and thus faces drastic changes in the distribution of streaming inputs at test time. The key challenge is to keep adapting the model to the continually changing target domains in an online manner. We find that a model shows highly biased predictions as it constantly adapts to the changing distribution of the target data. It predicts certain classes more often than others, making inaccurate, over-confident predictions. This paper mitigates this issue to improve performance in the CTA scenario. To alleviate the bias issue, we build class-wise exponential moving average target prototypes from reliable target samples and exploit them to cluster the target features class-wise. Moreover, we aim to align the target distributions to the source distribution by anchoring each target feature to its corresponding source prototype. In extensive experiments, our proposed method achieves a noteworthy performance gain when applied on top of existing CTA methods, without substantial adaptation-time overhead.
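The two terms described in the abstract can be illustrated with a minimal pure-Python sketch. This section does not give the exact loss formulas, so everything concrete here is an assumption made for illustration: the EMA prototypical loss is written as cross-entropy over cosine similarities to the target prototypes, the source alignment term as a mean squared distance to the source prototype of the pseudo-label, and "reliable" is taken to mean low prediction entropy. The function names, `alpha`, `entropy_threshold`, and the default weights are hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def cross_entropy(logits, label):
    """Numerically stable cross-entropy for a single sample."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[label]


def ema_prototype_loss(feature, pseudo_label, target_protos):
    """Assumed form of L_ema: cross-entropy over cosine similarities to P^t."""
    f = l2_normalize(feature)
    logits = [sum(a * b for a, b in zip(f, l2_normalize(p))) for p in target_protos]
    return cross_entropy(logits, pseudo_label)


def source_alignment_loss(feature, pseudo_label, source_protos):
    """Assumed form of L_src: mean squared distance to the matching P^s."""
    proto = source_protos[pseudo_label]
    return sum((a - b) ** 2 for a, b in zip(feature, proto)) / len(feature)


def update_target_prototype(target_protos, feature, pseudo_label, alpha=0.9):
    """EMA update of one class prototype, in place (alpha is a hypothetical value)."""
    old = target_protos[pseudo_label]
    target_protos[pseudo_label] = [alpha * o + (1 - alpha) * f for o, f in zip(old, feature)]


def total_loss(l_unsup, l_ema, l_src, lambda_ema=1.0, lambda_src=1.0):
    """Overall objective: L_unsup + lambda_ema * L_ema + lambda_src * L_src."""
    return l_unsup + lambda_ema * l_ema + lambda_src * l_src


def adapt_step(feature, pseudo_label, entropy, l_unsup, target_protos, source_protos,
               alpha=0.9, entropy_threshold=0.4, lambda_ema=1.0, lambda_src=1.0):
    """One illustrative step: compute the losses first, then update P^t."""
    l_ema = ema_prototype_loss(feature, pseudo_label, target_protos)
    l_src = source_alignment_loss(feature, pseudo_label, source_protos)
    loss = total_loss(l_unsup, l_ema, l_src, lambda_ema, lambda_src)
    # Assumption: only low-entropy (reliable) samples update the prototypes.
    if entropy < entropy_threshold:
        update_target_prototype(target_protos, feature, pseudo_label, alpha)
    return loss


if __name__ == "__main__":
    source_protos = [[1.0, 0.0], [0.0, 1.0]]      # stand-in for precomputed P^s
    target_protos = [p[:] for p in source_protos]  # toy initialisation of P^t
    loss = adapt_step([0.8, 0.2], 0, entropy=0.1, l_unsup=0.5,
                      target_protos=target_protos, source_protos=source_protos)
    print(loss, target_protos[0])
```

In a real pipeline the features would come from the adapting feature extractor and the losses would be backpropagated by an autodiff framework. The sketch only shows how the terms are combined and the order of operations, with the loss computed before the prototype update.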

Paper Structure (23 sections, 6 equations, 8 figures, 9 tables, 1 algorithm)


Introduction
Related Works
Test-Time Adaptation
Continual Test-Time Adaptation (CTA)
CTA under Dynamic Scenarios
Problem Definition
Proposed Method
EMA Target Domain Prototypical Loss
Source Distribution Alignment via Prototype Matching
Overall Objective
Experiments
Performance Comparison
Analysis
Conclusion
Social Impacts
...and 8 more sections

Figures (8)

Figure 1: Comparison of the number of predicted samples per class and distribution of confidence between EATA and EATA+Ours.
Figure 2: Before deploying the model, we generate the source prototypes ($P^s$) using a subset of the source data and the source pre-trained feature extractor, $f_{\phi_0}$. After the model is deployed to the target domain, it adapts to the target data by minimizing our proposed terms $\mathcal{L}_{ema}$ and $\mathcal{L}_{src}$ along with $\mathcal{L}_{unsup}$. We construct class-wise target prototypes ($P^t$) that are updated with target features through an exponential moving average (EMA). We use $P^t$ to compute $\mathcal{L}_{ema}$ and $P^s$ to compute $\mathcal{L}_{src}$. Note that $\mathcal{L}_{ema}$ is computed first, and the target prototypes are updated afterwards. The dotted line indicates the flow of required information, such as the entropy and pseudo-label of the input.
Figure 3: Comparison of average adaptation time of a single batch across target domains on ImageNet-C.
Figure 4: Analysis of batch size, $\alpha$, $\lambda_{ema}$ and $\lambda_{src}$ on ImageNet-C. (a) presents a comparison between EATA and EATA+Ours with varying batch sizes, while (b), (c), and (d) show performance analysis using different $\alpha$, $\lambda_{ema}$ and $\lambda_{src}$ employed in our method. Accuracy (%) is the average accuracy over the 16 test domains.
Figure 5: Feature space distance analysis. (a) plots the domain gap between the source and the target. (b) and (c) show the intra-class and the inter-class distance, respectively, while (d) presents the ratio (intra/inter) of the two distances.
...and 3 more figures
