Variational Continual Test-Time Adaptation
Fan Lyu, Kaile Du, Yuyang Li, Hanyu Zhao, Fuyuan Hu, Zhang Zhang, Guangcan Liu, Liang Wang
TL;DR
VCoTTA tackles continual test-time adaptation by injecting uncertainty into a pretrained model through a variational warm-up, which turns it into a Bayesian neural network, and by using a mean-teacher framework at test time to supervise online adaptation. The method introduces an adaptive prior mixture that combines the source prior and the teacher prior; the resulting ELBO is expressed as the cross-entropy between student and teacher plus the KL divergence to the mixed prior. This uncertainty-aware approach mitigates error accumulation under persistent domain shifts and yields improved calibration and robustness across CTTA benchmarks. The work demonstrates that dynamically weighting priors based on uncertainty can outperform existing CTTA strategies, especially in long-horizon, unlabeled settings.
Abstract
The Continual Test-Time Adaptation (CTTA) task investigates effective domain adaptation under continuous domain shifts at test time. Because only unlabeled samples are available, model updates carry significant uncertainty, which causes CTTA to suffer from severe error accumulation. In this paper, we introduce VCoTTA, a variational Bayesian approach to measuring uncertainties in CTTA. At the source stage, we transform a pretrained deterministic model into a Bayesian Neural Network (BNN) via a variational warm-up strategy, injecting uncertainties into the model. At test time, we employ a mean-teacher update strategy, using variational inference for the student model and an exponential moving average for the teacher model. Our approach updates the student model by combining priors from both the source and teacher models. The evidence lower bound is formulated as the cross-entropy between the student and teacher models, along with the Kullback-Leibler (KL) divergence to the prior mixture. Experimental results on three datasets demonstrate the method's effectiveness in mitigating error accumulation within the CTTA framework.
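To make the objective described in the abstract more concrete, the following is a minimal NumPy sketch of one plausible reading of it, not the authors' implementation. It assumes mean-field (diagonal) Gaussian posteriors and priors, and it replaces the KL divergence to the prior mixture with the weighted sum of the two component KL terms, which is an upper bound on the mixture KL by convexity. All names (`ema_update`, `kl_diag_gauss`, `student_loss`, `alpha`, `kl_weight`, `decay`) are hypothetical, and the mixing weight `alpha` is passed in by hand rather than derived from uncertainty as in the paper.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ema_update(teacher, student, decay=0.99):
    # Exponential moving average: the teacher drifts slowly toward the student.
    return decay * teacher + (1.0 - decay) * student

def kl_diag_gauss(mu_q, sigma_q, mu_p, sigma_p):
    # KL(q || p) between diagonal Gaussians, summed over all dimensions.
    return np.sum(
        np.log(sigma_p / sigma_q)
        + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2.0 * sigma_p ** 2)
        - 0.5
    )

def soft_cross_entropy(teacher_probs, student_logits):
    # Cross-entropy of student predictions against the teacher's soft labels.
    log_p = np.log(softmax(student_logits) + 1e-12)
    return -np.mean(np.sum(teacher_probs * log_p, axis=-1))

def student_loss(student_logits, teacher_probs, q, source_prior,
                 teacher_prior, alpha, kl_weight=1.0):
    # q, source_prior, teacher_prior are (mu, sigma) tuples.
    # ASSUMPTION: the mixture KL is approximated by its convex upper bound
    #   alpha * KL(q || p_source) + (1 - alpha) * KL(q || p_teacher).
    ce = soft_cross_entropy(teacher_probs, student_logits)
    kl = (alpha * kl_diag_gauss(*q, *source_prior)
          + (1.0 - alpha) * kl_diag_gauss(*q, *teacher_prior))
    return ce + kl_weight * kl

# Toy usage with random values standing in for a real model.
rng = np.random.default_rng(0)
q = (rng.normal(size=4), np.full(4, 0.5))          # student posterior
source_prior = (np.zeros(4), np.ones(4))           # from the warm-up stage
teacher_prior = (rng.normal(size=4), np.full(4, 0.7))
student_logits = rng.normal(size=(8, 3))
teacher_probs = softmax(rng.normal(size=(8, 3)))

loss = student_loss(student_logits, teacher_probs, q,
                    source_prior, teacher_prior, alpha=0.3)
# After a student step, the teacher mean would be refreshed by EMA.
new_teacher_mu = ema_update(teacher_prior[0], q[0])
```

In this sketch, a larger `alpha` pulls the student toward the source prior, while a smaller `alpha` lets it follow the teacher; how that weight is set from uncertainty is specified in the paper, not here.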
