Backpropagation-Free Test-Time Adaptation via Probabilistic Gaussian Alignment
Youjia Zhang, Youngeun Kim, Young-Geun Choi, Hongyeob Kim, Huiling Liu, Sungeun Hong
TL;DR
This work tackles robust test-time adaptation (TTA) for vision-language models under distribution shifts without relying on gradient-based updates. The proposed method, ADAPT, reframes TTA as Gaussian probabilistic inference: class-conditional likelihoods are modeled with a shared covariance matrix and class means that are updated from high-confidence test features stored in knowledge banks, while CLIP priors provide regularization. The result is a closed-form, training-free solution for both online and transductive settings, including Bayesian ridge-based covariance handling and one-pass label updates. Across natural shifts, corruptions, and fine-grained tasks, ADAPT achieves state-of-the-art performance among backpropagation-free methods, with strong scalability, low latency, and stability for real-world deployment.
Abstract
Test-time adaptation (TTA) enhances zero-shot robustness under distribution shifts by leveraging unlabeled test data during inference. Despite notable advances, several challenges still limit its broader applicability. First, most methods rely on backpropagation or iterative optimization, which limits scalability and hinders real-time deployment. Second, they lack explicit modeling of class-conditional feature distributions. This modeling is crucial for producing reliable decision boundaries and calibrated predictions, but it remains underexplored due to the lack of both source data and supervision at test time. In this paper, we propose ADAPT, an Advanced Distribution-Aware and backPropagation-free Test-time adaptation method. We reframe TTA as a Gaussian probabilistic inference task by modeling class-conditional likelihoods using gradually updated class means and a shared covariance matrix. This enables closed-form, training-free inference. To correct potential likelihood bias, we introduce lightweight regularization guided by CLIP priors and a historical knowledge bank. ADAPT requires no source data, no gradient updates, and no full access to target data, supporting both online and transductive settings. Extensive experiments across diverse benchmarks demonstrate that our method achieves state-of-the-art performance under a wide range of distribution shifts with superior scalability and robustness.
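
To make the idea of closed-form Gaussian inference with a shared covariance concrete, the following is a minimal sketch and not the authors' implementation. It assumes NumPy, toy 2-D features in place of CLIP embeddings, a uniform log prior standing in for CLIP-derived priors, a plain ridge term standing in for the Bayesian ridge-based covariance handling, and a simple running mean standing in for the knowledge bank. The function names `gaussian_posteriors` and `update_means`, the `ridge` value, and the confidence `threshold` are all hypothetical.

```python
import numpy as np

def gaussian_posteriors(feats, means, cov, log_prior, ridge=1e-3):
    """Class posteriors under Gaussians that share one covariance matrix.

    feats: (n, d) features; means: (k, d) class means;
    cov: (d, d) shared covariance; log_prior: (k,) log class prior.
    """
    d = cov.shape[0]
    # Assumption: a fixed ridge term keeps the covariance invertible.
    precision = np.linalg.inv(cov + ridge * np.eye(d))
    # With a shared covariance the log-likelihood is linear in x:
    # x^T P mu_k - 0.5 mu_k^T P mu_k (+ terms that do not depend on k).
    w = means @ precision                    # (k, d), row k is (P mu_k)^T
    b = -0.5 * np.sum(w * means, axis=1)     # (k,)
    logits = feats @ w.T + b + log_prior     # (n, k)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

def update_means(means, counts, feats, probs, threshold=0.9):
    """Running-mean update that uses only high-confidence predictions."""
    means, counts = means.copy(), counts.copy()
    pred = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    for x, c, p in zip(feats, pred, conf):
        if p >= threshold:
            counts[c] += 1
            means[c] += (x - means[c]) / counts[c]
    return means, counts

# Toy data: two well-separated classes (stand-ins for image features).
rng = np.random.default_rng(0)
true_means = np.array([[2.0, 0.0], [-2.0, 0.0]])
labels = rng.integers(0, 2, size=200)
feats = true_means[labels] + 0.5 * rng.normal(size=(200, 2))

# Assumption: initial means play the role of text-derived class prototypes,
# each counted as one pseudo-observation.
init_means = np.array([[1.0, 0.0], [-1.0, 0.0]])
counts = np.ones(2)
log_prior = np.log(np.full(2, 0.5))
cov = 0.25 * np.eye(2)

probs = gaussian_posteriors(feats, init_means, cov, log_prior)
accuracy = float((probs.argmax(axis=1) == labels).mean())
new_means, new_counts = update_means(init_means, counts, feats, probs)
```

No gradients or iterative optimization appear anywhere in this sketch: prediction is a single matrix product followed by a softmax, and each mean update is a constant-time running average, which is what makes this style of adaptation cheap at inference time.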
