Space Rotation with Basis Transformation for Training-free Test-Time Adaptation
Chenhao Ding, Xinyuan Gao, Songlin Dong, Yuhang He, Qiang Wang, Xiang Song, Alex Kot, Yihong Gong
TL;DR
This work tackles test-time adaptation under distribution shift for vision-language models by addressing the rigidity of the original CLIP feature space. It introduces Space Rotation with Basis Transformation (SOBA), which builds an orthogonal basis from a covariance-informed PCA and rotates the feature space to yield clearer inter-class separation, enabling better inference without any training. A dynamic queue of pseudo-labeled samples guides basis construction, while a transformed prototype-based classifier complements the CLIP predictions, with results showing state-of-the-art performance and improved efficiency over training-based and other training-free TTA methods. The approach is simple to implement, accelerates inference, and demonstrates robust generalization across ImageNet-based OOD data and diverse cross-dataset tasks, highlighting the practical impact of feature-space redesign for test-time adaptation.
Abstract
With the growing use of vision-language models (VLMs) in downstream tasks, VLM-based test-time adaptation methods have attracted increasing attention for their ability to handle distribution shifts at test time. Although prior approaches have made progress, they typically either demand substantial computational resources or are constrained by the limitations of the original feature space, which makes them less effective for test-time adaptation. To address these challenges, we propose a training-free feature space rotation with basis transformation for test-time adaptation. By leveraging the inherent distinctions among classes, we reconstruct the original feature space and map it to a new representation, thereby sharpening class differences and providing more effective guidance for the model during testing. Additionally, to better capture relevant information from the various classes, we maintain a dynamic queue that stores representative samples. Experimental results across multiple benchmarks demonstrate that our method outperforms state-of-the-art techniques in both accuracy and efficiency.
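The idea summarized above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the queue is already available as an array of features with pseudo-labels, that the basis comes from an eigendecomposition of the covariance of class prototypes, and that the prototype logits are simply added to the CLIP logits with a weight `alpha`. All function names, the truncation to `k` basis vectors, and `alpha` are hypothetical choices made for illustration.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along the given axis."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def class_prototypes(feats, labels, num_classes):
    """Mean feature per pseudo-labeled class; classes with no samples stay zero."""
    protos = np.zeros((num_classes, feats.shape[1]))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            protos[c] = feats[mask].mean(axis=0)
    return protos

def build_basis(protos, k):
    """Orthonormal basis (D, k) from the covariance of the class prototypes (PCA)."""
    centered = protos - protos.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / max(len(protos) - 1, 1)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]    # keep the k largest (assumed truncation)
    return eigvecs[:, order]

def predict(x, clip_logits, queue_feats, queue_labels, num_classes, k, alpha=1.0):
    """Combine CLIP logits with prototype logits computed in the new basis."""
    protos = class_prototypes(queue_feats, queue_labels, num_classes)
    basis = build_basis(protos, k)
    x_new = l2_normalize(x @ basis)          # test feature in the new basis
    p_new = l2_normalize(protos @ basis)     # prototypes in the new basis
    proto_logits = x_new @ p_new.T           # cosine similarity per class
    return clip_logits + alpha * proto_logits  # alpha is an assumed mixing weight
```

No gradient step appears anywhere in this sketch: the only per-sample work is a prototype update, a small eigendecomposition, and two matrix products, which is consistent with the training-free setting described in the abstract. Note that a full-rank orthogonal rotation alone would leave cosine similarities unchanged, so the sketch keeps only the leading `k` directions; whether the original method truncates in this way is an assumption.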
