Test-Time Adaptation by Causal Trimming
Yingnan Liu, Rui Qiao, Mong Li Lee, Wynne Hsu
TL;DR
TACT (Test-time Adaptation by Causal Trimming) addresses distribution shift by explicitly suppressing non-causal features in test-time representations. It applies augmentations that preserve causal content, runs PCA on the augmented representations to reveal non-causal directions, and then trims both the sample representation $z$ and the class prototypes $\{q_i\}$ by removing their projections onto the top-$m$ non-causal directions, with moving-average prototypes for stability. Theoretical results give conditions under which trimming corrects misclassifications and preserves causal predictions, and experiments on five real-world OOD benchmarks show improvements over both backpropagation-free and backpropagation-based TTA baselines. By reducing reliance on unstable non-causal signals, TACT makes predictions more reliable under distribution shift and can synergize with training-time augmentations, offering a practical, gradient-free adaptation strategy. Future work includes identifying non-causal features without prior knowledge of suitable augmentations and exploring alternatives to PCA for isolating non-causal directions.
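The trimming step can be written as an orthogonal projection. The symbols $u_j$, $U_m$, $\tilde{z}$ and $\tilde{q}_i$ are notation assumed here for illustration and may differ from the paper's. Let $u_1, \dots, u_m$ be the orthonormal top-$m$ principal directions and $U_m = [u_1, \dots, u_m]$. Then:

$$\tilde{z} = z - \sum_{j=1}^{m} (u_j^\top z)\, u_j = (I - U_m U_m^\top)\, z, \qquad \tilde{q}_i = (I - U_m U_m^\top)\, q_i .$$

Because $I - U_m U_m^\top$ is a projection, trimming an already trimmed vector leaves it unchanged, and $u_j^\top \tilde{z} = 0$ for every $j \le m$.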
Abstract
Test-time adaptation aims to improve model robustness under distribution shifts by adapting models with access to unlabeled target samples. A primary cause of performance degradation under such shifts is the model's reliance on features that lack a direct causal relationship with the prediction target. We introduce Test-time Adaptation by Causal Trimming (TACT), a method that identifies and removes non-causal components from representations for test distributions. TACT applies data augmentations that preserve causal features while varying non-causal ones. By analyzing the changes in the representations using Principal Component Analysis, TACT identifies the highest-variance directions associated with non-causal features. It trims the representations by removing their projections onto the identified directions, and uses the trimmed representations for prediction. During adaptation, TACT continuously tracks and refines these directions to obtain a better estimate of non-causal features. We theoretically analyze the effectiveness of this approach and empirically validate TACT on real-world out-of-distribution benchmarks. TACT consistently outperforms state-of-the-art methods by a significant margin.
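The pipeline described above (PCA on augmented representations, then trimming, then prediction) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the centering of the augmented representations around their mean, the cosine-similarity prototype classifier, and the toy 3-dimensional data are all assumptions made for illustration. The continual tracking of directions and the moving-average prototype updates are omitted.

```python
import numpy as np

def non_causal_directions(aug_reps, m):
    """Top-m principal directions of augmented representations.

    aug_reps: array of shape (n_aug, d), one row per augmented view.
    Assumption: variation across views is measured around their mean.
    """
    centered = aug_reps - aug_reps.mean(axis=0, keepdims=True)
    # Rows of vt are orthonormal directions sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:m]  # shape (m, d)

def trim(x, directions):
    """Remove the projection of x (shape (..., d)) onto orthonormal directions."""
    return x - (x @ directions.T) @ directions

def predict(z, prototypes, directions):
    """Classify z by cosine similarity to class prototypes after trimming both."""
    z_t = trim(z, directions)
    q_t = trim(prototypes, directions)
    z_t = z_t / (np.linalg.norm(z_t) + 1e-12)
    q_t = q_t / (np.linalg.norm(q_t, axis=1, keepdims=True) + 1e-12)
    return int(np.argmax(q_t @ z_t))

# Toy setup (hypothetical): axis 0 is causal, axis 1 is non-causal.
prototypes = np.array([[1.0, 2.0, 0.0],     # class 0
                       [-1.0, -2.0, 0.0]])  # class 1
z = np.array([1.0, -4.0, 0.0])  # causal part says class 0, non-causal part says class 1

# Augmentations that change only the non-causal coordinate.
offsets = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
aug_reps = z + offsets[:, None] * np.array([0.0, 1.0, 0.0])

directions = non_causal_directions(aug_reps, m=1)
no_trim = np.zeros((0, 3))  # empty direction set: trimming becomes a no-op

print("untrimmed prediction:", predict(z, prototypes, no_trim))
print("trimmed prediction:  ", predict(z, prototypes, directions))
```

In this toy setup the untrimmed classifier follows the non-causal coordinate and picks class 1, while the trimmed classifier relies only on the causal coordinate and picks class 0.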
