R.I.P.: A Simple Black-box Attack on Continual Test-time Adaptation
Trung-Hieu Hoang, Duc Minh Vo, Minh N. Do
TL;DR
This work reveals a practical vulnerability in continual test-time adaptation (TTA) by introducing Reusing of Incorrect Prediction (RIP), a black-box attack that degrades or collapses adapted models without access to model parameters and without modifying test inputs. Grounded in a theoretical Augmented Gaussian Mixture Model Classifier and IPS, the study shows how combining data augmentation with selective sampling can steadily shift decision boundaries toward incorrect predictions. Empirical evaluation on CIFAR-10-C, CIFAR-100-C, and ImageNet-C shows that recent continual TTA methods are broadly susceptible, with CoTTA and EATA relatively more resilient. The findings highlight the need for defenses against simple, realistic black-box risks in continual TTA and identify augmentation practices, pseudo-label strategies, and update schemes as factors that influence robustness.
Abstract
Test-time adaptation (TTA) has emerged as a promising solution to continual domain shift in machine learning: it allows model parameters to change at test time via self-supervised learning on unlabeled testing data. At the same time, it opens the door to unforeseen vulnerabilities that degrade performance over time. Through a simple theoretical continual TTA model, we identify a risk in the sampling process of testing data that could easily degrade the performance of a continual TTA model. We name this risk Reusing of Incorrect Prediction (RIP); it can be exploited deliberately by TTA attackers or arise from unintended queries by general TTA users. The risk posed by RIP is also highly realistic, as it requires neither prior knowledge of model parameters nor modification of testing samples. This minimal requirement makes RIP the first black-box TTA attack algorithm, in contrast to existing white-box attempts. We extensively benchmark the most recent continual TTA approaches against the RIP attack, provide insights into its success, and lay out potential roadmaps for enhancing the resilience of future continual TTA systems.
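
To make the sampling risk concrete, the following is a minimal toy sketch and not the paper's algorithm or its theoretical model. It assumes a one-dimensional nearest-mean classifier that self-trains on its own pseudo-labels, and it assumes the attacker knows the true labels of the samples it submits so that it can tell which predictions were incorrect. All names (`predict`, `query`, `simulate`, `class0_accuracy`) and all numeric settings are hypothetical choices made for illustration.

```python
import random

def predict(x, means):
    """Nearest-mean rule: a 1-D stand-in for a two-component mixture classifier."""
    return 0 if abs(x - means[0]) <= abs(x - means[1]) else 1

def query(x, means, rate=0.02):
    """Black-box interface: return a label, then self-train on that pseudo-label."""
    label = predict(x, means)
    means[label] += rate * (x - means[label])
    return label

def simulate(attack, rounds=5, per_class=500, repeats=3, seed=0):
    rng = random.Random(seed)
    means = [-1.0, 1.0]  # toy source model; true class centers are -1 and +1
    pool = [(rng.gauss(-1.0, 1.0), 0) for _ in range(per_class)]
    pool += [(rng.gauss(1.0, 1.0), 1) for _ in range(per_class)]
    for _ in range(rounds):
        rng.shuffle(pool)
        wrong = []
        # Ordinary usage: each sample is submitted once and triggers an update.
        for x, y in pool:
            if query(x, means) != y and y == 0:
                wrong.append(x)  # class-0 sample the model just mislabeled
        if attack:
            # Reuse only the mislabeled samples, unmodified, several times.
            for _ in range(repeats):
                for x in wrong:
                    query(x, means)
    return means

def class0_accuracy(means, n=2000, seed=1):
    """Accuracy on fresh class-0 samples, measured without further adaptation."""
    rng = random.Random(seed)
    samples = [rng.gauss(-1.0, 1.0) for _ in range(n)]
    return sum(predict(x, means) == 0 for x in samples) / n

benign = simulate(attack=False)
attacked = simulate(attack=True)
print("decision boundary:", sum(benign) / 2, "->", sum(attacked) / 2)
print("class-0 accuracy:", class0_accuracy(benign), "->", class0_accuracy(attacked))
```

In this toy setting the attacker only reads predicted labels and resubmits unmodified samples, yet each reused sample pulls the wrong class mean toward the mislabeled region, so the decision boundary drifts and class-0 accuracy drops relative to the benign run.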
