R.I.P.: A Simple Black-box Attack on Continual Test-time Adaptation

Trung-Hieu Hoang, Duc Minh Vo, Minh N. Do

TL;DR

This work reveals a practical vulnerability in continual TTA by introducing Reusing of Incorrect Predictions (RIP), a black-box attack that degrades or collapses models without access to model parameters and without modifying inputs. Grounded in a theoretical Augmented Gaussian Mixture Model Classifier and Incorrect Prediction Sampling (IPS), the study shows how combining data augmentation with selective sampling can steadily shift decision boundaries toward incorrect predictions. Empirical evaluation across CIFAR-10-C, CIFAR-100-C, and ImageNet-C demonstrates broad susceptibility among recent continual TTA methods, with CoTTA and EATA showing relatively better resilience. The findings highlight the need for defenses against simple, realistic black-box risks in continual TTA and point to factors that influence robustness, such as augmentation practices, pseudo-label strategies, and update schemes.

Abstract

Test-time adaptation (TTA) has emerged as a promising solution to continual domain shift in machine learning: it allows model parameters to change at test time via self-supervised learning on unlabeled testing data. At the same time, it opens the door to unforeseen vulnerabilities that degrade performance over time. Through a simple theoretical continual TTA model, we identify a risk in the sampling process of testing data that could easily degrade the performance of a continual TTA model. We name this risk Reusing of Incorrect Predictions (RIP); it can be exploited by TTA attackers or arise from unintended queries by general TTA users. The risk posed by RIP is also highly realistic, as it requires neither prior knowledge of model parameters nor modification of testing samples. This simple requirement makes RIP the first black-box TTA attack algorithm, in contrast to existing white-box attempts. We extensively benchmark the most recent continual TTA approaches against the RIP attack, provide insights into its success, and lay out potential roadmaps that could enhance the resilience of future continual TTA systems.
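
The query pattern described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's algorithm: the names `select_incorrect`, `rip_query_loop`, and `toy_query` are hypothetical, the attacker is assumed to know the labels of its own samples, and the victim is a fixed threshold classifier that stands in for a real continual TTA model (it does not adapt). The sketch only shows the two properties named in the abstract: the attacker sees predictions alone, and it resends unmodified samples.

```python
from typing import Callable, List, Sequence, Tuple

# (input, label known to the attacker); a hypothetical type for illustration.
Sample = Tuple[float, int]


def select_incorrect(samples: Sequence[Sample], preds: Sequence[int]) -> List[Sample]:
    """Return the samples whose returned prediction differs from their label."""
    return [s for s, p in zip(samples, preds) if p != s[1]]


def rip_query_loop(
    query: Callable[[List[float]], List[int]],
    fresh_batches: Sequence[Sequence[Sample]],
) -> List[int]:
    """Send each fresh batch together with the previous round's mispredictions.

    `query` is the only access to the victim: inputs in, predicted labels out.
    Inputs are never modified. Returns the number of reused samples per round.
    """
    pool: List[Sample] = []
    reused_per_round: List[int] = []
    for batch in fresh_batches:
        sent = pool + list(batch)                # reuse earlier mispredictions
        preds = query([x for x, _ in sent])      # black-box call
        reused_per_round.append(len(pool))
        pool = select_incorrect(sent, preds)     # keep only what was wrong
    return reused_per_round


def toy_query(xs: List[float]) -> List[int]:
    """Stand-in victim (an assumption): a fixed threshold, with no adaptation."""
    return [int(x > 0.0) for x in xs]


batches = [[(-1.0, 0), (0.5, 0), (2.0, 1)], [(-0.2, 1), (1.0, 1)]]
reused = rip_query_loop(toy_query, batches)
print(reused)  # [0, 1]
```

Against an actual continual TTA method, `query` would also trigger an adaptation step on every call, which is where repeated exposure to mispredicted samples would take effect.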

Paper Structure

This paper contains 34 sections, 9 equations, 10 figures, 3 tables, 1 algorithm.

Figures (10)

  • Figure 1: An illustration of our Reusing of Incorrect Predictions (RIP) attack against a continual test-time adaptation (TTA) method. Here the attacker intentionally reuses, in subsequent rounds, samples that were incorrectly predicted, making the model more confident in these erroneous predictions. RIP is the first black-box attack that can realistically collapse a TTA model.
  • Figure 2: The key operational steps of a continual TTA method. We extend the model in hoang2024petta (red highlighted) with the augmentation operator Aug($\cdot$) and investigate its effect in a vulnerable scenario where testing samples are not i.i.d. sampled from a distribution $P_t$, but instead selectively sampled to collapse a model.
  • Figure 3: The similarity in the effect of random data augmentation on images of CIFAR-10-C hendrycks2019robustness and synthetic data used in Gaussian Mixture Model Classifier (GMMC). (left) 2D t-SNE maaten2008_tsne projection of the deep feature vectors of real images. (middle) Samples drawn from two Gaussian distributions and the best theoretical separation boundary on GMMC (solid line). Regions with incorrect predictions are highlighted. (right) An Additive White Gaussian Noise (AWGN) generates augmented samples for GMMC and the decision boundary separates original and augmented samples (dashed line). The shifting of decision boundaries when training with augmented samples is similarly observed on both real images and GMMC simulated data, allowing our analysis to focus on GMMC+AWGN for their simplicity.
  • Figure 4: A step-by-step illustration of the shifting-boundary effect caused by incorrect prediction sampling (IPS), elaborating on Fig. 3 (left). Arrows serve as pointers. (a) Mispredicted samples from a victim class (orange) are sampled for TTA. (b) Randomly augmented variations (denoted by $\times$) are generated, expanding the area (dashed circles) around the original samples. (c) The updated decision boundary expands to cover these samples - highlighted with the blue halo effect, penetrating the victim class. (d) The process repeats, reducing the chance of predicting the victim class.
  • Figure 5: Simulation results on the Gaussian Mixture Model Classifier (GMMC) representing the effect of Incorrect Prediction Sampling (IPS), augmentation operator (Aug), and update rate ($\alpha$). (✓) denotes that the operator is enabled and (✗) that it is disabled. The distribution before (top) and after adaptation (bottom) is visualized. The (middle) plot shows the shift in model predictions (on the same set of samples) after every 20 steps. (a)-(c) GMMC collapses when IPS and Aug are simultaneously enabled. (d) Increasing $\alpha$ partially mitigates the collapse.
  • ...and 5 more figures
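
For readers unfamiliar with the GMMC+AWGN setting in the Figure 3 caption, a textbook special case may help. The symbols $\mu_0$, $\mu_1$, $\sigma$, and $\sigma_a$ are assumptions introduced here, as are the equal class priors and the shared isotropic covariance; the captions do not state the exact configuration used in the paper. With class-conditional densities $x \mid y=k \sim \mathcal{N}(\mu_k, \sigma^2 I)$, the best separation boundary is the hyperplane through the midpoint of the two means, and AWGN augmentation adds independent noise that inflates the variance of each sample's neighborhood:

$$
\hat{y}(x) = \mathbb{1}\!\left[(\mu_1 - \mu_0)^\top\!\left(x - \tfrac{\mu_0 + \mu_1}{2}\right) > 0\right],
\qquad
\tilde{x} = x + \varepsilon,\ \ \varepsilon \sim \mathcal{N}(0, \sigma_a^2 I)
\ \Rightarrow\
\tilde{x} \mid y=k \sim \mathcal{N}\!\left(\mu_k, (\sigma^2 + \sigma_a^2) I\right).
$$

The second relation holds only when $x$ is drawn i.i.d. from its class; the captions of Figures 4 and 5 concern the case where $x$ is instead restricted to mispredicted samples.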

Theorems & Definitions (1)

  • Definition 1: Model Collapse