On the Adversarial Risk of Test Time Adaptation: An Investigation into Realistic Test-Time Data Poisoning
Yongyi Su, Yushu Li, Nanqing Liu, Kui Jia, Xulei Yang, Chuan-Sheng Foo, Xun Xu
TL;DR
The paper studies adversarial risk in test-time adaptation (TTA) and proposes Realistic Test-Time Data Poisoning (RTTDP), a grey-box, online data-poisoning protocol in which the attacker has no access to benign data or to the online model weights. To craft poisoned data that generalizes to benign samples under RTTDP, it introduces a surrogate-model distillation mechanism and an in-distribution attack objective with feature distribution regularization. Two TTA-specific poisoning objectives, Notch High Entropy (NHE) and Balanced Low Entropy (BLE), are proposed to exploit self-training dynamics. Extensive experiments on CIFAR-10-C, CIFAR-100-C, and ImageNet-C show both the effectiveness and the limitations of RTTDP against state-of-the-art TTA methods, and the paper analyzes defense strategies such as entropy thresholding and EMA. The findings indicate that prior claims of catastrophic vulnerability may be overstated under realistic constraints, and they yield concrete guidelines for designing adversarially robust TTA methods and defenses, with practical relevance for deployment in cloud-based services.
Abstract
Test-time adaptation (TTA) updates the model weights during the inference stage using test data to enhance generalization. However, this practice exposes TTA to adversarial risks. Existing studies have shown that when TTA is updated with crafted adversarial test samples, also known as test-time poisoned data, the performance on benign samples can deteriorate. Nonetheless, the perceived adversarial risk may be overstated if the poisoned data is generated under overly strong assumptions. In this work, we first review realistic assumptions for test-time data poisoning, including white-box versus grey-box attacks, access to benign data, attack order, and more. We then propose an effective and realistic attack method that produces poisoned samples without access to benign samples, and derive an effective in-distribution attack objective. We also design two TTA-aware attack objectives. Our benchmarks of existing attack methods reveal that TTA methods are more robust than previously believed. In addition, we analyze effective defense strategies to help develop adversarially robust TTA methods. The source code is available at https://github.com/Gorilla-Lab-SCUT/RTTDP.
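
To make the two defenses named in the TL;DR more concrete, the following is a minimal, generic sketch of entropy thresholding and an exponential moving average (EMA) of weights. It is not taken from the paper or its repository: the function names, the threshold of 0.4 times the maximum entropy, and the momentum value are all illustrative assumptions, and weights are represented as a flat list of floats purely for simplicity.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def filter_low_entropy(batch_logits, threshold):
    """Return indices of samples whose prediction entropy is below `threshold`.

    Assumption: only these confident samples would be used for the
    self-training update; high-entropy samples are skipped.
    """
    return [
        i
        for i, logits in enumerate(batch_logits)
        if entropy(softmax(logits)) < threshold
    ]


def ema_update(ema_weights, new_weights, momentum=0.99):
    """Blend freshly adapted weights into a slowly moving average.

    A momentum close to 1 limits how far any single (possibly poisoned)
    batch can move the averaged weights.
    """
    return [
        momentum * e + (1.0 - momentum) * w
        for e, w in zip(ema_weights, new_weights)
    ]


if __name__ == "__main__":
    num_classes = 3
    # Hypothetical threshold: a fraction of the maximum entropy log(C).
    threshold = 0.4 * math.log(num_classes)
    batch = [[10.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # confident vs. uniform
    print(filter_low_entropy(batch, threshold))
    print(ema_update([1.0, 2.0], [0.0, 4.0], momentum=0.9))
```

In this sketch the confident first sample passes the filter while the uniform second sample is discarded, and the EMA step moves the averaged weights only a small fraction of the way toward the newly adapted ones.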
