Improving Adversarial Robustness for 3D Point Cloud Recognition at Test-Time through Purified Self-Training
Jinpeng Lin, Xulei Yang, Tianrui Li, Xun Xu
TL;DR
The paper tackles the fragility of 3D point cloud classifiers to adversarial attacks at inference by integrating purification with test-time self-training. It introduces Purified Self-Training (PST), which purifies adversarial inputs and then adapts the model on-the-fly using high-confidence pseudo labels, an adaptive global threshold $\tau_g$, and feature distribution alignment via KL divergence. This approach is designed to cope with continually changing attack types in streaming test data and is shown to be complementary to existing purification techniques, yielding state-of-the-art robustness under both white-box and adaptive attacks across multiple backbones and datasets. The authors also propose a realistic streaming evaluation protocol (Single and Mixed Attacks) to reflect real-world adversarial dynamics and demonstrate substantial performance gains in these scenarios.
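As a rough illustration of how confidence-based pseudo-label selection with a global threshold $\tau_g$ might work, the sketch below keeps only predictions whose maximum softmax probability reaches the threshold, then updates the threshold from the batch. The summary does not state the paper's update rule, so the exponential moving average over mean batch confidence, the `momentum` parameter, and the function names are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_pseudo_labels(batch_logits, tau_g, momentum=0.9):
    """Pick confident pseudo labels and update the global threshold.

    batch_logits: list of per-sample logit lists (from purified inputs).
    tau_g: current global confidence threshold.
    momentum: hypothetical EMA factor; the update rule is an assumption,
              not the rule used in the paper.
    Returns (selected, new_tau_g), where selected is a list of
    (sample_index, pseudo_label) pairs.
    """
    confidences, labels = [], []
    for logits in batch_logits:
        probs = softmax(logits)
        conf = max(probs)
        confidences.append(conf)
        labels.append(probs.index(conf))

    # Keep only samples whose confidence reaches the current threshold.
    selected = [(i, labels[i]) for i, c in enumerate(confidences) if c >= tau_g]

    # Assumed update: move tau_g toward the mean confidence of this batch.
    mean_conf = sum(confidences) / len(confidences)
    new_tau_g = momentum * tau_g + (1.0 - momentum) * mean_conf
    return selected, new_tau_g


batch = [[4.0, 0.0, 0.0], [0.1, 0.0, 0.0]]
selected, tau_g = select_pseudo_labels(batch, tau_g=0.8)
print(selected, round(tau_g, 3))
```

In this toy batch only the first sample is confident enough to receive a pseudo label; the second is skipped for the self-training update but still contributes to the threshold estimate.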
Abstract
Recognizing 3D point clouds plays a pivotal role in many real-world applications. However, deployed 3D point cloud deep learning models are vulnerable to adversarial attacks. Despite many efforts to develop robust models through adversarial training, such models may become less effective against emerging attacks. This limitation motivates the development of adversarial purification, which employs a generative model to mitigate the impact of adversarial attacks. In this work, we highlight the remaining challenges from two perspectives. First, purification-based methods require retraining the classifier on purified samples, which introduces additional computation overhead. Moreover, in a more realistic scenario, testing samples arrive in a streaming fashion and adversarial samples are not isolated from clean samples. These challenges motivate us to explore dynamically updating the model upon observing testing samples. We propose a test-time purified self-training strategy to achieve this objective. Adaptive thresholding and feature distribution alignment are introduced to improve the robustness of self-training. Extensive results on different adversarial attacks suggest the proposed method is complementary to purification-based methods in handling continually changing adversarial attacks on the testing data stream.
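Feature distribution alignment via KL divergence can be sketched as follows, assuming that source and test features are each summarized by a diagonal Gaussian (per-dimension mean and variance). The summary does not specify the distribution family or the direction of the divergence, so the Gaussian model, the choice of KL(test || source), the `eps` variance floor, and the function names are illustrative assumptions.

```python
import math


def feature_stats(features, eps=1e-6):
    """Per-dimension mean and (biased) variance of a list of feature vectors.

    eps is a small floor added to the variance to avoid division by zero.
    """
    n = len(features)
    dim = len(features[0])
    mean = [sum(f[j] for f in features) / n for j in range(dim)]
    var = [sum((f[j] - mean[j]) ** 2 for f in features) / n + eps
           for j in range(dim)]
    return mean, var


def gaussian_kl(mu_p, var_p, mu_q, var_q):
    """KL(N(mu_p, diag(var_p)) || N(mu_q, diag(var_q))) in closed form."""
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q)
    )


# Hypothetical source statistics (e.g. stored from clean training features).
source_mu, source_var = [0.0, 0.0], [1.0, 1.0]

# Features of the current test batch, after purification.
test_features = [[0.5, -0.2], [1.5, 0.4], [1.0, 0.1]]
test_mu, test_var = feature_stats(test_features)

alignment_loss = gaussian_kl(test_mu, test_var, source_mu, source_var)
print(alignment_loss)
```

A loss of this form is zero when the test statistics match the source statistics and grows as the test features drift, which is why it can act as a regularizer alongside the pseudo-label loss.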
