Model-Free Adversarial Purification via Coarse-To-Fine Tensor Network Representation
Guang Lin, Duc Thien Nguyen, Zerui Tao, Konstantinos Slavakis, Toshihisa Tanaka, Qibin Zhao
TL;DR
The paper addresses the vulnerability of deep neural networks to adversarial examples by proposing Tensor Network Purification (TNP), a model-free adversarial purification method that relies on neither pre-trained generators nor dataset priors. TNP combines a coarse-to-fine tensor network (TN) representation with a novel adversarial optimization objective to suppress perturbations without reconstructing adversarial artifacts; downsampling helps by making the perturbations closer to Gaussian. Extensive experiments on CIFAR-10, CIFAR-100, and ImageNet show strong generalization across diverse threat models and tasks, including denoising, with substantial improvements over adversarial training (AT) and adversarial purification (AP) baselines. The approach offers a general, training-free purification technique with practical impact, though it incurs higher inference-time cost due to iterative TN optimization.
Abstract
Deep neural networks are known to be vulnerable to well-designed adversarial attacks. Although numerous defense strategies have been proposed, many are tailored to specific attacks or tasks and often fail to generalize across diverse scenarios. In this paper, we propose Tensor Network Purification (TNP), a novel model-free adversarial purification method based on a specially designed tensor network decomposition algorithm. TNP depends on neither a pre-trained generative model nor a specific dataset, resulting in strong robustness across diverse adversarial scenarios. The key challenge lies in relaxing the Gaussian-noise assumption of classical decompositions to accommodate the unknown distribution of adversarial perturbations. Unlike classical decompositions, which seek a low-rank representation of the input, TNP aims to reconstruct the unobserved clean example from an adversarial example. Specifically, TNP leverages progressive downsampling and introduces a novel adversarial optimization objective that minimizes reconstruction error without inadvertently restoring adversarial perturbations. Extensive experiments conducted on CIFAR-10, CIFAR-100, and ImageNet demonstrate that our method generalizes effectively across various norm threats, attack types, and tasks, providing a versatile and promising adversarial purification technique.
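The intuition that downsampling can bring a perturbation closer to Gaussian can be illustrated with a minimal NumPy sketch. It is not the TNP algorithm: it assumes that downsampling is plain average pooling and that the perturbation is an i.i.d. sign pattern scaled by a hypothetical budget `eps = 8/255`, whereas real adversarial perturbations are spatially structured. Under these assumptions the central limit theorem applies, and the excess kurtosis (0 for a Gaussian) moves toward 0 as the pooling factor grows. The helper names `avg_pool` and `excess_kurtosis` are illustrative only.

```python
import numpy as np


def avg_pool(x, factor):
    """Downsample a 2-D array by averaging non-overlapping factor x factor blocks."""
    h, w = x.shape
    h, w = h - h % factor, w - w % factor  # crop so both sides divide evenly
    blocks = x[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))


def excess_kurtosis(x):
    """Sample excess kurtosis; approximately 0 for Gaussian data."""
    centered = x.ravel() - x.mean()
    return float((centered**4).mean() / (centered**2).mean() ** 2 - 3.0)


rng = np.random.default_rng(0)
eps = 8 / 255  # assumed L_inf budget, chosen only for illustration

# Assumption: a toy perturbation made of i.i.d. random signs scaled by eps.
delta = eps * rng.choice([-1.0, 1.0], size=(256, 256))

# Excess kurtosis of the perturbation at each (hypothetical) pooling factor.
kurt = {}
for factor in (1, 2, 4, 8):
    pooled = avg_pool(delta, factor)
    kurt[factor] = excess_kurtosis(pooled)
    print(f"factor={factor} shape={pooled.shape} excess_kurtosis={kurt[factor]:.3f}")
```

For an average of n independent signs the theoretical excess kurtosis is -2/n, so in this toy setting each doubling of the pooling factor shrinks the deviation from Gaussianity by roughly a factor of four. How closely this carries over to actual attacks depends on how strongly the perturbation is correlated across neighboring pixels.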
