Model-Free Adversarial Purification via Coarse-To-Fine Tensor Network Representation

Guang Lin, Duc Thien Nguyen, Zerui Tao, Konstantinos Slavakis, Toshihisa Tanaka, Qibin Zhao

TL;DR

The paper tackles the vulnerability of deep neural networks to adversarial examples by proposing Tensor Network Purification (TNP), a model-free adversarial purification method that does not rely on pre-trained generators or dataset priors. TNP combines a coarse-to-fine tensor-network representation with a novel adversarial optimization objective to suppress perturbations without reconstructing adversarial artifacts, aided by downsampling that Gaussianizes the perturbations. Extensive experiments on CIFAR-10, CIFAR-100, and ImageNet show strong generalization across diverse threat models and tasks, including denoising, with substantial improvements over adversarial training (AT) and adversarial purification (AP) baselines. The approach offers a general, training-free purification technique with practical impact, though it incurs higher inference-time cost due to iterative tensor-network optimization.

Abstract

Deep neural networks are known to be vulnerable to well-designed adversarial attacks. Although numerous defense strategies have been proposed, many are tailored to specific attacks or tasks and often fail to generalize across diverse scenarios. In this paper, we propose Tensor Network Purification (TNP), a novel model-free adversarial purification method based on a specially designed tensor network decomposition algorithm. TNP depends neither on a pre-trained generative model nor on a specific dataset, resulting in strong robustness across diverse adversarial scenarios. The key challenge lies in relaxing the Gaussian-noise assumptions of classical decompositions and accommodating the unknown distribution of adversarial perturbations. Unlike the low-rank representation of classical decompositions, TNP aims to reconstruct the unobserved clean example from an adversarial example. Specifically, TNP leverages progressive downsampling and introduces a novel adversarial optimization objective to minimize reconstruction error without inadvertently restoring adversarial perturbations. Extensive experiments conducted on CIFAR-10, CIFAR-100, and ImageNet demonstrate that our method generalizes effectively across various norm threats, attack types, and tasks, providing a versatile and promising adversarial purification technique.
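
The claim that downsampling pushes perturbations toward a Gaussian distribution can be illustrated with a small numerical sketch. This is not the authors' code. It assumes, purely for illustration, that the perturbation is an i.i.d. random sign pattern at budget 8/255 (real adversarial perturbations are spatially correlated) and that downsampling is 2x2 average pooling. The helper names `avg_pool2` and `excess_kurtosis` are hypothetical. Under these assumptions, the excess kurtosis (0 for a Gaussian) moves toward 0 at each coarser level, and the perturbation amplitude shrinks.

```python
import numpy as np

def avg_pool2(x):
    """Downsample a 2-D array by 2x2 average pooling (both sides must be even)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def excess_kurtosis(x):
    """Excess kurtosis: 0 for a Gaussian, -2 for a symmetric two-point distribution."""
    z = (x - x.mean()) / x.std()
    return float((z ** 4).mean() - 3.0)

rng = np.random.default_rng(0)
eps = 8 / 255

# Assumption: an i.i.d. sign perturbation at the l_inf budget eps.
# This is a stand-in, not a perturbation produced by a real attack.
delta = eps * rng.choice([-1.0, 1.0], size=(256, 256))

# Build the coarse levels by repeated 2x2 average pooling.
levels = [delta]
for _ in range(3):
    levels.append(avg_pool2(levels[-1]))

for k, d in enumerate(levels):
    print(f"level {k}: shape={d.shape}, std={d.std():.5f}, "
          f"excess kurtosis={excess_kurtosis(d):+.3f}")
```

Each pooled value is the mean of four independent terms, so the trend follows from the central limit theorem. How closely a real attack's perturbation follows it is an empirical question, which the paper addresses in Figure 1.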

Paper Structure

This paper contains 27 sections, 5 equations, 9 figures, 10 tables, 2 algorithms.

Figures (9)

  • Figure 1: Comparison of adversarial perturbations in downsampled images. (a) Changes in the distribution of adversarial perturbations during the downsampling process. More results are shown in the appendix. (b) Histogram of the KL divergence of adversarial perturbations.
  • Figure 2: Illustration of tensor network purification.
  • Figure 3: Comparison of robust accuracy against PGD+EOT and AutoAttack.
  • Figure 4: Visual comparison on the denoising task. Top: the original image and the corresponding reconstructed image for (a) the clean example and (b) the adversarial example, using PuTT and our proposed method. Bottom: error maps (a) between the reconstructed clean example and the original clean example, and (b) between the reconstructed adversarial example and the reconstructed clean example.
  • Figure 5: Standard accuracy (SA) and robust accuracy (RA) against AutoAttack $l_\infty$ ($\epsilon=8/255$) threat on CIFAR-10 and CIFAR-100 with WideResNet-28-10 classifier. The pre-trained generative model used in AP is trained on CIFAR-10.
  • ...and 4 more figures