Low-Rank Adversarial PGD Attack

Dayana Savostianova, Emanuele Zangrando, Francesco Tudisco

TL;DR

This work observes that, in many cases, perturbations computed with PGD predominantly affect only a portion of the singular value spectrum of the original image, which suggests that they are approximately low-rank. It proposes a variation of PGD that efficiently computes a low-rank attack.

Abstract

Adversarial attacks on deep neural network models have seen rapid development and are extensively used to study the stability of these networks. Among various adversarial strategies, Projected Gradient Descent (PGD) is a widely adopted method in computer vision due to its effectiveness and quick implementation, making it suitable for adversarial training. In this work, we observe that in many cases, the perturbations computed using PGD predominantly affect only a portion of the singular value spectrum of the original image, suggesting that these perturbations are approximately low-rank. Motivated by this observation, we propose a variation of PGD that efficiently computes a low-rank attack. We extensively validate our method on a range of standard models as well as robust models that have undergone adversarial training. Our analysis indicates that the proposed low-rank PGD can be effectively used in adversarial training due to its straightforward and fast implementation coupled with competitive performance. Notably, we find that low-rank PGD often performs comparably to, and sometimes even outperforms, the traditional full-rank PGD attack, while using significantly less memory.
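To make the "approximately low-rank" observation and the memory claim concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a perturbation is handled as a 2-D matrix (one channel at a time). The helper names `truncate_rank`, `retained_energy`, and `factor_storage` are hypothetical, and the random matrix merely stands in for a real PGD perturbation.

```python
import numpy as np


def truncate_rank(delta, r):
    """Best rank-r approximation of a 2-D perturbation (Eckart-Young), via SVD."""
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    return (u[:, :r] * s[:r]) @ vt[:r, :]


def retained_energy(delta, r):
    """Fraction of the squared Frobenius norm carried by the top-r singular values."""
    s = np.linalg.svd(delta, compute_uv=False)
    return float(np.sum(s[:r] ** 2) / np.sum(s ** 2))


def factor_storage(n, m, r):
    """Numbers stored by factors U (n x r) and V (m x r) versus a dense n x m matrix."""
    return r * (n + m), n * m


# Hypothetical stand-in for a single-channel perturbation: rank 2 plus small noise.
rng = np.random.default_rng(0)
delta = rng.standard_normal((32, 2)) @ rng.standard_normal((2, 32))
delta += 1e-3 * rng.standard_normal((32, 32))

print(retained_energy(delta, 2))   # close to 1 when delta is nearly rank 2
print(factor_storage(32, 32, 2))   # factored vs dense storage counts
```

If a perturbation is stored as two rank-r factors rather than a dense matrix, the storage count grows with r(n + m) instead of nm, which is one plausible reading of the memory saving mentioned in the abstract.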

Paper Structure

This paper contains 20 sections, 10 equations, 3 figures, 4 tables, 1 algorithm.

Figures (3)

  • Figure 1: Relative magnitude of singular value change for PGD-attacked images, averaged over 5000 images from CIFAR-10 (green) and ImageNet (blue), for WideResNet-28-10, ResNet-50, and several robust (adversarially trained) models described in the experiment section.
  • Figure 2: Nuclear norms of PGD attacks (10 steps) averaged over 5000 images from CIFAR-10 (left) and ImageNet (right) datasets.
  • Figure 3: Perceivability examples, CIFAR-10 dataset, Wang23 model
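
The quantities named in the captions of Figures 1 and 2 can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions: the exact definition of "relative magnitude of singular value change" is not given on this page, so the formula |σ_i(x + δ) − σ_i(x)| / σ_i(x) used below is an assumption, as are the helper names and the random stand-in data. Multi-channel images would need per-channel treatment or reshaping, which is not shown.

```python
import numpy as np


def nuclear_norm(a):
    """Nuclear norm of a 2-D array: the sum of its singular values."""
    return float(np.linalg.norm(a, ord="nuc"))


def relative_sv_change(x, x_adv, eps=1e-12):
    """Assumed definition: |sigma_i(x_adv) - sigma_i(x)| / sigma_i(x) for each index i."""
    s = np.linalg.svd(x, compute_uv=False)
    s_adv = np.linalg.svd(x_adv, compute_uv=False)
    return np.abs(s_adv - s) / (s + eps)  # eps guards against zero singular values


# Hypothetical data: a random single-channel "image" and a rank-1 perturbation.
rng = np.random.default_rng(0)
x = rng.random((32, 32))
delta = 0.01 * np.outer(rng.standard_normal(32), rng.standard_normal(32))

change = relative_sv_change(x, x + delta)
print(change.shape)          # one value per singular value index
print(nuclear_norm(delta))   # for a rank-1 matrix this equals its Frobenius norm
```

Averaging `change` over many images would give a curve over singular value indices, in the spirit of Figure 1; averaging `nuclear_norm` over perturbations would give a summary in the spirit of Figure 2.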