Sparse-PGD: A Unified Framework for Sparse Adversarial Perturbations Generation

Xuyang Zhong; Chen Liu

TL;DR

This paper tackles sparse adversarial perturbations, both unstructured ($\ell_0$) and structured (group $\ell_0$), by introducing Sparse-PGD (sPGD), a white-box attack that decomposes perturbations as $\bm{\delta} = \mathbf{p} \odot \mathbf{m}$ and optimizes $\mathbf{p}$ with gradient steps while relaxing and projecting the binary mask $\mathbf{m}$ via a continuous surrogate $\widetilde{\mathbf{m}}$. It further extends to structured sparsity through a group mask $\mathbf{v}$, a transposed-convolution-based mapping to the pixel mask, and a surrogate group $\ell_0$ norm $\Omega'_0$ to enable optimization. To provide robust assessment, the authors propose Sparse-AutoAttack (sAA), an ensemble of white-box and black-box sparse attacks, and demonstrate state-of-the-art effectiveness against both unstructured and structured sparse perturbations across CIFAR-10, ImageNet-100, and beyond. They also integrate sPGD into adversarial training (sAT, sTRADES), achieving strong robustness against diverse sparse perturbations with efficient training. The results have practical impact for deploying robust vision systems under realistic sparse perturbation scenarios, including detection/segmentation tasks, and even adversarial watermarking, with public code available.
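The decomposition $\bm{\delta} = \mathbf{p} \odot \mathbf{m}$ can be illustrated with a minimal, dependency-free sketch. It assumes the projection of the continuous surrogate $\widetilde{\mathbf{m}}$ keeps the $\epsilon$ largest entries (a top-k rule); that rule, the function names `project_mask` and `sparse_perturbation`, and the toy values are illustrative assumptions rather than the authors' implementation.

```python
def project_mask(m_tilde, eps):
    """Return a binary mask with ones at the eps largest entries of m_tilde.

    Assumption: a top-k projection; the paper's exact projection may differ.
    """
    k = min(eps, len(m_tilde))
    # Indices ordered by descending score (stable sort: ties keep the lower index).
    top = sorted(range(len(m_tilde)), key=lambda i: -m_tilde[i])[:k]
    mask = [0] * len(m_tilde)
    for i in top:
        mask[i] = 1
    return mask


def sparse_perturbation(p, m_tilde, eps):
    """Compute delta = p * m element-wise, so at most eps entries are non-zero."""
    m = project_mask(m_tilde, eps)
    return [p_i * m_i for p_i, m_i in zip(p, m)]


# Toy example with hypothetical values: 5 "pixels", sparsity budget eps = 2.
p = [0.5, -0.2, 0.9, -0.7, 0.1]        # dense perturbation (magnitudes)
m_tilde = [0.1, 0.8, 0.3, 0.9, 0.2]    # continuous mask scores
delta = sparse_perturbation(p, m_tilde, eps=2)
print(delta)
```

In this toy case only the two positions with the highest mask scores keep their perturbation values, so the $\ell_0$ budget is met by construction regardless of how $\mathbf{p}$ is updated.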

Abstract

This work studies sparse adversarial perturbations, including both unstructured and structured ones. We propose a framework based on a white-box PGD-like attack method named Sparse-PGD to effectively and efficiently generate such perturbations. Furthermore, we combine Sparse-PGD with a black-box attack to comprehensively and more reliably evaluate the models' robustness against unstructured and structured sparse adversarial perturbations. Moreover, the efficiency of Sparse-PGD enables us to conduct adversarial training to build robust models against various sparse perturbations. Extensive experiments demonstrate that our proposed attack algorithm exhibits strong performance in different scenarios. More importantly, compared with other robust models, our adversarially trained model demonstrates state-of-the-art robustness against various sparse attacks. Code is available at https://github.com/CityU-MLO/sPGD.

Paper Structure (24 sections, 1 theorem, 10 equations, 9 figures, 19 tables, 2 algorithms)

Introduction
Related Works
Unstructured Sparse Adversarial Attack
Sparse-PGD (sPGD)
Sparse-AutoAttack (sAA)
Structured Sparse Adversarial Attack
Formulation of Structured Sparsity
sPGD for Structured Sparse Perturbations
sAA for Structured Sparse Perturbations
Adversarial Training
Experiments
Robustness against Unstructured Sparse Perturbations
Robustness against Structured Sparse Perturbations
Adversarial Watermarks
Comparison under Different Iteration Numbers and Different Sparsity Levels
...and 9 more sections

Key Result

Theorem 4.1

Given an input ${\bm{x}}\in\mathbb{R}^d$, a set of groups $\mathcal{G}=\{G_j\}_{j = 1}^N$, and any vector ${\bm{v}}$ satisfying the constraint in the definition of $\Omega'_0$, we have $\Omega_0({\bm{x}}) \leq \Omega_0'({\bm{x}},{\bm{v}}) \leq \|{\bm{v}}\|_0$.

Figures (9)

Figure 1: The ratio between the group $l_0$ norm and the approximated group $l_0$ norm. The x-axis is the approximated group $l_0$ norm, ranging from 1 to 25, and the y-axis is the ratio. We plot the ratios for $3\times 3$ patch perturbations on CIFAR-10 ($32 \times 32$) and $10 \times 10$ patch perturbations on ImageNet-100 ($224 \times 224$). The results are calculated on 100 samples. The solid line and shaded region denote the mean and standard deviation, respectively.
Figure 2: Pipeline of sPGD for structured sparse perturbations. The continuous group mask $\widetilde{{\bm{v}}}$ is first projected to obtain the binary group mask ${\bm{v}}$, ensuring $\|{\bm{v}}\|_0 \leq \epsilon$; this is similar to Eq. (\ref{eq:m_project}). Given the kernel ${\bm{k}} \in \{0, 1\}^{r \times r}$ with a customized pattern, we transform ${\bm{v}}$ into the pixel mask ${\bm{m}}$ using transposed convolution and clipping (see Eq. (\ref{eq:map})). Finally, we multiply ${\bm{m}}$ element-wise with the dense perturbation ${\bm{p}}$ to obtain the structured sparse perturbation ${\boldsymbol{\delta}}$. Note that the continuous group mask $\widetilde{{\bm{v}}}$ and the dense perturbation ${\bm{p}}$ are learnable.
Figure 3: Comparison between sPGD and RS attack under different iteration numbers and different sparsity levels. (a) Comparison across iteration numbers on CIFAR-10, $\epsilon=20$. ResNet18 (std), PORT ($l_\infty$ and $l_2$) [sehwag2021], $l_1$-APGD ($l_1$) [croce2021mind], and sTRADES ($l_0$) are evaluated. (b) Comparison across iteration numbers on ImageNet-100, $\epsilon=200$. ResNet34 (std), Fast-EG-$l_1$ ($l_1$) [jiang2023towards], and sAT ($l_0$) are evaluated. In (a) and (b), the total iteration number ranges from $20$ to $10000$. For better visualization, the x-axis is in log scale. (c) Comparison across sparsity levels on CIFAR-10. The evaluated models are the same as those in (a). $\epsilon$ ranges from $0$ to $50$. The total number of iterations is set to $10000$. Note that the results of sPGD and RS attack are shown as solid and dotted lines, respectively.
Figure 4: Distribution of the iteration numbers needed by sPGD (blue) and RS (orange) to successfully generate adversarial samples. The results are obtained from different models: (a) $l_\infty$ PORT [sehwag2021], (b) $l_1$-APGD [croce2021mind], and (c) our sPGD. The average iteration number (Avg.) and attack success rate (ASR), i.e., $1-$Robust Acc., are reported in the legend. For better visualization, we clip the minimum iteration number to $10$ and show the x- and y-axes in log scale.
Figure 5: Visualization of different sparse adversarial examples in ImageNet-100. The model is vanilla ResNet-34. The predictions before (left) and after (right) attack are (a) cock$\rightarrow$hen, (b) triceratops$\rightarrow$spotted salamander, (c) great white shark$\rightarrow$tench, (d) loggerhead$\rightarrow$leatherback turtle, (e) black and gold garden spider$\rightarrow$garden spider, (f) lorikeet$\rightarrow$macaw, (g) bee eater$\rightarrow$coucal, and (h) banded gecko$\rightarrow$alligator lizard. Note that the red squares in (g) and (h) are just for highlighting the perturbation position and not part of perturbations.
...and 4 more figures
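
The mapping described in the Figure 2 caption (binary group mask, then transposed convolution with a binary kernel, then clipping) can be sketched in plain Python. The sketch assumes stride 1, a square kernel, and a group mask of size $(h - r + 1) \times (w - r + 1)$ for an $h \times w$ image; these choices, the function name `group_to_pixel_mask`, and the toy inputs are illustrative assumptions and may not match the authors' code.

```python
def group_to_pixel_mask(v, kernel):
    """Map a binary group mask v to a binary pixel mask.

    Assumptions: stride-1 transposed convolution with a square binary kernel,
    followed by clipping overlapping contributions to 1.
    """
    gh, gw = len(v), len(v[0])
    r = len(kernel)
    h, w = gh + r - 1, gw + r - 1
    m = [[0] * w for _ in range(h)]
    for i in range(gh):
        for j in range(gw):
            if v[i][j]:
                # Each active group stamps the kernel pattern at offset (i, j).
                for a in range(r):
                    for b in range(r):
                        m[i + a][j + b] += kernel[a][b]
    # Clip so that overlapping groups still yield a {0, 1} mask.
    return [[min(x, 1) for x in row] for row in m]


# Toy example with hypothetical values: two overlapping 2x2 patches on a 4x4 image.
v = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 0]]
kernel = [[1, 1],
          [1, 1]]
m = group_to_pixel_mask(v, kernel)
for row in m:
    print(row)
```

Here the two selected groups overlap in one pixel, so the pixel mask covers 7 pixels rather than 8; the clipping step is what keeps the overlapping pixel at 1. Multiplying `m` element-wise with a dense perturbation then gives a perturbation confined to the selected patches.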

Theorems & Definitions (2)

Theorem 4.1
proof
