Sparse-PGD: A Unified Framework for Sparse Adversarial Perturbations Generation
Xuyang Zhong, Chen Liu
TL;DR
This paper tackles sparse adversarial perturbations, both unstructured ($\ell_0$-bounded) and structured (group-$\ell_0$-bounded). It introduces Sparse-PGD (sPGD), a white-box attack that decomposes the perturbation as $\bm{\delta} = \mathbf{p} \odot \mathbf{m}$, where $\mathbf{p}$ carries the perturbation magnitudes and $\mathbf{m}$ is a binary sparsity mask. The magnitude $\mathbf{p}$ is updated with gradient steps, while the mask is optimized through a continuous surrogate $\widetilde{\mathbf{m}}$ that is projected back to a binary $\mathbf{m}$. For structured sparsity, the method introduces a group mask $\mathbf{v}$, maps it to the pixel mask with a transposed convolution, and optimizes a surrogate group $\ell_0$ norm $\Omega'_0$. For more reliable robustness evaluation, the authors propose Sparse-AutoAttack (sAA), an ensemble of white-box and black-box sparse attacks, and report state-of-the-art attack performance for both unstructured and structured sparse perturbations on datasets including CIFAR-10 and ImageNet-100. They also integrate sPGD into adversarial training (sAT, sTRADES), which yields strong robustness against diverse sparse perturbations at an efficient training cost. The results are relevant to deploying robust vision systems under realistic sparse perturbation scenarios, including detection/segmentation tasks and adversarial watermarking. Code is publicly available.
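To make the decomposition $\bm{\delta} = \mathbf{p} \odot \mathbf{m}$ concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes that the continuous surrogate $\widetilde{\mathbf{m}}$ is binarized by keeping its $k$ largest entries, that the mask is defined per pixel and shared across channels, and that images lie in $[0, 1]$. The function names `project_mask` and `sparse_perturbation` are hypothetical.

```python
import numpy as np


def project_mask(m_tilde, k):
    """Binarize a continuous mask by keeping its k largest entries (assumed rule)."""
    flat = m_tilde.ravel()
    mask = np.zeros(flat.shape, dtype=float)
    if k > 0:
        # Indices of the k largest surrogate values (order among them is irrelevant).
        top = np.argpartition(flat, -k)[-k:]
        mask[top] = 1.0
    return mask.reshape(m_tilde.shape)


def sparse_perturbation(x, p, m_tilde, k):
    """Compose delta = p * m and apply it to an image x of shape (H, W, C).

    p has the same shape as x; m_tilde has shape (H, W), so at most k pixels change.
    """
    m = project_mask(m_tilde, k)
    delta = p * m[..., None]              # broadcast the pixel mask over channels
    x_adv = np.clip(x + delta, 0.0, 1.0)  # keep the result a valid image
    return x_adv, m


# Toy usage: perturb at most 3 pixels of a 4x4 RGB image.
x = np.full((4, 4, 3), 0.5)
p = np.full((4, 4, 3), 0.3)
m_tilde = np.arange(16, dtype=float).reshape(4, 4)
x_adv, m = sparse_perturbation(x, p, m_tilde, k=3)
```

In an actual attack, both `p` and `m_tilde` would be updated iteratively from the loss gradient; the sketch only shows how a sparse perturbation is assembled from the two components at a single step.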
Abstract
This work studies sparse adversarial perturbations, including both unstructured and structured ones. We propose a framework based on a white-box PGD-like attack method named Sparse-PGD to effectively and efficiently generate such perturbations. Furthermore, we combine Sparse-PGD with a black-box attack to comprehensively and more reliably evaluate the models' robustness against unstructured and structured sparse adversarial perturbations. Moreover, the efficiency of Sparse-PGD enables us to conduct adversarial training to build robust models against various sparse perturbations. Extensive experiments demonstrate that our proposed attack algorithm exhibits strong performance in different scenarios. More importantly, compared with other robust models, our adversarially trained model demonstrates state-of-the-art robustness against various sparse attacks. Code is available at https://github.com/CityU-MLO/sPGD.
