Rethinking PGD Attack: Is Sign Function Necessary?

Junjie Yang; Tianlong Chen; Xuxi Chen; Zhangyang Wang; Yingbin Liang

TL;DR

This work questions the necessity of the sign function in $L_\infty$ PGD attacks by analyzing how the update rule affects the adversarial gain of each step. It identifies clipping as a key reason raw gradients underperform and introduces a hidden, non-clipped perturbation, yielding Raw Gradient Descent (RGD): the raw gradient updates the non-clipped hidden state, and clipping is applied separately so that the adversarial input stays within the $L_\infty$ budget. The authors provide theoretical bounds on step gains and support them with extensive experiments in which RGD outperforms vanilla PGD and PGD(raw) across datasets, architectures, and training regimes, including adversarial training and transfer attacks, at no extra computational cost. These findings offer a practical alternative for generating adversarial examples and point to the potential for improving defenses through stronger initial perturbations.

Abstract

Neural networks have demonstrated success in various domains, yet their performance can be significantly degraded by even a small input perturbation. Consequently, the construction of such perturbations, known as adversarial attacks, has gained significant attention, and many attacks fall within "white-box" scenarios where the attacker has full access to the neural network. Existing attack algorithms, such as projected gradient descent (PGD), commonly apply the sign function to the raw gradient before updating adversarial inputs, thereby discarding gradient magnitude information. In this paper, we present a theoretical analysis of how such a sign-based update algorithm influences step-wise attack performance, as well as its caveat. We also explain why previous attempts at directly using raw gradients failed. Based on that, we further propose a new raw gradient descent (RGD) algorithm that eliminates the use of sign. Specifically, we convert the constrained optimization problem into an unconstrained one by introducing a new hidden variable, a non-clipped perturbation that can move beyond the constraint. The effectiveness of the proposed RGD algorithm is demonstrated extensively in experiments, where it outperforms PGD and other competitors in various settings without incurring any additional computational overhead. The code is available at https://github.com/JunjieYang97/RGD.
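The contrast between the sign-based update and a hidden, non-clipped perturbation can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy loss, the helper names (`make_toy_loss`, `pgd_sign`, `rgd_sketch`), the zero initial perturbation, and the choice to evaluate the gradient at the clipped point are all hypothetical. The linked repository remains the reference for the actual algorithm.

```python
import random

def make_toy_loss(dim=4, hidden=6, seed=0):
    """Hypothetical two-layer ReLU regression loss and its input gradient."""
    rng = random.Random(seed)
    W2 = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(hidden)]
    w1 = [rng.gauss(0, 1) for _ in range(hidden)]

    def forward(x):
        pre = [sum(w * xi for w, xi in zip(row, x)) for row in W2]
        out = sum(a * max(p, 0.0) for a, p in zip(w1, pre))
        return pre, out

    def loss(x, y=0.0):
        return 0.5 * (forward(x)[1] - y) ** 2

    def grad(x, y=0.0):
        pre, out = forward(x)
        resid = out - y
        # Chain rule through the ReLU: only active hidden units contribute.
        return [resid * sum(w1[j] * W2[j][i] for j in range(hidden) if pre[j] > 0)
                for i in range(dim)]

    return loss, grad

def clip(v, eps):
    """Project each coordinate onto [-eps, eps] (the L_inf ball)."""
    return [max(-eps, min(eps, vi)) for vi in v]

def sign(z):
    return (z > 0) - (z < 0)

def pgd_sign(x, grad, eps, alpha, steps):
    """Sign-based PGD: every step moves by alpha per coordinate, then clips."""
    delta = [0.0] * len(x)  # assumption: no random start
    for _ in range(steps):
        g = grad([xi + di for xi, di in zip(x, delta)])
        delta = clip([di + alpha * sign(gi) for di, gi in zip(delta, g)], eps)
    return delta

def rgd_sketch(x, grad, eps, alpha, steps):
    """Raw-gradient update on a hidden perturbation that is never clipped.

    Assumption: the gradient is taken at the clipped point, and only the
    clipped copy is ever added to the input.
    """
    hidden = [0.0] * len(x)
    for _ in range(steps):
        shown = clip(hidden, eps)
        g = grad([xi + si for xi, si in zip(x, shown)])
        hidden = [hi + alpha * gi for hi, gi in zip(hidden, g)]
    return clip(hidden, eps)

loss, grad = make_toy_loss()
x = [0.5, -0.25, 1.0, 0.1]
for name, attack in [("pgd_sign", pgd_sign), ("rgd_sketch", rgd_sketch)]:
    delta = attack(x, grad, eps=0.1, alpha=0.025, steps=10)
    print(name, loss([xi + di for xi, di in zip(x, delta)]))
```

Both routines return a perturbation inside the $L_\infty$ ball. The difference is that `rgd_sketch` keeps accumulating gradient magnitude in `hidden` even after a coordinate has hit the boundary, whereas `pgd_sign` discards magnitude at every step. Which one reaches a higher loss on this toy problem is not something the sketch is meant to establish.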

Paper Structure (21 sections, 3 theorems, 12 equations, 3 figures, 8 tables)

Introduction
Main Contributions
Related Works
Methodology
Understanding How Update Influences PGD Performance
Theoretical Insights
Empirical Study
Experimental Results
Comparison of Algorithms for Adversarial Attack
Comparison of Algorithms for Adversarial Training
Transfer Attack Study
Ablation and Visualization
Adversarial Perturbation Level Study
Adversarial Update Step Study
Conclusion
...and 6 more sections

Key Result

Theorem 1

Consider $g(x)=\frac{1}{2}[w_1^Th(w_2*x)-y^\ast(x)]^2$ with the activation function $h(x)$ taken to be ReLU, and let $|\cdot|$ denote the element-wise absolute value. Then the adversarial step gain $g(x+\delta_c^{t+1})-g(x+\delta_c^t)$, where $\delta_c^t$ denotes the clipped perturbation at the $t$-th step, admits an explicit bound.
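The quantity bounded in Theorem 1 can be computed directly on a small instance. The following is a hedged sketch: the weights `W2` and `W1`, the target `Y`, the input, and the step sizes are hypothetical values chosen for illustration, and $w_2 * x$ is treated as a matrix-vector product, which is an assumption about the notation. It evaluates the step gain for one sign-based step and one raw-gradient step from a zero perturbation.

```python
W2 = [[1.0, -2.0], [0.5, 1.0], [-1.0, 0.5]]  # hypothetical first-layer weights
W1 = [1.0, -1.0, 2.0]                         # hypothetical output weights
Y = 0.0                                       # hypothetical target y*(x), held fixed

def pre_act(x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W2]

def g(x):
    """g(x) = 1/2 * (w1^T ReLU(W2 x) - y)^2."""
    out = sum(a * max(p, 0.0) for a, p in zip(W1, pre_act(x)))
    return 0.5 * (out - Y) ** 2

def grad_g(x):
    """Input gradient of g; inactive ReLU units contribute nothing."""
    pre = pre_act(x)
    resid = sum(a * max(p, 0.0) for a, p in zip(W1, pre)) - Y
    return [resid * sum(W1[j] * W2[j][i] for j in range(len(W2)) if pre[j] > 0)
            for i in range(len(x))]

def clip(v, eps):
    return [max(-eps, min(eps, vi)) for vi in v]

def step_gain(x, d_prev, d_next, eps):
    """g(x + clip(d_next)) - g(x + clip(d_prev)): the adversarial step gain."""
    a, b = clip(d_prev, eps), clip(d_next, eps)
    return (g([xi + bi for xi, bi in zip(x, b)])
            - g([xi + ai for xi, ai in zip(x, a)]))

x = [1.0, 0.25]
eps, alpha = 0.1, 0.05
d0 = [0.0, 0.0]
grad0 = grad_g(x)

# One sign-based step versus one raw-gradient step from the same start.
d_sign = [di + alpha * ((gi > 0) - (gi < 0)) for di, gi in zip(d0, grad0)]
d_raw = [di + alpha * gi for di, gi in zip(d0, grad0)]

gain_sign = step_gain(x, d0, d_sign, eps)
gain_raw = step_gain(x, d0, d_raw, eps)
print("sign step gain:", gain_sign)
print("raw step gain: ", gain_raw)
```

A single toy step says nothing general about which update is stronger. The sketch only makes concrete what "adversarial step gain" measures and why it is defined on the clipped perturbation.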

Figures (3)

Figure 1: Comparison of update algorithms for perturbation pixel distribution in different steps.
Figure 2: Comparison of robust accuracy of PGD, PGD with raw update, and the proposed RGD with different $\epsilon$ sizes.
Figure 3: Comparison of robust accuracy of PGD, PGD with raw update, and RGD with different numbers of update steps when attacking a robust ResNet18 model on CIFAR10.

Theorems & Definitions (3)

Theorem 1
Lemma 1
Lemma 2
