Fine-Grained Iterative Adversarial Attacks with Limited Computation Budget

Zhichao Hou, Weizhi Gao, Xiaorui Liu

TL;DR

This paper asks how to maximize the strength of iterative adversarial attacks under a fixed compute budget. It introduces Spiking Iterative Attack, a fine-grained scheme that decides, per layer and per iteration, whether to recompute or reuse activations. The method combines a spiking forward mechanism controlled by a threshold $\rho$ with a virtual surrogate gradient that preserves meaningful backward signals when activations are reused, and it models the attack as a combinatorial optimization over a mask $\Delta \in \{0,1\}^{T\times L}$ under budget $C_{\rm total}$. The authors prove that coarse early stopping is a special case of the fine-grained formulation. Experiments on vision (CIFAR-10/100, Tiny-ImageNet) and graph benchmarks (Cora, Citeseer) show that Spiking-PGD outperforms baselines at equal cost and enables adversarial training with a substantially reduced budget (up to about 70% savings) without sacrificing accuracy. This expands the efficiency-effectiveness frontier for robustness research and supports scalable evaluation and training of large models under limited resources. The key contributions are the identification of redundancy in iterative attacks, the per-layer and per-iteration masking formulation, and the surrogate gradient mechanism that maintains gradient flow despite activation reuse.

Abstract

This work tackles a critical challenge in AI safety research under limited compute: given a fixed computation budget, how can one maximize the strength of iterative adversarial attacks? Coarsely reducing the number of attack iterations lowers cost but substantially weakens effectiveness. To maximize the attack efficacy attainable within a constrained budget, we propose a fine-grained control mechanism that selectively recomputes layer activations at both the iteration and layer levels. Extensive experiments show that our method consistently outperforms existing baselines at equal cost. Moreover, when integrated into adversarial training, it attains comparable performance with only 30% of the original budget.

Paper Structure

This paper contains 20 sections, 1 theorem, 7 equations, 14 figures, 3 tables, 2 algorithms.

Key Result

Proposition 4.1

Let $V_{\mathrm{coarse}}$ and $V_{\mathrm{fine}}$ denote the optimal objective values of Eq. (eq:pgd_problem) and Eq. (eq:spike_pgd_problem), respectively. Then $V_{\mathrm{coarse}} \le V_{\mathrm{fine}}$. The feasible set of Eq. (eq:pgd_problem) can be embedded into the feasible set of Eq.
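The embedding argument can be illustrated with a small sketch. It assumes a uniform per-layer cost and represents an execution schedule as a binary mask of shape $T \times L$, as in the TL;DR; the helper names (`coarse_mask`, `mask_cost`, `is_coarse`) are hypothetical and do not come from the paper. An early-stopping schedule is one particular mask, so any budget it satisfies is also satisfied inside the fine-grained search space, which additionally contains masks that no early-stopping schedule can express.

```python
import numpy as np

def coarse_mask(T, L, S):
    """Early-stopping schedule: all L layers computed for the first S iterations, none after."""
    mask = np.zeros((T, L), dtype=int)
    mask[:S, :] = 1
    return mask

def mask_cost(mask, layer_cost=None):
    """Total cost of a schedule; uniform per-layer cost is an assumption of this sketch."""
    if layer_cost is None:
        layer_cost = np.ones(mask.shape[1])
    return float((mask * layer_cost).sum())

def is_coarse(mask):
    """True if the mask is a block of all-ones rows followed only by all-zeros rows."""
    S = int((mask.sum(axis=1) == mask.shape[1]).sum())
    return bool(np.all(mask[:S] == 1) and np.all(mask[S:] == 0))

T, L, S = 10, 4, 3
budget = S * L                      # cost of running 3 full iterations

coarse = coarse_mask(T, L, S)       # a point in the coarse feasible set

# A fine-grained schedule with the same cost: one full pass, then
# only layer 0 is refreshed for the next 8 iterations.
fine = np.zeros((T, L), dtype=int)
fine[0, :] = 1
fine[1:9, 0] = 1

for name, m in [("coarse", coarse), ("fine", fine)]:
    print(name, "cost:", mask_cost(m), "within budget:", mask_cost(m) <= budget,
          "expressible as early stopping:", is_coarse(m))
```

Because every coarse schedule is also a valid fine-grained mask, maximizing over the larger set cannot yield a smaller optimum, which is the content of the inequality above.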

Figures (14)

  • Figure 1: Activation relative change $\|{\bm{a}}_t-{\bm{a}}_{t-1}\|/\|{\bm{a}}_t\|$ for ResNet-18 on CIFAR-10. Left: per-layer curves (light color) and layer average (dark color) for normally trained (red) and adversarially trained (blue) models. Right: heatmap of relative change across layers and attack iterations.
  • Figure 2: Three attack execution patterns: (a) Vanilla iterative attack: every layer $l \in [L]$ is fully computed at every attack iteration $t \in [T]$; (b) Coarse-grained attack: all layers are computed for $t\le S$ and skipped for $t>S$; and (c) Fine-grained attack: selectively controls the computation scheme to maximize attack strength under a given computational budget.
  • Figure 3: Comparison of vanilla forward computation and spiking forward computation for adversarial attacks. Left: In vanilla forward computation, the current output is composed of the previous output and the activation residual, leveraging the linearity of ${\mathcal{A}}^{(l)}$. Right: Spiking forward computation applies a spiking function $S_\rho$ to the residual to selectively update the activations.
  • Figure 4: Gradient computation mechanisms. (a) Exact gradient $\frac{\partial \mathcal{L}}{\partial {\bm{a}}_t} = {\mathcal{A}}^\top \left(\frac{\partial \mathcal{L}}{\partial \hat{{\bm{o}}}_t}\right)$: when the layer output $\hat{{\bm{o}}}_t$ is freshly computed, the gradient w.r.t. ${\bm{a}}_t$ follows the standard chain rule. (b) Vanishing gradient $\frac{\partial \mathcal{L}}{\partial {\bm{a}}_t} =\mathbf{0} \cdot\frac{\partial \mathcal{L}}{\partial \hat{{\bm{o}}}_t} = \mathbf{0}$: when $\hat{{\bm{o}}}_t$ is reused from $\hat{{\bm{o}}}_{t-1}$, the spiking gate blocks the path to ${\bm{a}}_t$, yielding zero gradient. (c) Virtual gradient $\frac{\partial \mathcal{L}}{\partial {\bm{a}}_t} = {\mathcal{A}}^\top \left(\frac{\partial \mathcal{L}}{\partial \hat{{\bm{o}}}_{t-1}}\right)$: our virtual surrogate gradient restores the backward path by manually applying ${\mathcal{A}}^\top$ to the upstream gradient.
  • Figure 5: Comparison of model accuracy under attack versus computation cost on CIFAR-10, CIFAR-100, and Tiny-ImageNet. Spiking-PGD consistently achieves better attack strength than baseline iterative attacks (I-FGSM, MI-FGSM, PGD), with the performance gap most pronounced in the low-computation regime.
  • ...and 9 more figures
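
The mechanisms described in the captions of Figures 1, 3, and 4 can be sketched for a single linear layer. This is a minimal illustration under stated assumptions, not the authors' implementation: the class `SpikingLinear`, the helper `relative_change`, and the rule "recompute when the relative change since the last recomputation is at least $\rho$" are hypothetical choices made here to stand in for the spiking function $S_\rho$, whose exact definition is not given in this summary.

```python
import numpy as np

def relative_change(a_t, a_ref, eps=1e-12):
    """Relative change ||a_t - a_ref|| / ||a_t|| (the quantity plotted in Figure 1)."""
    return np.linalg.norm(a_t - a_ref) / (np.linalg.norm(a_t) + eps)

class SpikingLinear:
    """Linear layer o = W a that reuses its cached output when the input barely moves."""

    def __init__(self, W, rho):
        self.W, self.rho = W, rho
        self.a_ref = None      # input at the last recomputation
        self.o_cached = None   # output at the last recomputation
        self.recomputed = 0    # number of times the layer was actually evaluated

    def forward(self, a_t):
        if self.o_cached is None:
            self.o_cached = self.W @ a_t
            fired = True
        elif relative_change(a_t, self.a_ref) >= self.rho:
            # Linearity: new output = previous output + W applied to the residual.
            self.o_cached = self.o_cached + self.W @ (a_t - self.a_ref)
            fired = True
        else:
            fired = False      # residual is gated out; cached output is reused
        if fired:
            self.a_ref = a_t.copy()
            self.recomputed += 1
        return self.o_cached, fired

    def backward(self, grad_o, fired, virtual=True):
        # Exact gradient when freshly computed; virtual gradient applies W^T anyway
        # when the output was reused; otherwise the gate yields a zero gradient.
        if fired or virtual:
            return self.W.T @ grad_o
        return np.zeros(self.W.shape[1])

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))
layer = SpikingLinear(W, rho=0.1)

a0 = rng.normal(size=5)
o0, fired0 = layer.forward(a0)            # first call always computes
o1, fired1 = layer.forward(a0 * 1.0001)   # tiny change: output reused
o2, fired2 = layer.forward(a0 * 2.0)      # large change: residual update

g = np.ones(3)                            # stand-in upstream gradient
grad_vanishing = layer.backward(g, fired1, virtual=False)
grad_virtual = layer.backward(g, fired1, virtual=True)
print(fired0, fired1, fired2, layer.recomputed)
```

In this sketch the reused step returns a zero input gradient unless the virtual path is enabled, which mirrors the contrast between panels (b) and (c) of Figure 4.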

Theorems & Definitions (3)

  • Proposition 4.1
  • Remark 4.2
  • proof