CosPGD: an efficient white-box adversarial attack for pixel-wise prediction tasks

Shashank Agnihotri; Steffen Jung; Margret Keuper

TL;DR

CosPGD tackles robustness evaluation for pixel-wise prediction tasks by introducing a per-pixel alignment-based scaling using a differentiable similarity between predictions and targets. This leads to a smooth, balanced attack across the image and improves over traditional PGD and SegPGD in semantic segmentation, optical flow, and image restoration. The method is demonstrated to be versatile, stable, and more effective across a range of tasks and datasets, with open-source code provided. Overall, CosPGD offers a unified, efficient tool for probing adversarial robustness in pixel-level vision problems and highlights the importance of per-pixel alignment in attack design.

Abstract

While neural networks allow highly accurate predictions in many tasks, their lack of robustness towards even slight input perturbations often hampers their deployment. Adversarial attacks such as the seminal projected gradient descent (PGD) offer an effective means to evaluate a model's robustness, and dedicated solutions have been proposed for attacks on semantic segmentation or optical flow estimation. While these dedicated attacks attempt to increase efficiency, a further objective is to balance the attack's effect so that it acts on the entire image domain instead of isolated point-wise predictions. This often comes at the cost of optimization stability and thus efficiency. Here, we propose CosPGD, an attack that encourages more balanced errors over the entire image domain while increasing the attack's overall efficiency. To this end, CosPGD leverages a simple alignment score computed from any pixel-wise prediction and its target to scale the loss in a smooth and fully differentiable way. This leads to efficient evaluations of a model's robustness for semantic segmentation as well as regression models (such as optical flow, disparity estimation, or image restoration), and CosPGD outperforms the previous state-of-the-art (SotA) attack on semantic segmentation. We provide code for the CosPGD algorithm and example usage at https://github.com/shashankskagnihotri/cospgd.
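As a rough illustration of the loss scaling described in the abstract, the sketch below weights a per-pixel cross-entropy loss by an alignment score between prediction and target. It assumes that the score is the cosine similarity (as the method's name suggests), that $\psi$ is a softmax, and that the target is one-hot encoded. The weight `1 - cos` for targeted attacks is likewise an assumption. The function names `alignment_scaled_loss` and `pgd_linf_step` are hypothetical, and this is not the authors' reference implementation, which is available in the linked repository.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax along the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def alignment_scaled_loss(logits, labels, targeted=False, eps=1e-12):
    """Cross-entropy scaled per pixel by a cosine-similarity alignment score.

    logits: float array of shape (H, W, M) with M class scores per pixel.
    labels: int array of shape (H, W) with class indices.
    Returns the mean scaled loss and the per-pixel alignment map.
    """
    num_classes = logits.shape[-1]
    probs = softmax(logits)                    # assumed choice of psi
    one_hot = np.eye(num_classes)[labels]      # assumed target encoding, (H, W, M)

    # Per-pixel cosine similarity between prediction and target.
    dot = (probs * one_hot).sum(axis=-1)
    norms = np.linalg.norm(probs, axis=-1) * np.linalg.norm(one_hot, axis=-1)
    cos = dot / (norms + eps)

    # Per-pixel cross-entropy loss.
    ce = -np.log(dot + eps)

    # Assumed weighting: cos for untargeted attacks, 1 - cos for targeted ones.
    weight = 1.0 - cos if targeted else cos
    return (weight * ce).mean(), cos


def pgd_linf_step(x_adv, x_clean, grad, alpha, epsilon):
    """One untargeted l_inf PGD step: signed ascent, then projection."""
    x_new = x_adv + alpha * np.sign(grad)
    x_new = np.clip(x_new, x_clean - epsilon, x_clean + epsilon)  # epsilon-ball
    return np.clip(x_new, 0.0, 1.0)                               # valid image range


# Uniform logits on a 2x2 image with 3 classes.
logits = np.zeros((2, 2, 3))
labels = np.array([[0, 1], [2, 0]])
loss, cos_map = alignment_scaled_loss(logits, labels)

# One step from a clean image; the gradient here is a placeholder.
x = np.full((2, 2, 3), 0.5)
x_next = pgd_linf_step(x, x, np.ones_like(x), alpha=0.1, epsilon=0.05)
```

In a full attack, the gradient passed to `pgd_linf_step` would be the derivative of the scaled loss with respect to the input image, obtained from an automatic differentiation framework; the placeholder gradient above only exercises the projection step.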

Paper Structure (56 sections, 1 theorem, 15 equations, 21 figures, 15 tables)

Introduction
Related work
Preliminaries
Prediction Alignment Scaling - CosPGD
Untargeted versus Targeted Attacks.
Choice of $\psi$ and Algorithm Description.
Loss Scaling in Previous Approaches.
Experiments
Stability during Attack Optimization
Spatial Balancing of the Attack
Semantic Segmentation.
Optical Flow.
Benchmarking on Further Tasks and Settings
Semantic Segmentation.
Optical Flow.
...and 41 more sections

Key Result

Proposition 4.1

For any two pixel-wise network predictions $f_\theta(\boldsymbol{X})_i$ and $f_\theta(\boldsymbol{\bar{X}})_i \in\mathbb{R}^{M}$, a target $\boldsymbol{Y}_i\in\mathbb{R}^{M}$, and a continuously differentiable function $\psi:\mathbb{R}^{M}\rightarrow \mathbb{R}^{M}$ with $\|\psi(f_{\theta}(\ldots$ [...] for a real constant $d\geq 0$.

Figures (21)

Figure 1: Optical flow predictions using RAFT on the Sintel validation set. (a) and (b) show two consecutive frames for which the initial optical flow in (d) was predicted. The results of attacking the model with target $\overrightarrow{0}$ (c) are depicted in (e) for PGD and (f) for CosPGD. For the same perturbation magnitude and number of iterations, the proposed CosPGD alters the estimated optical flow more strongly and brings it closer to target (c).
Figure 2: Change in pixel-wise image gradients over attack iterations on DeepLabV3 performing semantic segmentation on PASCAL VOC 2012 validation subset. We observe that the absolute difference between gradient values (top) is larger for PGD and increasing for SegPGD, while being stable for CosPGD. Further, CosPGD has fewer changes in gradient direction over attack iterations (bottom) compared to PGD and SegPGD. This shows CosPGD is more stable during optimization compared to PGD and SegPGD.
Figure 3: CosPGD versus PGD and SegPGD ($\ell_{\infty}$-norm constrained) for semantic segmentation on the PASCAL VOC 2012 validation set with DeepLabV3 and PSPNet. CosPGD outperforms competing attacks by a large margin, even in early iterations. See also the corresponding results table in the semantic segmentation appendix.
Figure 4: Example predictions of DeepLabV3 on the PASCAL VOC 2012 validation set after $\ell_\infty$ PGD, SegPGD, and CosPGD attacks with 40 iterations. The ground-truth segmentations are shown on the left. Both PGD and SegPGD successfully change most of the predicted labels to one of the ground-truth labels (here in green), yet the region that actually carries this label is still predicted correctly. Only CosPGD also changes the prediction in this region, to a third class.
Figure 5: Distributions of the end-point error (EPE) w.r.t. the target flow $\overrightarrow{0}$ after 40-iteration, $\ell_{\infty}$-norm constrained targeted CosPGD and PGD attacks on RAFT for optical flow estimation on the KITTI-2015 validation set. A lower EPE w.r.t. the target flow is desirable. CosPGD reduces the gap to the target for more pixels than the PGD attack. Moreover, the highest EPE w.r.t. the target after a CosPGD attack is significantly lower than after a PGD attack.
...and 16 more figures
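
The end-point error (EPE) used in the optical flow captions above is the per-pixel Euclidean distance between two flow fields. The minimal sketch below computes it against a zero target flow; the helper name `endpoint_error` and the toy flow values are illustrative assumptions, not taken from the paper's evaluation code.

```python
import numpy as np


def endpoint_error(flow_pred, flow_target):
    """Per-pixel EPE between two flow fields of shape (H, W, 2)."""
    return np.linalg.norm(flow_pred - flow_target, axis=-1)


# Toy 1x2 flow field: one pixel moves by (3, 4), the other is static.
flow = np.array([[[3.0, 4.0], [0.0, 0.0]]])
zero_target = np.zeros_like(flow)

epe_map = endpoint_error(flow, zero_target)   # per-pixel errors, shape (1, 2)
mean_epe = epe_map.mean()                     # average EPE over the image
```

For a targeted attack towards the zero flow, a lower value of `mean_epe` means the attacked prediction is closer to the target, and the spread of `epe_map` indicates how evenly the attack acts across pixels.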

Theorems & Definitions (2)

Proposition 4.1
proof
