Pareto Adversarial Robustness: Balancing Spatial Robustness and Sensitivity-based Robustness

Ke Sun, Mingjie Li, Zhouchen Lin

TL;DR

To reconcile the mutual impacts of the various robustness components within one unified framework, the Pareto criterion is incorporated into the adversarial robustness analysis, yielding a novel strategy called Pareto Adversarial Training for achieving universal robustness.

Abstract

Adversarial robustness, which primarily comprises sensitivity-based robustness and spatial robustness, plays an integral part in achieving robust generalization. In this paper, we endeavor to design strategies to achieve universal adversarial robustness. To this end, we first investigate the relatively less-explored realm of spatial robustness. Then, we integrate the existing spatial robustness methods by incorporating both local and global spatial vulnerability into a unified spatial attack and adversarial training approach. Furthermore, we present a comprehensive relationship between natural accuracy, sensitivity-based robustness, and spatial robustness, supported by strong evidence from the perspective of robust representation. Crucially, to reconcile the mutual impacts of the various robustness components within one unified framework, we incorporate the Pareto criterion into the adversarial robustness analysis, yielding a novel strategy called Pareto Adversarial Training for achieving universal robustness. The resulting Pareto front, which delineates the set of optimal solutions, provides an optimal balance between natural accuracy and the various forms of adversarial robustness. This sheds light on solutions for achieving universal robustness in the future. To the best of our knowledge, we are the first to consider universal adversarial robustness via multi-objective optimization.
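As a rough illustration of the Pareto criterion mentioned in the abstract, the sketch below filters a set of candidate models down to the non-dominated ones. It is not the paper's Pareto Adversarial Training procedure; it only shows what "Pareto front" means for several accuracy objectives. The function names `dominates` and `pareto_front` are hypothetical, and the candidate tuples (natural accuracy, sensitivity-based robust accuracy, spatial robust accuracy) are made-up toy values, not results from the paper.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective
    and strictly better on at least one (higher is better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Toy values only (assumption): (natural acc, sensitivity-based acc, spatial acc)
candidates = [
    (0.90, 0.40, 0.30),
    (0.85, 0.50, 0.45),
    (0.80, 0.45, 0.40),  # dominated by the second candidate
    (0.88, 0.40, 0.30),  # dominated by the first candidate
]

front = pareto_front(candidates)
print(front)
```

Neither of the two surviving candidates is better than the other on every objective, which is the sense in which a Pareto front describes a set of trade-offs rather than a single best model.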

Paper Structure

This paper contains 21 sections, 2 theorems, 21 equations, 8 figures, 2 tables, 1 algorithm.

Key Result

Proposition 1

Consider $\mathcal{L}^{S}_{\theta}(x, y)=\log \sum_{i \neq y} \exp \left(f_{\theta}^{i}(x)\right)-f_{\theta}^{y}(x)$ as the smooth version of the loss in Eq. \ref{eq_flow} without a local smoothness term. For a fixed $(x_{w_F},y)$ and $\theta$, we have where $r(x_{w_F}, y)= \sum_{i \neq y} \exp \left(f_{\theta}^{i}(x_{w_F})\right) / \sum_{i} \exp \left(f_{\theta}^{i}(x_{w_F})\right)$.
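The displayed relation of the proposition is not present on this page, so the following is only a hedged numerical sketch of how the ratio $r$ can arise. It assumes, without confirmation from the text here, that the comparison is with the standard softmax cross-entropy loss, and it takes gradients with respect to the logits $f_{\theta}(x)$ rather than the input. Under those assumptions the cross-entropy gradient equals $r$ times the gradient of $\mathcal{L}^{S}$, and $r$ is one minus the softmax probability of the true class. All function names are hypothetical.

```python
import numpy as np


def softmax(logits):
    # Numerically stable softmax; exp(-inf) evaluates to 0.
    z = np.exp(logits - logits.max())
    return z / z.sum()


def ratio_r(logits, y):
    # r = sum_{i != y} exp(f^i) / sum_i exp(f^i) = 1 - softmax_y
    return 1.0 - softmax(logits)[y]


def grad_cross_entropy(logits, y):
    # d/df [log sum_i exp(f^i) - f^y] = softmax(f) - e_y
    g = softmax(logits)
    g[y] -= 1.0
    return g


def grad_smooth_loss(logits, y):
    # d/df [log sum_{i != y} exp(f^i) - f^y]:
    # softmax over the non-true classes, minus e_y.
    masked = logits.copy()
    masked[y] = -np.inf  # exclude the true class from the log-sum-exp
    g = softmax(masked)
    g[y] -= 1.0
    return g


# Arbitrary example logits (assumption), true class y = 0.
logits = np.array([2.0, -1.0, 0.5, 0.3])
y = 0

lhs = grad_cross_entropy(logits, y)
rhs = ratio_r(logits, y) * grad_smooth_loss(logits, y)
print(np.allclose(lhs, rhs))
```

Because both losses depend on the input only through the logits, the same factor carries over to input gradients by the chain rule, again under the assumption stated above.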

Figures (8)

  • Figure 1: Visualization of Flow-based, RT, and our Integrated Spatial adversarial examples on MNIST, CIFAR-10, and Caltech-256. More images and detailed discussions are provided in the appendix.
  • Figure 2: Loss landscape of the Integrated Spatial Attack on CIFAR-10. (Left) A distant view of the loss landscape w.r.t. $w$ before the optimization in Eq. \ref{eq_integrated_optimization}. (Middle) A close view before the optimization shows a highly convex surface near the initialization point. (Right) The loss landscape around the maximum $w^*$ after the optimization in Eq. \ref{eq_integrated_optimization}.
  • Figure 3: Relationships between sensitivity-based robustness and two types of spatial robustness on three datasets. The X-axis shows models adversarially trained with different numbers of PGD iterations, which measures the strength of sensitivity-based robustness, while the Y-axis shows the test accuracy under the Flow Attack (red) and the RT Attack (blue) with different iterations, which measures spatial robustness.
  • Figure 4: Saliency maps of models from four types of training on randomly selected images from Caltech-256.
  • Figure 5: Median skewness of the saliency-map differences of robust models relative to other models, computed across all test data. The first three panels compare against the naturally trained model, while the last one compares against the PGD-trained model.
  • ...and 3 more figures

Theorems & Definitions (5)

  • Proposition 1
  • Proposition 2
  • proof
  • proof
  • proof