RAMP: Boosting Adversarial Robustness Against Multiple $l_p$ Perturbations for Universal Robustness

Enyi Jiang; Gagandeep Singh

TL;DR

The paper designs a logit pairing loss that improves union accuracy, motivated by an analysis of the tradeoffs through the lens of distribution shifts, and proposes a novel training framework, \textbf{RAMP}, to boost robustness against multiple $l_p$ perturbations.

Abstract

Most existing works focus on improving robustness against adversarial attacks bounded by a single $l_p$ norm using adversarial training (AT). However, the multiple-norm robustness (union accuracy) of these AT models remains low, which matters because a real-world adversary is not necessarily bounded by a single norm. The tradeoffs among robustness against multiple $l_p$ perturbations, and between accuracy and robustness, make obtaining good union and clean accuracy challenging. We design a logit pairing loss to improve the union accuracy by analyzing the tradeoffs through the lens of distribution shifts. We connect natural training (NT) with AT via gradient projection to incorporate useful information from NT into AT, and we show empirically and theoretically that this moderates the accuracy/robustness tradeoff. We propose a novel training framework, \textbf{RAMP}, to boost the robustness against multiple $l_p$ perturbations. \textbf{RAMP} can be easily adapted for robust fine-tuning and full AT. For robust fine-tuning, \textbf{RAMP} obtains a union accuracy of up to $53.3\%$ on CIFAR-10 and $29.1\%$ on ImageNet. For training from scratch, \textbf{RAMP} achieves a union accuracy of $44.6\%$ and a good clean accuracy of $81.2\%$ on ResNet-18 against AutoAttack on CIFAR-10. Beyond multi-norm robustness, \textbf{RAMP}-trained models achieve superior \textit{universal robustness}, effectively generalizing against a range of unseen adversaries and natural corruptions.
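Union accuracy, as the term is commonly used in the multi-norm robustness literature, counts a test example as robust only if the prediction stays correct under every considered attack. The following is a minimal sketch of that metric, assuming per-attack correctness masks have already been computed; the function name `union_accuracy` and the toy masks are hypothetical and not taken from the paper.

```python
import numpy as np

def union_accuracy(correct_per_attack):
    """Fraction of samples that stay correct under ALL attacks.

    correct_per_attack: dict mapping an attack name (e.g. "l1", "l2", "linf")
    to a boolean array of shape (n_samples,), True where the model's
    prediction is still correct under that attack.
    """
    # Stack masks into shape (n_attacks, n_samples).
    stacked = np.stack(list(correct_per_attack.values()), axis=0)
    # A sample is robust in the union sense only if every attack fails on it.
    robust_to_all = stacked.all(axis=0)
    return float(robust_to_all.mean())

# Toy masks for four samples (illustrative values only).
correct = {
    "l1":   np.array([True, True,  False, True]),
    "l2":   np.array([True, False, False, True]),
    "linf": np.array([True, True,  True,  False]),
}
print(union_accuracy(correct))  # 0.25: only the first sample survives all three
```

In this toy case the per-norm accuracies are 0.75, 0.5, and 0.75, while the union accuracy is 0.25, which illustrates why union accuracy can be much lower than any single-norm robust accuracy.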

Paper Structure (28 sections, 6 theorems, 43 equations, 9 figures, 25 tables, 2 algorithms)

Introduction
Related Work
AT against Multiple Perturbations
RAMP
Logit Pairing for Multiple Perturbations
Connecting Natural Training with AT
Theoretical Analysis of GP for Adversarial Robustness
Experiment
Main Results
Ablation Study and Discussion
Conclusion
Proof of Theorems
Proof of Theorem \ref{thm:convergence}
Proof of Theorem \ref{thm:error-GP}
Additional Experiment Information
...and 13 more sections

Key Result

Theorem 4.5

When the model dimension $m \to \infty$, for an epoch $t$, the error difference $\Delta^2_{AT} - \Delta^2_{GP}$ admits an approximation expressed through $\bar{\tau}^2 = \mathbb{E}_\pi[\tau^2] \in [0, 1]$, where $\tau(\theta)$ is the sine of the angle between $\widehat{g}_{n}$ and $g_{a} - \widehat{g}_{n}$.
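To make the quantity $\tau$ concrete, here is a small sketch that computes the sine of the angle between $\widehat{g}_{n}$ and $g_{a} - \widehat{g}_{n}$ for two flattened gradient vectors. It assumes the gradients are available as 1-D NumPy arrays; the helper names `sin_angle` and `tau`, the `eps` stabilizer, and the toy vectors are hypothetical choices for illustration, not the paper's implementation.

```python
import numpy as np

def sin_angle(u, v, eps=1e-12):
    """Sine of the angle between vectors u and v (always in [0, 1])."""
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + eps)
    cos = np.clip(cos, -1.0, 1.0)  # guard against rounding outside [-1, 1]
    return float(np.sqrt(1.0 - cos ** 2))

def tau(g_n_hat, g_a):
    """tau = sin of the angle between g_n_hat and (g_a - g_n_hat)."""
    return sin_angle(g_n_hat, g_a - g_n_hat)

# Toy example: g_a - g_n_hat = [0, 1] is orthogonal to g_n_hat = [1, 0].
g_n_hat = np.array([1.0, 0.0])
g_a = np.array([1.0, 1.0])
print(tau(g_n_hat, g_a))  # 1.0
```

Because $\tau$ is a sine, $\tau^2$ always lies in $[0, 1]$, which matches the stated range of $\bar{\tau}^2$. When $g_{a} - \widehat{g}_{n}$ is parallel to $\widehat{g}_{n}$, $\tau$ is close to $0$; when the two are orthogonal, $\tau$ equals $1$.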

Figures (9)

Figure 1: Multiple-norm tradeoff with robust fine-tuning: fine-tuning an $l_\infty$-AT model on $l_1$ examples drastically reduces $l_\infty$ robustness. RAMP preserves more $l_\infty$ and union robustness.
Figure 2: $l_\infty$ AT-GP with PGD \cite{madry2017towards} with $\epsilon=0.031$ on CIFAR-10 improves accuracy and robustness. Pre-training on $\widehat{{\mathcal{D}}}_n$ for $50$ epochs further boosts the performance.
Figure 3: $l_\infty$ AT-GP with APGD \cite{croce2020aa} improves robustness against $l_\infty$ AutoAttack \cite{croce2020aa} with $\epsilon = \frac{8}{255}$. RN-18 $l_\infty$-GP uses AT-GP; RN-18 $l_\infty$-GP-pre pre-trains for $40$ epochs on $\widehat{{\mathcal{D}}}_n$ before AT-GP is applied.
Figure 4: Ablation studies on the $\lambda$ and $\beta$ hyper-parameters.
Figure 5: Values of the terms $E_{\widehat{D}_{a^t}} \|g_{a} - \widehat{g}_{a} \|^2_\pi$ (variance), $\|g_{a} - \widehat{g}_{n}\|^2_\pi$ (bias), $\bar{\tau}$, and $\Delta^2_{AT} - \Delta^2_{GP}$ (error difference).
...and 4 more figures

Theorems & Definitions (15)

Definition 4.2: Aggregation for NT and AT
Definition 4.3: $L^\pi$-Norm \cite{anonymous2024principled}
Definition 4.4: Delta Error of an aggregation rule Aggr$(\cdot)$
Theorem 4.5: Error Analysis of GP
Theorem A.1
proof
Theorem A.2: Convergence of Aggr$(\cdot)$
proof
Definition A.3: GP Aggregation
Definition A.4: AT Aggregation
...and 5 more
