RAMP: Boosting Adversarial Robustness Against Multiple $l_p$ Perturbations for Universal Robustness

Enyi Jiang, Gagandeep Singh

TL;DR

A logit pairing loss is designed to improve union accuracy by analyzing the tradeoffs through the lens of distribution shifts, and a novel training framework, \textbf{RAMP}, is proposed to boost robustness against multiple $l_p$ perturbations.

Abstract

Most existing works focus on improving robustness against adversarial attacks bounded by a single $l_p$ norm using adversarial training (AT). However, the multiple-norm robustness (union accuracy) of these AT models remains low, which matters because a real-world adversary is not necessarily bounded by a single norm. The tradeoffs among robustness against multiple $l_p$ perturbations and between accuracy and robustness make obtaining good union and clean accuracy challenging. We design a logit pairing loss to improve the union accuracy by analyzing the tradeoffs through the lens of distribution shifts. We connect natural training (NT) with AT via gradient projection to incorporate useful information from NT into AT, which we show empirically and theoretically moderates the accuracy/robustness tradeoff. We propose a novel training framework, \textbf{RAMP}, to boost the robustness against multiple $l_p$ perturbations. \textbf{RAMP} can be easily adapted for robust fine-tuning and full AT. For robust fine-tuning, \textbf{RAMP} obtains a union accuracy of up to $53.3\%$ on CIFAR-10 and $29.1\%$ on ImageNet. For training from scratch, \textbf{RAMP} achieves a union accuracy of $44.6\%$ and a good clean accuracy of $81.2\%$ on ResNet-18 against AutoAttack on CIFAR-10. Beyond multi-norm robustness, \textbf{RAMP}-trained models achieve superior \textit{universal robustness}, effectively generalizing against a range of unseen adversaries and natural corruptions.
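To make the union accuracy metric concrete, here is a minimal sketch in Python. It assumes the usual reading of the term: a test sample counts as robust only if the model classifies it correctly under every perturbation type considered. The function name `union_accuracy`, the attack labels, and the per-sample outcomes are hypothetical and for illustration only; they are not the paper's evaluation code or results.

```python
from typing import Dict, List


def union_accuracy(correct_by_attack: Dict[str, List[bool]]) -> float:
    """Fraction of samples classified correctly under every listed attack.

    correct_by_attack maps an attack label to per-sample flags, where
    flags[i] is True if sample i is still classified correctly under
    that attack. All lists must cover the same samples in the same order.
    """
    per_attack = list(correct_by_attack.values())
    if not per_attack:
        raise ValueError("need at least one attack")
    n = len(per_attack[0])
    if n == 0 or any(len(flags) != n for flags in per_attack):
        raise ValueError("all attacks must cover the same non-empty sample set")
    # A sample is robust in the union sense only if every attack fails on it.
    robust = sum(all(flags) for flags in zip(*per_attack))
    return robust / n


# Hypothetical outcomes for five test inputs (illustrative, not real data).
outcomes = {
    "l_inf": [True, True, False, True, True],
    "l_2": [True, True, True, False, True],
    "l_1": [True, False, True, True, True],
}
single_norm = {name: sum(flags) / len(flags) for name, flags in outcomes.items()}
print(single_norm)               # each single-norm accuracy is 0.8
print(union_accuracy(outcomes))  # 0.4
```

In this toy case every single-norm accuracy is 0.8 while the union accuracy is only 0.4, because each attack breaks a different sample. This is why a model trained against one norm can score well on that norm and still have low union accuracy.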

Paper Structure (28 sections, 6 theorems, 43 equations, 9 figures, 25 tables, 2 algorithms)

Key Result

Theorem 4.5

When the model dimension $m \to \infty$, for an epoch $t$, we have an approximation of the error difference $\Delta^2_{AT} - \Delta^2_{GP}$ expressed in terms of $\bar{\tau}^2 = \mathbb{E}_\pi[\tau^2] \in [0, 1]$, where $\tau(\theta)$ is the sine of the angle between $\widehat{g}_{n}$ and $g_{a} - \widehat{g}_{n}$.
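As a small illustration of the quantity $\tau$, the sketch below computes the sine of the angle between two vectors in plain Python. The vectors `g_n_hat` and `g_a` are toy stand-ins for $\widehat{g}_{n}$ and $g_{a}$, and `sine_of_angle` is a hypothetical helper name. The expectation over $\pi$ that defines $\bar{\tau}^2$ is not reproduced here.

```python
import math
from typing import Sequence


def sine_of_angle(u: Sequence[float], v: Sequence[float]) -> float:
    """Return sin of the angle between u and v (in [0, 1] for nonzero vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("angle is undefined for a zero vector")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.sqrt(1.0 - cos * cos)


# Toy stand-ins (assumed values, not gradients from the paper).
g_n_hat = [1.0, 0.0]
g_a = [1.0, 1.0]
diff = [a - n for a, n in zip(g_a, g_n_hat)]  # g_a - g_n_hat = [0.0, 1.0]

tau = sine_of_angle(g_n_hat, diff)
print(tau, tau ** 2)  # orthogonal vectors give tau = 1.0
```

Because a sine lies in $[-1, 1]$, its square lies in $[0, 1]$, which matches the stated range of $\bar{\tau}^2$: $\tau$ is $1$ when the two vectors are orthogonal and $0$ when they are parallel.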

Figures (9)

  • Figure 1: Multiple-norm tradeoff with robust fine-tuning: We observe that fine-tuning an $l_\infty$-AT model on $l_1$ examples drastically reduces $l_\infty$ robustness. RAMP preserves more $l_\infty$ and union robustness.
  • Figure 2: $l_\infty$ AT-GP with PGD [madry2017towards] with $\epsilon=0.031$ on CIFAR-10 improves accuracy and robustness. Pre-training on $\widehat{\mathcal{D}}_n$ for $50$ epochs further boosts the performance.
  • Figure 3: $l_\infty$ AT-GP with APGD [croce2020aa] improves robustness against $l_\infty$ AutoAttack [croce2020aa] with $\epsilon = \frac{8}{255}$. RN-18 $l_\infty$-GP uses AT-GP; RN-18 $l_\infty$-GP-pre pre-trains for $40$ epochs on $\widehat{\mathcal{D}}_n$ before AT-GP is applied.
  • Figure 4: Ablation studies on the $\lambda$ and $\beta$ hyper-parameters.
  • Figure 5: Values of the terms $E_{\widehat{D}_{a^t}} \|g_{a} - \widehat{g}_{a} \|^2_\pi$ (variance), $\|g_{a} - \widehat{g}_{n}\|^2_\pi$ (bias), $\bar{\tau}$, and $\Delta^2_{AT} - \Delta^2_{GP}$ (error difference).
  • ...and 4 more figures

Theorems & Definitions (15)

  • Definition 4.2: Aggregation for NT and AT
  • Definition 4.3: $L^\pi$-Norm [anonymous2024principled]
  • Definition 4.4: Delta Error of an aggregation rule Aggr$(\cdot)$
  • Theorem 4.5: Error Analysis of GP
  • Theorem A.1
  • proof
  • Theorem A.2: Convergence of Aggr$(\cdot)$
  • proof
  • Definition A.3: GP Aggregation
  • Definition A.4: AT Aggregation
  • ...and 5 more