Agnostic Multi-Robust Learning Using ERM

Saba Ahmadi; Avrim Blum; Omar Montasser; Kevin Stangl

TL;DR

This paper tackles the problem of agnostic (non-realizable) robust learning under patch-like adversarial perturbations. It shows that naive ERM on augmented data can fail when zero robust error is impossible, and provides a Feige et al.-style reduction that yields robust guarantees using only an ERM oracle, with a bound on the expected robust loss of the majority vote of T predictors. The authors extend the framework to a multi-group setting, introducing a two-layer boosting method that achieves low robust loss across multiple disjoint groups, with randomized and deterministic (majority) guarantees. They establish generalization bounds based on the VC dimension of the robust-loss class and the group class, yielding sample complexities that scale with those complexities and logarithmic factors in the perturbation budget k. Overall, the work provides a theoretical pathway to obtain robust performance via ERM-based training combined with boosting, including a novel multi-robustness objective that distributes robust performance across diverse subgroups without requiring test-time group membership.

Abstract

A fundamental problem in robust learning is asymmetry: a learner needs to correctly classify every one of the exponentially many perturbations that an adversary might make to a test-time natural example, whereas the attacker only needs to find one successful perturbation. Xiang et al. [2022] proposed an algorithm that, in the context of patch attacks for image classification, reduces the effective number of perturbations from exponential to polynomial and learns using an ERM oracle. However, to achieve its guarantee, their algorithm requires the natural examples to be robustly realizable. This prompts a natural question: can we extend their approach to the non-robustly-realizable case, where no classifier has zero robust error? Our first contribution is to answer this question affirmatively by reducing the problem to a setting in which an algorithm proposed by Feige et al. [2015] can be applied, and in the process extending their guarantees. Next, we extend our results to a multi-group setting and introduce a novel agnostic multi-robust learning problem, where the goal is to learn a predictor that achieves low robust loss on a (potentially) rich collection of subgroups.
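The asymmetry described above can be made concrete with a small sketch of the robust 0-1 loss, which charges an error whenever any perturbation of an example is misclassified. The toy data, the `shifts` perturbation set, and the `threshold` classifier are illustrative assumptions, not taken from the paper.

```python
from typing import Callable, Sequence, Tuple

Example = Tuple[int, int]  # (point, label); hypothetical toy representation


def robust_loss(h: Callable[[int], int], x: int, y: int,
                perturb: Callable[[int], Sequence[int]]) -> int:
    """Return 1 if at least one perturbation of x is misclassified, else 0."""
    return int(any(h(z) != y for z in perturb(x)))


def empirical_robust_loss(h, sample: Sequence[Example], perturb) -> float:
    """Average robust 0-1 loss of h over a labeled sample."""
    return sum(robust_loss(h, x, y, perturb) for x, y in sample) / len(sample)


def shifts(x: int) -> Sequence[int]:
    """Assumed perturbation set: shift the point by -1, 0, or +1."""
    return [x - 1, x, x + 1]


def threshold(z: int) -> int:
    """Assumed classifier: label 1 iff z >= 0."""
    return 1 if z >= 0 else 0


sample = [(-3, 0), (0, 1), (2, 1), (-1, 0)]

# Natural loss only looks at the unperturbed point.
natural = empirical_robust_loss(threshold, sample, lambda x: [x])
robust = empirical_robust_loss(threshold, sample, shifts)
print(natural, robust)
```

In this toy sample the classifier makes no errors on the natural points, yet the two points next to the decision boundary each have one perturbation that crosses it, so they incur robust loss.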

Paper Structure (28 sections, 13 theorems, 63 equations, 1 figure, 1 algorithm)

Introduction
Our Contributions
Related Work
Patch Attacks
Adversarial Learning using $\textsf{ERM}$
Multi-group Learning
Setup and Notation
Minimizing Robust Loss Using an ERM Oracle
Comparison with prior related work
Proof of thm:generalization-FMS
Multi-robustness guarantees on a set of groups
Summary of Results.
Comparison to Prior Work on Multi-group Learning
Boosting algorithm achieving multi-robustness guarantees:
Generalization Guarantees
...and 13 more sections

Key Result

Theorem 1

Set $T(\varepsilon) = \frac{32 \ln k}{\varepsilon^2}$ and $m(\varepsilon, \delta) = O\left( \frac{{\rm vc}(\mathcal{H})(\ln k)^2}{\varepsilon^4}\ln\left( \frac{\ln k}{\varepsilon^2} \right)+\frac{\ln(1/\delta)}{\varepsilon^2} \right)$. Then, for any distribution $\mathcal{D}$ over $\mathcal{X}$ […] where ${\rm MAJ}(h_1,\dots, h_{T(\varepsilon)})$ denotes the majority vote of the predictors $h_1,\dots,$ […]
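As a rough illustration of the quantities in Theorem 1, the sketch below evaluates $T(\varepsilon) = 32 \ln k / \varepsilon^2$ and takes a majority vote over a list of predictors. The function names, the rounding up to an integer, and the constant toy predictors are assumptions made for this example and do not come from the paper.

```python
import math
from collections import Counter
from typing import Callable, Hashable, Sequence


def num_rounds(k: int, eps: float) -> int:
    """Evaluate T(eps) = 32 ln(k) / eps^2, rounded up (rounding is an assumption)."""
    return math.ceil(32 * math.log(k) / eps ** 2)


def majority_vote(predictors: Sequence[Callable[[object], Hashable]], x) -> Hashable:
    """Return the label predicted by the most predictors on input x.

    Ties go to the label seen first, which is a property of Counter.most_common.
    """
    votes = Counter(h(x) for h in predictors)
    return votes.most_common(1)[0][0]


# Toy usage with hypothetical constant predictors.
predictors = [lambda x: 1, lambda x: 0, lambda x: 1]
print(num_rounds(k=10, eps=0.5))
print(majority_vote(predictors, x=None))
```

Because $T(\varepsilon)$ depends on $k$ only through $\ln k$, increasing $k$ from 10 to 100 doubles $\ln k$ and therefore roughly doubles the number of predictors in the vote.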

Figures (1)

Figure 1: $\textsf{ERM}$ failure mode in the robustly un-realizable case. Blue, red, and black points show, respectively, original examples with a positive label, original examples with a negative label, and perturbations of original examples.

Theorems & Definitions (34)

Example 1
Theorem 1
Remark 1
Lemma 2
Lemma 3 (Feige et al. [2015])
Lemma 4: VC Dimension for the Robust Loss (Attias et al. [2022])
Definition 1: Multi-Robustness
Definition 2: $\beta$-Multi-Robustness
Definition 3: Multi-Robustness on Average
Remark 2
...and 24 more
