Agnostic Multi-Robust Learning Using ERM
Saba Ahmadi, Avrim Blum, Omar Montasser, Kevin Stangl
TL;DR
This paper tackles the problem of agnostic (non-realizable) robust learning under patch-like adversarial perturbations. It shows that naive ERM on augmented data can fail when zero robust error is impossible, and provides a Feige et al.-style reduction that yields robust guarantees using only an ERM oracle, with a bound on the expected robust loss of the majority vote of T predictors. The authors extend the framework to a multi-group setting, introducing a two-layer boosting method that achieves low robust loss across multiple disjoint groups, with randomized and deterministic (majority) guarantees. They establish generalization bounds based on the VC dimension of the robust-loss class and the group class, yielding sample complexities that scale with those complexities and logarithmic factors in the perturbation budget k. Overall, the work provides a theoretical pathway to obtain robust performance via ERM-based training combined with boosting, including a novel multi-robustness objective that distributes robust performance across diverse subgroups without requiring test-time group membership.
Abstract
A fundamental problem in robust learning is asymmetry: a learner needs to correctly classify every one of the exponentially many perturbations that an adversary might make to a test-time natural example, whereas the attacker only needs to find one successful perturbation. Xiang et al. [2022] proposed an algorithm that, in the context of patch attacks for image classification, reduces the effective number of perturbations from exponential to polynomial and learns using an ERM oracle. However, to achieve its guarantee, their algorithm requires the natural examples to be robustly realizable. This prompts a natural question: can we extend their approach to the non-robustly-realizable case, where no classifier has zero robust error? Our first contribution is to answer this question affirmatively by reducing the problem to a setting in which an algorithm proposed by Feige et al. [2015] can be applied, and in the process to extend their guarantees. Next, we extend our results to a multi-group setting and introduce a novel agnostic multi-robust learning problem, where the goal is to learn a predictor that achieves low robust loss on a (potentially) rich collection of subgroups.
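The asymmetry described above can be made concrete with a toy sketch of the robust 0-1 loss, which charges the learner an error if any allowed perturbation of an example is misclassified. This is an illustration only, not the algorithm of Xiang et al. or of this paper: the one-dimensional "image", the zero-filled contiguous patch, the threshold classifier, and all function names are assumptions chosen to keep the example small.

```python
from typing import Callable, List, Tuple

Example = Tuple[int, ...]


def patch_perturbations(x: Example, patch_len: int, fill: int = 0) -> List[Example]:
    """Toy perturbation set (assumed): overwrite one contiguous window with `fill`."""
    perturbed = []
    for start in range(len(x) - patch_len + 1):
        z = list(x)
        z[start:start + patch_len] = [fill] * patch_len
        perturbed.append(tuple(z))
    return perturbed


def robust_zero_one_loss(
    h: Callable[[Example], int], y: int, perturbations: List[Example]
) -> int:
    """Return 1 if at least one perturbation is misclassified, else 0."""
    return int(any(h(z) != y for z in perturbations))


def h(z: Example) -> int:
    """Hypothetical classifier: predict 1 when at least two entries are set."""
    return int(sum(z) >= 2)


x: Example = (1, 1, 1, 0, 0)
y = 1
perts = patch_perturbations(x, patch_len=2)

natural_loss = int(h(x) != y)                   # loss on the unperturbed example
robust_loss = robust_zero_one_loss(h, y, perts)  # worst case over all patches
```

Here the classifier is correct on the natural example, yet a single patch placement that flips its prediction is enough to make the robust loss 1. The learner has to be correct on every element of `perts`, while the attacker needs only one.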
