Uniform Convergence of Adversarially Robust Classifiers

Rachel Morris; Ryan Murray

TL;DR

As the adversarial strength goes to zero, optimal classifiers are shown to converge to the Bayes classifier in the Hausdorff distance. This significantly strengthens previous results, which generally focus on $L^1$-type convergence.

Abstract

In recent years there has been significant interest in the effect of different types of adversarial perturbations in data classification problems. Many of these models incorporate the adversarial power, which is an important parameter with an associated trade-off between accuracy and robustness. This work considers a general framework for adversarially perturbed classification problems, in a large data or population-level limit. In such a regime, we demonstrate that, as the adversarial strength goes to zero, optimal classifiers converge to the Bayes classifier in the Hausdorff distance. This significantly strengthens previous results, which generally focus on $L^1$-type convergence. The main argument relies upon direct geometric comparisons and is inspired by techniques from geometric measure theory.
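To make the contrast between the two modes of convergence concrete, the following sketch recalls the standard definitions of the Hausdorff distance and of the $L^1$-type distance between sets, together with a simple example in which the latter vanishes while the former does not. These are textbook definitions rather than text from the paper; the metric space $(X,d)$, the measure $\mu$, the sets $A_n$, and the point $x_0$ are illustrative assumptions, and classifiers are assumed to be identified with subsets of $X$.

```latex
% Standard definitions (assumed notation, not quoted from the paper).
% (X, d) is a metric space, A and B are nonempty subsets of X.
\[
  d_H(A,B) \;=\; \max\Bigl\{ \sup_{a\in A}\,\inf_{b\in B} d(a,b),\;
                             \sup_{b\in B}\,\inf_{a\in A} d(a,b) \Bigr\}.
\]
% L^1-type distance between sets: the measure of the symmetric difference.
\[
  \|\mathbf{1}_A - \mathbf{1}_B\|_{L^1(\mu)} \;=\; \mu(A \,\triangle\, B).
\]
% Illustrative example in R^d with Lebesgue measure \mu:
% A = B(0,1), A_n = A \cup B(x_0, 1/n) with |x_0| = 3 and n >= 1.
\[
  \mu(A_n \triangle A) \;=\; \mu\bigl(B(x_0, 1/n)\bigr) \;\longrightarrow\; 0,
  \qquad
  d_H(A_n, A) \;=\; 2 + \tfrac{1}{n} \;\ge\; 2 .
\]
```

In this example the small far-away ball is invisible to the $L^1$-type distance in the limit but keeps the Hausdorff distance bounded away from zero, which is why Hausdorff convergence is the stronger, uniform notion.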

Paper Structure (11 sections, 29 theorems, 126 equations, 3 figures, 1 table)

Introduction
Setup
Informal Main Results and Discussion
Energy Exchange Inequality
Uniform Convergence for the Adversarial Training Problem
Uniform Convergence for Other Deterministic Attacks
Application to the Probabilistic Adversarial Training Problem
Conclusion
Appendix
The $U$ Sets for $\phi_\varepsilon$
$\Lambda$-Set Decompositions

Key Result

Theorem 1

Under the conditions of Theorems 2.1 and 2.3 from bungert2024gammaconv and assuming the source condition, any sequence of solutions to possesses a subsequence converging to a minimizer of

Figures (3)

Figure 1: This diagram illustrates the sets present in the energy exchange inequality for the adversarial training problem \ref{eqn:ATP} when $E = B_\mathrm{d}(R)$. The sets comprising $\varepsilon\operatorname{Per}_{\varepsilon}(A;B_\mathrm{d}(R))$ are shaded blue and purple, whereas the sets comprising $\varepsilon\operatorname{Per}_{\varepsilon}(B_\mathrm{d}(R)^\mathsf{c};A)$ are shaded pink and purple.
Figure 2: This diagram depicts the $U_i$ regions for the attack function $\phi_\varepsilon$ associated with the adversarial training problem \ref{eqn:ATP}. The $\varepsilon$-perimeter regions of $A$ are shaded blue and purple, whereas the $\varepsilon$-perimeter regions of $A\setminus B_\mathrm{d}(R)$ are shaded pink and purple. Note that some sets, such as $\widehat{U}_{1}$, are null sets for the $\varepsilon$-perimeter, and so do not appear in this figure.
Figure 3: A degenerate example where $U_6$ and $U_9$ are neither solely attacked nor unattacked sets. The example arises because the boundaries of $A$ and $B_{\mathrm{d}}(R)$ coincide. The pink and purple sets represent the $\varepsilon$-perimeter regions of $A$ whereas the blue and purple regions represent the $\varepsilon$-perimeter regions for $A\setminus \overline{B_\mathrm{d}(R)}$.

Theorems & Definitions (74)

Remark 1.1: Uniqueness of Bayes Classifiers
Remark 1.2: Previous work for the adversarial training problem \ref{eqn:ATP}
Theorem : Conditional convergence of adversarial training
Remark 1.4: Previous work for the probabilistic adversarial training problem \ref{PATP}
Definition 1.5
Definition 1.6
Definition 1.7
Definition 1.8
Definition 1.9
Remark 1.10
...and 64 more
