PAC-learning in the presence of evasion adversaries

Daniel Cullina, Arjun Nitin Bhagoji, Prateek Mittal

TL;DR

This work generalizes PAC-learning to settings with evasion adversaries by introducing corrupted hypotheses and the adversarial VC-dimension (AVC). It proves sample-complexity upper bounds that scale with AVC, and computes AVC explicitly for halfspace classifiers under convex, norm-bounded adversarial constraints, showing that it equals the standard VC-dimension in that setting. The paper also demonstrates that AVC can be larger or smaller than the standard VC-dimension, depending on the hypothesis class and adversary, so an adversary does not affect learnability in one fixed direction. Overall, it provides a rigorous, distribution-agnostic theoretical foundation for analyzing learning in the presence of evasion adversaries.

Abstract

The existence of evasion attacks during the test phase of machine learning algorithms represents a significant challenge to both their deployment and understanding. These attacks can be carried out by adding imperceptible perturbations to inputs to generate adversarial examples, and finding effective defenses and detectors has proven to be difficult. In this paper, we step away from the attack-defense arms race and seek to understand the limits of what can be learned in the presence of an evasion adversary. In particular, we extend the Probably Approximately Correct (PAC)-learning framework to account for the presence of an adversary. We first define corrupted hypothesis classes, which arise from standard binary hypothesis classes in the presence of an evasion adversary, and derive the Vapnik-Chervonenkis (VC)-dimension for these, denoted as the adversarial VC-dimension. We then show that sample complexity upper bounds from the Fundamental Theorem of Statistical Learning can be extended to the case of evasion adversaries, where the sample complexity is controlled by the adversarial VC-dimension. We then explicitly derive the adversarial VC-dimension for halfspace classifiers in the presence of a sample-wise norm-constrained adversary of the type commonly studied for evasion attacks and show that it is the same as the standard VC-dimension, closing an open question. Finally, we prove that the adversarial VC-dimension can be either larger or smaller than the standard VC-dimension depending on the hypothesis class and adversary, making it an interesting object of study in its own right.
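
To make the sample-complexity claim concrete, recall the standard agnostic form of the bound from the Fundamental Theorem of Statistical Learning. The symbols here are assumptions introduced for illustration and do not appear in the abstract: $\epsilon$ is the excess-risk tolerance, $\delta$ the failure probability, $d$ the VC-dimension of the hypothesis class, and $C$ an unspecified absolute constant.

$$
m(\epsilon, \delta) \;\le\; C \, \frac{d + \log(1/\delta)}{\epsilon^{2}}
$$

Read against the abstract, the extension to evasion adversaries amounts to a bound of this shape with $d$ taken to be the adversarial VC-dimension of the corrupted hypothesis class. This is a sketch of the classical form only; the exact statement and constants are those of the paper's theorem and are not reproduced here.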

Paper Structure

This paper contains 10 sections, 7 theorems, 26 equations, 2 figures, 1 table.

Key Result

Lemma 1

Let $\operatorname{A} : (\mathcal{X} \times \mathcal{C})^n \to (\mathcal{X} \to \mathcal{C})$ be a learning algorithm for a hypothesis class $\mathcal{H}$. Suppose $R_1,R_2$ are nearness relations and $R_1 \subseteq R_2$. For all $P$, For all $P$ and all $(\mathbf{x},\mathbf{c})$,

Figures (2)

  • Figure 1: Combining the family of hypotheses with the nearness relation $R$. The top figure depicts some $h \in \mathcal{H}$ and the bottom shows $\kappa_R(h) \in \widetilde{\mathcal{H}}$.
  • Figure 2: The examples $x_0 = (-1,1)$ and $x_1 = (1,-1)$ are marked with crosses. The function $h_{(0,1)} \in \mathcal{H}$ maps the smaller square to $1$ and everything else to $-1$. The degraded function $\tilde{h}_{(0,1)} \in \widetilde{\mathcal{H}}$ maps the larger square to $\bot$ and everything else to $-1$. Observe that $\tilde{h}_{(0,1)}(x_0) = \bot$ and $\tilde{h}_{(0,1)}(x_1) = -1$.
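
The map $\kappa_R$ in Figure 1 can be illustrated with a minimal Python sketch. It rests on assumptions not stated on this page: that $\kappa_R(h)(x)$ returns $h(x)$ when $h$ is constant on every point the adversary can reach from $x$ and $\bot$ otherwise, that the nearness relation is a closed $\ell_\infty$ ball of radius `eps`, and that $h$ is a halfspace with ties sent to $+1$. The names `halfspace`, `corrupted_halfspace`, and `BOTTOM` are hypothetical.

```python
from typing import Optional, Sequence

# Assumption: the paper's "bottom" output is represented here as None.
BOTTOM = None


def halfspace(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Plain halfspace classifier h(x) = sign(w.x + b), with ties sent to +1."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


def corrupted_halfspace(
    w: Sequence[float], b: float, eps: float, x: Sequence[float]
) -> Optional[int]:
    """Sketch of kappa_R(h)(x) for an l_inf adversary of radius eps.

    Over the ball {y : ||y - x||_inf <= eps}, the score w.y + b ranges over
    [score - eps * ||w||_1, score + eps * ||w||_1] (l_1 is the dual norm).
    The label is kept only if the whole interval lies on one side of zero.
    """
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    margin = eps * sum(abs(wi) for wi in w)
    if score - margin >= 0:
        return 1   # every reachable point is labeled +1
    if score + margin < 0:
        return -1  # every reachable point is labeled -1
    return BOTTOM  # the adversary can reach both labels


if __name__ == "__main__":
    w, b, eps = (1.0, 0.0), 0.0, 0.5
    for x in [(1.0, 0.0), (0.25, 0.0), (-1.0, 0.0)]:
        print(x, halfspace(w, b, x), corrupted_halfspace(w, b, eps, x))
```

With `eps = 0` the corrupted classifier coincides with the original halfspace; as `eps` grows, a band around the decision boundary is mapped to `BOTTOM`, which is the kind of region shown as $\bot$ in the figures.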

Theorems & Definitions (20)

  • Definition 1: Adversarial Expected Risk
  • Definition 2: Adversarial Empirical Risk Minimization (ERM)
  • Lemma 1
  • proof
  • Definition 3: Learnability
  • Lemma 2
  • proof
  • Lemma 3: Shalev-Shwartz and Ben-David (2014), Theorem 26.5
  • Definition 4: Equivalent shattering coefficient definitions
  • Definition 5: Adversarial VC-dimension
  • ...and 10 more