Adversarial Examples Might be Avoidable: The Role of Data Concentration in Adversarial Robustness

Ambar Pal, Jeremias Sulam, René Vidal

TL;DR

This work asks whether adversarial examples are truly unavoidable by linking robustness to the concentration of the data distribution. It proves a necessary condition: if a robust classifier exists, then at least one class-conditional distribution must be concentrated on a small-volume subset of the input space. It also introduces a stronger notion of concentration that is sufficient for robustness. For data lying near a union of low-dimensional linear subspaces, it develops a constructive certified defense that exploits this structure and yields norm-independent polyhedral certificates via a dual optimization framework. Empirically, the method complements randomized smoothing on MNIST and can certify robustness beyond standard $\ell_p$ bounds. Overall, the paper highlights the central role of data geometry in robustness and offers a practical certification approach for structured distributions.

Abstract

The susceptibility of modern machine learning classifiers to adversarial examples has motivated theoretical results suggesting that these might be unavoidable. However, these results can be too general to be applicable to natural data distributions. Indeed, humans are quite robust for tasks involving vision. This apparent conflict motivates a deeper dive into the question: Are adversarial examples truly unavoidable? In this work, we theoretically demonstrate that a key property of the data distribution -- concentration on small-volume subsets of the input space -- determines whether a robust classifier exists. We further demonstrate that, for a data distribution concentrated on a union of low-dimensional linear subspaces, utilizing structure in data naturally leads to classifiers that enjoy data-dependent polyhedral robustness guarantees, improving upon methods for provable certification in certain regimes.
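
To make "concentration on small-volume subsets" concrete, the following minimal sketch estimates how little of the input space lies near a low-dimensional linear subspace. It is an illustration only, not the paper's definition of concentration or its certification method. The reference volume (the cube $[-1, 1]^n$), the coordinate subspace, and the function names `tube_volume_fraction` and `exact_fraction` are all assumptions introduced here.

```python
import math
import numpy as np


def tube_volume_fraction(n, d, eps, num_samples=200_000, seed=0):
    """Monte Carlo estimate of the fraction of the cube [-1, 1]^n that lies
    within Euclidean distance eps of the subspace span(e_1, ..., e_d).

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=(num_samples, n))
    # The distance to a coordinate subspace is the norm of the
    # remaining n - d coordinates.
    dist = np.linalg.norm(x[:, d:], axis=1)
    return float(np.mean(dist <= eps))


def exact_fraction(n, d, eps):
    """Closed form, valid for eps <= 1: volume of an (n - d)-dimensional
    ball of radius eps divided by the volume 2^(n - d) of the cube
    [-1, 1]^(n - d)."""
    m = n - d
    ball_volume = math.pi ** (m / 2) / math.gamma(m / 2 + 1) * eps ** m
    return ball_volume / 2.0 ** m


if __name__ == "__main__":
    # Keep the subspace dimension fixed and grow the ambient dimension.
    for n in (3, 5, 9, 17):
        estimate = tube_volume_fraction(n, d=2, eps=0.5)
        exact = exact_fraction(n, d=2, eps=0.5)
        print(f"n={n:2d}  estimate={estimate:.6f}  exact={exact:.6f}")
```

With the subspace dimension held fixed, the fraction of the cube within distance `eps` of the subspace shrinks rapidly as the ambient dimension grows. A distribution supported near such a subspace therefore places almost all of its mass on a set of very small volume, which is the regime the abstract describes.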

Paper Structure (30 sections, 8 theorems, 68 equations, 13 figures)

Key Result

Theorem 2.1

If there exists an $(\epsilon, \delta)$-robust classifier $f$ for a data distribution $p$, then at least one of the class conditionals $q_1, q_2, \ldots, q_K$, say $q_{\bar{k}}$, must be $(\bar{C}, \epsilon, \delta)$-concentrated. Further, if the classes are balanced, then all the class conditional ...

Figures (13)

  • Figure 1: A plot of $q_1$. Redder colors denote a larger density, and the gray plane denotes the robust classifier.
  • Figure 2: A plot of $q_1$ (orange), $q_2$ (violet) and the decision boundaries of $f$ (dashed).
  • Figure 3: Geometry of the dual problem (equation label: eq:dualnoise). See description on the left.
  • Figure 4: Comparing polyhedral and spherical certificates. Details in text.
  • Figure 5: Comparing randomized smoothing (RS) with our method on adversarial perturbations computed by repeating Steps I and II (reference label: pgdmod).
  • ...and 8 more figures

Theorems & Definitions (19)

  • Definition 2.1: Robust Classifier
  • Definition 2.2: Concentrated Distribution
  • Theorem 2.1
  • Proof: Proof Sketch
  • Definition 3.1: Strongly Concentrated Distributions
  • Theorem 3.1
  • Example 3.1
  • Example 4.1
  • Theorem 4.1
  • Lemma 4.2
  • ...and 9 more