Provable tradeoffs in adversarially robust classification

Edgar Dobriban, Hamed Hassani, David Hong, Alexander Robey

TL;DR

This paper derives exact and approximate Bayes-optimal robust classifiers for the important setting of two- and three-class Gaussian classification problems with arbitrary imbalance, and reveals fundamental tradeoffs between standard and robust accuracy that grow when data is imbalanced.

Abstract

It is well known that machine learning methods can be vulnerable to adversarially chosen perturbations of their inputs. Despite significant progress in the area, foundational open problems remain. In this paper, we address several key questions. We derive exact and approximate Bayes-optimal robust classifiers for the important setting of two- and three-class Gaussian classification problems with arbitrary imbalance, for $\ell_2$ and $\ell_\infty$ adversaries. In contrast to classical Bayes-optimal classification, the optimal decisions here cannot be made pointwise, and new theoretical approaches are needed. We develop and leverage new tools, including recent breakthroughs from probability theory on robust isoperimetry, which, to our knowledge, have not yet been used in the area. Our results reveal fundamental tradeoffs between standard and robust accuracy that grow when data is imbalanced. We also show further results, including an analysis of classification calibration for convex losses in certain models, and finite-sample rates for the robust risk.
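To make the notion of robust risk under $\ell_2$ and $\ell_\infty$ adversaries concrete, here is a minimal sketch for a linear classifier $x \mapsto \mathrm{sign}(w^\top x - c)$. It assumes a specific form of the two-class Gaussian model that is not spelled out on this page: $x \mid y \sim \mathcal{N}(y\mu, I)$ with $y \in \{\pm 1\}$ and $\Pr(y = 1) = \pi$. The function names are hypothetical. The only fact used is that the worst-case shift of $w^\top x$ over a perturbation of size $\varepsilon$ equals $\varepsilon$ times the dual norm of $w$ ($\|w\|_2$ for an $\ell_2$ adversary, $\|w\|_1$ for an $\ell_\infty$ adversary).

```python
import math


def norm_cdf(z):
    """Standard normal CDF, written with math.erf to avoid extra dependencies."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def linear_robust_risk(w, c, mu, pi, eps, adversary="l2"):
    """Robust 0-1 risk of x -> sign(w.x - c).

    Assumed model (an illustration, not taken from this page):
    x | y ~ N(y * mu, I), y in {+1, -1}, P(y = +1) = pi.
    """
    w_dot_mu = sum(wi * mi for wi, mi in zip(w, mu))
    w_l2 = math.sqrt(sum(wi * wi for wi in w))
    # Dual norm of w: l2 for an l2 adversary, l1 for an l-infinity adversary.
    dual = w_l2 if adversary == "l2" else sum(abs(wi) for wi in w)
    margin = eps * dual
    # y = +1 is robustly misclassified when w.x - c <= margin, w.x ~ N(w.mu, |w|^2).
    err_pos = norm_cdf((c + margin - w_dot_mu) / w_l2)
    # y = -1 is robustly misclassified when w.x - c >= -margin, w.x ~ N(-w.mu, |w|^2).
    err_neg = norm_cdf((margin - c - w_dot_mu) / w_l2)
    return pi * err_pos + (1.0 - pi) * err_neg


# Settings borrowed from the Figure 4 and 5 captions: p = 5, mu = 1/2 * ones, pi = 1/2.
mu = [0.5] * 5
for eps in (0.0, 0.2, 0.4):
    r2 = linear_robust_risk(mu, 0.0, mu, 0.5, eps, "l2")
    rinf = linear_robust_risk(mu, 0.0, mu, 0.5, eps, "linf")
    print(f"eps={eps:.1f}  l2 risk={r2:.4f}  linf risk={rinf:.4f}")
```

With $\varepsilon = 0$ both values reduce to the standard risk. For the same $\varepsilon$, the $\ell_\infty$ risk is never smaller than the $\ell_2$ risk, because $\|w\|_1 \ge \|w\|_2$.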

Paper Structure

This paper contains 39 sections, 18 theorems, 181 equations, 5 figures.

Key Result

Theorem 4.1

Suppose the data $(x,y)$ follow the two-class Gaussian model eq:model:twoclass and $\varepsilon < \|\mu\|_2$. An optimal $\ell_2$ robust classifier is where $q = \ln\{(1-\pi)/\pi\}$ and $(x)_+ = \max(x,0)$. Moreover, the corresponding optimal robust risk is where $R_{\mathrm{Bay}}$ is the Bayes risk defined in eq:twoclass:bayes.

Figures (5)

  • Figure 1: Illustration of differences between the standard and robust risk. The Bayes classifier $\hat{y}^*_{\mathrm{Bay}}$ minimizes the standard risk by maximizing $\Pr(y=c) \cdot p_{x|y=c}(x)$ for each $x$ pointwise, so it assigns a nontrivial interval around $x=0$ to the zero class. However, it has worse robust risk than an alternative $\hat{y}$ that drops the zero class. Minimizing the robust risk does not reduce to making optimal pointwise decisions.
  • Figure 2: Tradeoffs between optimal classification with respect to standard and robust risks.
  • Figure 3: Optimal linear interval $\ell_2$-robust classifiers for three classes; panels (a) and (b) show the thresholds (eq:robopt:threeclass:1:thresh, eq:robopt:threeclass:2:thresh), with the optimal one circled, where $\mu = 1$, $\lambda_\pm = \pm 1$, $\gamma = 1.2$, $\varepsilon = 0.4$ and $\pi_\pm = (1-\pi_0)\{\gamma^{\pm 1}/(\gamma+\gamma^{-1})\}$.
  • Figure 4: Mean gap between robust and standard risks of optimal finite-sample $\ell_\infty$ robust classifiers obtained via empirical robust risk minimization. Here we set the dimension $p = 5$, mean vector $\mu = 1/2\cdot \mathbbm{1}$, and class proportion $\pi = 1/2$.
  • Figure 5: Trade-off between (population) standard and robust risk for $\varepsilon\in[0,1]$ for classifiers obtained via Prop. prop:empirical:risk:opt:linloss. Here we set $p = 5$, $\mu = 1/2\cdot \mathbbm{1}$, $\pi = 1/2$.
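
The point of the Figure 1 caption, that pointwise-optimal decisions need not minimize the robust risk, can be checked numerically. The sketch below assumes a one-dimensional three-class model that is not defined on this page: $x \mid y \sim \mathcal{N}(y\mu, 1)$ with $y \in \{-1, 0, 1\}$. It uses equal priors, $\mu = 1$ and $\varepsilon = 0.4$ as illustrative values, and the function names are hypothetical. It compares the Bayes interval classifier with one that never predicts the zero class.

```python
import math


def norm_cdf(z):
    """Standard normal CDF via math.erf."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def interval_risk(a, b, mu, priors, eps=0.0):
    """Risk of the rule: predict -1 if x < a, 0 if a <= x <= b, +1 if x > b.

    Assumed model (illustrative): x | y ~ N(y * mu, 1), y in {-1, 0, +1}.
    eps = 0 gives the standard risk; eps > 0 gives the robust risk, where a
    point counts as correct only if every x' with |x' - x| <= eps is correct.
    """
    pi_neg, pi_zero, pi_pos = priors
    # Class -1 is robustly correct iff x + eps < a.
    ok_neg = norm_cdf(a - eps + mu)
    # Class 0 is robustly correct iff a + eps <= x <= b - eps.
    lo, hi = a + eps, b - eps
    ok_zero = norm_cdf(hi) - norm_cdf(lo) if hi > lo else 0.0
    # Class +1 is robustly correct iff x - eps > b.
    ok_pos = 1.0 - norm_cdf(b + eps - mu)
    return (pi_neg * (1.0 - ok_neg)
            + pi_zero * (1.0 - ok_zero)
            + pi_pos * (1.0 - ok_pos))


mu, eps = 1.0, 0.4
priors = (1 / 3, 1 / 3, 1 / 3)

# Bayes thresholds: where the prior-weighted densities of adjacent classes cross.
b_bayes = mu / 2 + math.log(priors[1] / priors[2]) / mu
a_bayes = -mu / 2 - math.log(priors[1] / priors[0]) / mu

bayes_std = interval_risk(a_bayes, b_bayes, mu, priors)
bayes_rob = interval_risk(a_bayes, b_bayes, mu, priors, eps)
# "Drop the zero class": collapse the middle interval to the single point 0.
drop_std = interval_risk(0.0, 0.0, mu, priors)
drop_rob = interval_risk(0.0, 0.0, mu, priors, eps)

print(f"Bayes rule:      standard={bayes_std:.4f}  robust={bayes_rob:.4f}")
print(f"Drop zero class: standard={drop_std:.4f}  robust={drop_rob:.4f}")
```

Under these assumed values the Bayes rule has the lower standard risk, while the rule that drops the zero class has the lower robust risk. The Bayes rule's zero-class interval is only slightly wider than $2\varepsilon$, so very little of it survives the perturbation.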

Theorems & Definitions (33)

  • Theorem 4.1: Optimal $\ell_2$-robust two-class classifiers
  • Lemma 4.2: Existence of optimal linear classifiers
  • Proof: Proof of thm:opt:twoclass:admissible
  • Theorem 4.3: Approximately optimal robust classifiers
  • Proof: Proof of thm:approx:opt:l2:twoclass
  • Theorem 5.1: Optimal linear interval $\ell_2$-robust three-class classifiers
  • Conjecture 5.2: Linear interval classifiers are optimal across all classifiers
  • Definition 5.3: Ignore/separate classifiers
  • Theorem 5.4: Linear interval classifiers are optimal across ignore/separate classifiers
  • Lemma 5.5: Linear interval classifiers dominate ignore/separate classifiers
  • ...and 23 more