Golyadkin's Torment: Doppelgängers and Adversarial Vulnerability

George I. Kamberov

TL;DR

Adversarial Doppelgängers (AD) are inputs that are close to each other with respect to a perceptual metric defined in this paper. The paper also defines hypersensitive classifiers, that is, classifiers whose only mistakes are adversarial Doppelgängers.

Abstract

Many machine learning (ML) classifiers are claimed to outperform humans, but they still make mistakes that humans do not. The most notorious examples of such mistakes are adversarial visual metamers. This paper aims to define and investigate the phenomenon of adversarial Doppelgängers (AD), which includes adversarial visual metamers, and to compare the performance and robustness of ML classifiers to human performance. We find that AD are inputs that are close to each other with respect to a perceptual metric defined in this paper. AD are qualitatively different from the usual adversarial examples. The vast majority of classifiers are vulnerable to AD, and robustness-accuracy trade-offs may not improve them. Some classification problems may not admit any AD-robust classifiers because the underlying classes are ambiguous. We provide criteria that can be used to determine whether a classification problem is well defined; describe the structure and attributes of an AD-robust classifier; and introduce and explore the notions of conceptual entropy and regions of conceptual ambiguity for classifiers that are vulnerable to AD attacks, along with methods to bound the AD fooling rate of an attack. We define the notion of classifiers that exhibit hypersensitive behavior, that is, classifiers whose only mistakes are adversarial Doppelgängers. Improving the AD robustness of hypersensitive classifiers is equivalent to improving accuracy. We identify conditions guaranteeing that all classifiers with sufficiently high accuracy are hypersensitive. Our findings are aimed at significant improvements in the reliability and security of machine learning systems.

Paper Structure

This paper contains 28 sections, 4 theorems, 63 equations, 6 figures.

Key Result

Theorem 1

The theorem is attributed to Kalmar and Yakubovich and is proven for reflexive and symmetric binary relations on finite sets in schreider1975equality. For a very simple proof of the general case that does not rely on transfinite induction, see Subsection subs:schreider in the Appendix. Every symme

Figures (6)

  • Figure 1: Most people cannot discriminate image (a) from image (b). MobileNetV2 classifies the latter image as "persian" and the former as "tabby".
  • Figure 2: Applying a Fast Gradient Sign Method (FGSM) perturbation to image (a), classified by MobileNetV2 as Labrador, yields image (b), which MobileNetV2 classifies as Weimaraner.
  • Figure 3: Golyadkin's torment: "from the viewpoint" of an input/stimulus $x\in \mathbf{X}$, the space $\mathbf{X}$ is stratified into concentric spheres. The nearest neighbors of $x$ are precisely its Doppelgängers, some or all of which may be adversarial.
  • Figure 4: Linking $x\stackrel{\textgreek{ad}}{\approx} x_1\stackrel{\textgreek{ad}}{\approx} x_2\stackrel{\textgreek{ad}}{\approx}\cdots \stackrel{\textgreek{ad}}{\approx} x_*$ with $x_{*}\stackrel{\textgreek{ad}}{\approx} y$ to get the chain of Doppelgängers $x\stackrel{\textgreek{ad}}{\approx} x_1\stackrel{\textgreek{ad}}{\approx} x_2\stackrel{\textgreek{ad}}{\approx}\cdots \stackrel{\textgreek{ad}}{\approx} x_*\stackrel{\textgreek{ad}}{\approx} y$ from $x$ to $y$.
  • Figure 5: $X=\mathbb{R}$, $\mathop{\mathfrak{d}}(x)$ defined in (\ref{eq:k line}), $\Omega_1 = (-\infty, 0)$ and $\Omega_2 = [0, +\infty)$, the classifier $R(\varepsilon)$ defined in (\ref{eq:sugit binary}).
  • ...and 1 more figure
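
The chain construction in the Figure 4 caption and the one-dimensional setting of Figure 5 can be illustrated with a small sketch. It is not the paper's construction: the relation `indiscriminable` (a fixed threshold `delta` on the real line), the helper `doppelganger_chain`, and the sample points are all hypothetical, and the sketch assumes that an adversarial Doppelgänger pair means two indiscriminable inputs that receive different labels. Only the classes $\Omega_1 = (-\infty, 0)$ and $\Omega_2 = [0, +\infty)$ are taken from the Figure 5 caption.

```python
from collections import deque


def indiscriminable(a: float, b: float, delta: float = 1.0) -> bool:
    """Hypothetical perceptual relation: a and b look alike if |a - b| < delta.

    The relation is reflexive and symmetric but not transitive.
    """
    return abs(a - b) < delta


def label(x: float) -> int:
    """Class index for Omega_1 = (-inf, 0) and Omega_2 = [0, +inf)."""
    return 1 if x < 0 else 2


def doppelganger_chain(x, y, points, related):
    """Breadth-first search for a chain x ~ x_1 ~ ... ~ y through `points`."""
    parent = {x: None}
    queue = deque([x])
    while queue:
        u = queue.popleft()
        if u == y:
            chain = []
            while u is not None:
                chain.append(u)
                u = parent[u]
            return chain[::-1]
        for v in points:
            if v not in parent and related(u, v):
                parent[v] = u
                queue.append(v)
    return None  # x and y are not linked by any chain


# Illustrative sample of the real line (assumed, not from the paper).
points = [-1.5, -0.75, 0.0, 0.75, 1.5]
chain = doppelganger_chain(-1.5, 1.5, points, indiscriminable)

# Consecutive chain elements that are indiscriminable yet labeled differently.
adversarial_pairs = [
    (a, b) for a, b in zip(chain, chain[1:]) if label(a) != label(b)
]

print(chain)
print(adversarial_pairs)
```

In this toy setting the endpoints are easy to tell apart, yet every consecutive pair in the chain is indiscriminable. Because the endpoints carry different labels, at least one consecutive pair along the chain must be labeled differently.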

Theorems & Definitions (18)

  • Definition 1: williamson1990identity
  • Definition 2
  • Definition 3
  • Definition 4
  • Definition 5
  • Definition 6
  • Theorem 1
  • Definition 7
  • Definition 8
  • Definition 9
  • ...and 8 more