Probably Approximately Global Robustness Certification

Peter Blohm, Patrick Indri, Thomas Gärtner, Sagar Malhotra

TL;DR

Experiments empirically confirm that the proposed probabilistic guarantees for the adversarial robustness of classification algorithms characterize robustness better than state-of-the-art sampling-based approaches and scale better than formal methods.

Abstract

We propose and investigate probabilistic guarantees for the adversarial robustness of classification algorithms. While traditional formal verification approaches for robustness are intractable and sampling-based approaches do not provide formal guarantees, our approach is able to efficiently certify a probabilistic relaxation of robustness. The key idea is to sample an $\epsilon$-net and invoke a local robustness oracle on the sample. Remarkably, the size of the sample needed to achieve probably approximately global robustness guarantees is independent of the input dimensionality, the number of classes, and the learning algorithm itself. Our approach can, therefore, be applied even to large neural networks that are beyond the scope of traditional formal verification. Experiments empirically confirm that it characterizes robustness better than state-of-the-art sampling-based approaches and scales better than formal methods.
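The key idea of sampling points and querying a local robustness oracle on each of them can be illustrated with a minimal sketch. It is a toy illustration under stated assumptions, not the authors' implementation. The classifier is a hypothetical binary linear model, chosen because its exact $L_2$ robustness radius is simply the distance to the decision boundary. The names `local_oracle` and `find_counterexamples` are made up for this example. The sketch also assumes that a counterexample to a pair $(\rho, \kappa)$ is a point predicted with confidence at least $\kappa$ whose robustness radius is below $\rho$. The sample size used here is arbitrary and is not the size required by the paper's bound.

```python
import math
import random


def local_oracle(w, b, x):
    """Toy local robustness oracle for a binary linear classifier (assumption).

    Returns (radius, confidence): the exact L2 distance from x to the
    decision boundary, and the sigmoid confidence of the predicted class.
    """
    margin = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm = math.sqrt(sum(wi * wi for wi in w))
    radius = abs(margin) / norm
    confidence = 1.0 / (1.0 + math.exp(-abs(margin)))
    return radius, confidence


def find_counterexamples(sample, w, b, rho, kappa):
    """Return sampled points with confidence >= kappa and radius < rho."""
    bad = []
    for x in sample:
        radius, confidence = local_oracle(w, b, x)
        if confidence >= kappa and radius < rho:
            bad.append(x)
    return bad


# Hypothetical model and iid sample from a stand-in data distribution.
rng = random.Random(0)
w, b = [3.0, 4.0], 0.0
sample = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(1000)]

# No sampled point contradicts (rho, kappa) = (0.4, 0.9) for this model.
print(len(find_counterexamples(sample, w, b, rho=0.4, kappa=0.9)))
```

For a real neural network the oracle would be a verifier or an attack-based radius estimate instead of a closed-form distance. An empty counterexample list supports a probabilistic guarantee only when the sample is as large as the paper's $\epsilon$-net bound requires.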

Paper Structure

This paper contains 34 sections, 15 theorems, 65 equations, 6 figures, 2 tables, 1 algorithm.

Key Result

Theorem 3.6

Let $(\mathcal{Y},\mathcal{R})$ be a range space with VC dimension $d$ and let $\mathcal{D}_\mathcal{Y}$ be a probability distribution on $\mathcal{Y}$. For any $0< \delta,\epsilon\leq \frac{1}{2}$, an iid sample from $\mathcal{D}_\mathcal{Y}$ of size $s$ is an $\epsilon$-net for $\mathcal{Y}$ with

Figures (6)

  • Figure 1: Visualization of the quality space $\mathcal{Q}$ and of the region of counterexamples defined in \ref{eq:counterexample_region} for two possible pairs of robustness-confidence values $(\rho, \kappa)$ and $(\tilde{\rho}, \tilde{\kappa})$.
  • Figure 2: Visualization of the robustness lower-bound map $M(\kappa)$. Each rectangular region under the curve has a probability mass smaller than $\epsilon$. The yellow line represents the maximum confidence value above which $M(\kappa)$ is undefined.
  • Figure 3: Scatter plot of the CIFAR-10 test dataset $D_\text{test}$, with $|D_\text{test}|=10000$, in the quality space $\mathcal{Q}$ with VGG11_BN. The left network is trained with standard methods, the right network is trained robustly with TRADES. The red line depicts the lower bound obtained from validation sample $N$, with the parameters $\epsilon=10^{-4}$, $\delta=p_{\min}=0.01$, with $|N|=s(\epsilon,\delta/2,2)=989534$. On the right-hand side, $5$ data points violate the lower bound. Note that, despite this apparent violation, $M$ tightly fits the test data and illustrates the contrasting robustness behaviors of the networks.
  • Figure 4: Scatter plot of the MNIST test dataset $D_\text{test}$, with $|D_\text{test}|=10000$, in the quality space $\mathcal{Q}$ for our feed-forward network. The left network is trained with standard methods, the right network is trained robustly with TRADES. Results for parameters $\epsilon=10^{-4}$, $\delta=p_{\min}=0.01$, with $|N|=s(\epsilon,\delta/2,2)=989534$.
  • Figure 5: Scatter plot of the MNIST test dataset $D_\text{test}$, with $|D_\text{test}|=10000$, in the quality space $\mathcal{Q}$ for our feed-forward network. The left network is trained with standard methods, the right network is trained robustly with TRADES. Results for parameters $\epsilon=2.5\cdot 10^{-3}$, $\delta=p_{\min}=0.01$, with $|N|=s(\epsilon,\delta/2,2)=31635$.
  • ...and 1 more figure

Theorems & Definitions (36)

  • Definition 3.1: Local $\rho$-robustness
  • Definition 3.2: Global $\rho$-$\kappa$-robustness
  • Definition 3.3: Range space
  • Definition 3.4: VC Dimension, Vapnik2015
  • Definition 3.5: $\epsilon$-net, haussler1986epsilon
  • Theorem 3.6: $\epsilon$-nets from iid samples mitzenmacher2017probability
  • Proposition 3.6: $\epsilon$-nets from iid samples with constants
  • proof
  • Definition 4.1
  • Proposition 4.2
  • ...and 26 more