Integrating uncertainty quantification into randomized smoothing based robustness guarantees

Sina Däubener, Kira Maag, David Krueger, Asja Fischer

TL;DR

The proposed framework allows for a systematic robustness evaluation of different network architectures and uncertainty measures and helps identify desired properties of uncertainty quantification techniques. It is also shown that leveraging uncertainty in a smoothed classifier helps out-of-distribution detection.

Abstract

Deep neural networks have proven to be extremely powerful; however, they are also vulnerable to adversarial attacks, which can cause hazardous incorrect predictions in safety-critical applications. Certified robustness via randomized smoothing gives a probabilistic guarantee that the smoothed classifier's predictions will not change within an $\ell_2$-ball around a given input. On the other hand, (uncertainty) score-based rejection is a technique often applied in practice to defend models against adversarial attacks. In this work, we fuse these two approaches by integrating a classifier that abstains from predicting when uncertainty is high into the certified robustness framework. This allows us to derive two novel robustness guarantees for uncertainty-aware classifiers, namely (i) the radius of an $\ell_2$-ball around the input in which the same label is predicted and uncertainty remains low, and (ii) the $\ell_2$-radius of a ball in which the predictions will either not change or be uncertain. While the former provides robustness guarantees with respect to attacks aiming at increased uncertainty, the latter indicates the amount of input perturbation necessary to lead the uncertainty-aware model into a wrong prediction. Notably, on CIFAR10 this amount is up to 20.93% larger than for models that do not allow uncertainty-based rejection. We demonstrate that the novel framework allows for a systematic robustness evaluation of different network architectures and uncertainty measures and helps identify desired properties of uncertainty quantification techniques. Moreover, we show that leveraging uncertainty in a smoothed classifier helps out-of-distribution detection.
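To make the idea of an abstaining classifier inside randomized smoothing concrete, the following is a minimal sketch, not the authors' implementation. It assumes the uncertainty is measured as one minus the maximum class probability, and the names `ABSTAIN`, `uncertainty_equipped`, `smoothed_counts`, `toy_probs`, and the threshold `theta` are hypothetical choices made for illustration only.

```python
import math
import random
from collections import Counter

ABSTAIN = -1  # hypothetical extra label meaning "uncertain, no prediction"


def uncertainty_equipped(probs, theta):
    """Return the argmax class, or ABSTAIN if the uncertainty exceeds theta.

    Assumption: uncertainty = 1 - max probability. Any other uncertainty
    function could be substituted here.
    """
    top = max(range(len(probs)), key=probs.__getitem__)
    uncertainty = 1.0 - probs[top]
    return top if uncertainty <= theta else ABSTAIN


def smoothed_counts(base_probs, x, sigma, theta, n, rng):
    """Count labels (including ABSTAIN) over n Gaussian-perturbed copies of x."""
    counts = Counter()
    for _ in range(n):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        counts[uncertainty_equipped(base_probs(noisy), theta)] += 1
    return counts


def toy_probs(x):
    """Hypothetical two-class base model: logistic function of the first feature."""
    s = 1.0 / (1.0 + math.exp(-4.0 * x[0]))
    return [1.0 - s, s]


if __name__ == "__main__":
    rng = random.Random(0)
    # Label frequencies under noise; the most frequent label is the smoothed prediction.
    print(smoothed_counts(toy_probs, [1.0], sigma=0.25, theta=0.2, n=1000, rng=rng))
```

Treating abstention as one more label keeps the Monte Carlo counting unchanged, which is why the sketch reuses a plain label counter. How the counts are turned into certified radii is not shown here.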

Paper Structure

This paper contains 29 sections, 4 theorems, 11 equations, 11 figures, 6 tables, 2 algorithms.

Key Result

Theorem 1

Let $\mathcal{Y}$ be a discrete and finite label space and $f:\mathbb{R}^d \rightarrow \mathcal{Y}$ be any deterministic or stochastic function. Let $\epsilon \sim \mathcal{N}(0, \sigma^2 I)$ and $g(x) = \arg \max_{c \in \mathcal{Y}} \mathbb{P}(f(x+ \epsilon) = c)$. Suppose $c_A \in \mathcal{Y}$ an Then $g(x+ \delta) = c_A \ \forall \ \delta \text{ with } \| \delta\|_2 < R$, where $R = \frac{\si
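As a rough illustration of how a certified radius of this kind is computed, the sketch below assumes the standard form from cohen_adv_robustness, $R = \frac{\sigma}{2}\left(\Phi^{-1}(\underline{p_A}) - \Phi^{-1}(\overline{p_B})\right)$, where $\underline{p_A}$ is a lower bound on the top-class probability and $\overline{p_B}$ an upper bound on the runner-up probability. The statement above is cut off, so this form is an assumption taken from that reference, and the function name `certified_radius` is hypothetical.

```python
from statistics import NormalDist


def certified_radius(p_a_lower, p_b_upper, sigma):
    """Certified l2 radius, assuming R = sigma/2 * (Phi^-1(p_A) - Phi^-1(p_B)).

    p_a_lower: lower bound on the probability of the top class c_A
    p_b_upper: upper bound on the probability of any other class
    sigma:     standard deviation of the Gaussian smoothing noise
    """
    if not (0.0 < p_b_upper <= p_a_lower < 1.0):
        raise ValueError("need 0 < p_b_upper <= p_a_lower < 1")
    phi_inv = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return 0.5 * sigma * (phi_inv(p_a_lower) - phi_inv(p_b_upper))


if __name__ == "__main__":
    print(certified_radius(0.9, 0.1, sigma=0.5))
```

The radius grows when the gap between the two bounds widens or when `sigma` increases, and it collapses to zero when the bounds coincide.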

Figures (11)

  • Figure 1: Decision regions of a classifier $f$ depicted in different colors, where the black dot marks the position of an input $x$ and the circles mark the different probability levels of a Gaussian distribution centered at $x$: (a) is recreated from cohen_adv_robustness, where the class with the most probability mass under Gaussian noise is predicted (here the blue class). (b) and (c) show different uncertainty distributions around the original decision boundaries (dotted), which lead to different uncertainty regions depicted in gray. While the uncertainty behavior around decision boundaries in (b) is almost symmetric, image (c) shows strongly asymmetric uncertainty behavior, making a confident misclassification difficult.
  • Figure 2: An illustration of why $R_{\text{CC}, \theta}$ can provide a stronger guarantee under appropriate conditions. The plot shows $\frac{1}{2}\cdot \Phi^{-1}(p)$ as a function of $p$. ($\sqbullet$) denotes values belonging to $p_{c_{A, sup}}$ and $p_{c_{B, sup}}$, whereas ($\bigstar$) represents values w.r.t. $p_{c_{A, \theta}}$ and $\max_{c \neq c_{A, \theta}} p_{c}$. The larger decrease of the runner-up class probability with uncertainty results in a larger $y$-axis difference, which here equals the certified robustness radii.
  • Figure 3: Number of distinct class labels assigned to noisy versions of the same input for CIFAR10. The one-vs-all method of cohen_adv_robustness is suboptimal when examples have multiple neighboring classes; we observe this is typically the case when using $n_0=1,\!000$.
  • Figure 4: Our method only resorts to one-vs-all when we cannot confidently identify a runner-up class. Functions used are identical to the ones from cohen_adv_robustness.
  • Figure 5: Certified accuracy resulting from the changed certification procedure used for estimating the required bounds. Models were trained on CIFAR10. $R_{orig}$ corresponds to the one-vs-all method of cohen_adv_robustness and $R$ to our proposed scheme, which calculates $\overline{p_B}$. A slight increase can be observed for all methods.
  • ...and 6 more figures

Theorems & Definitions (10)

  • Theorem 1: cohen_adv_robustness
  • Definition 1: Consistent predictions
  • Definition 2: Uncertainty function
  • Definition 3: Confident prediction
  • Definition 4: Uncertainty-equipped classifier
  • Corollary 1.1: Certified robustness with uncertainty
  • proof
  • Corollary 1.2
  • Proposition 2
  • proof