The Lipschitz-Variance-Margin Tradeoff for Enhanced Randomized Smoothing

Blaise Delattre, Alexandre Araujo, Quentin Barthélemy, Alexandre Allauzen

TL;DR

This work emphasizes the dual impact of the Lipschitz constant of the base classifier, on both the smoothed classifier and the empirical variance, and introduces a different way to convert logits to probability vectors for the base classifier to leverage the variance-margin trade-off.

Abstract

Real-life applications of deep neural networks are hindered by their unsteady predictions when faced with noisy inputs and adversarial attacks. In this context, the certified radius is a crucial indicator of model robustness. However, how can one design an efficient classifier with an associated certified radius? Randomized smoothing provides a promising framework: it injects noise into the inputs to obtain a smoothed and robust classifier. In this paper, we first show that the variance introduced by the Monte-Carlo sampling in the randomized smoothing procedure closely interacts with two other important properties of the classifier, i.e., its Lipschitz constant and margin. More precisely, our work emphasizes the dual impact of the Lipschitz constant of the base classifier on both the smoothed classifier and the empirical variance. To increase the certified robust radius, we introduce a different way to convert logits to probability vectors for the base classifier to leverage the variance-margin trade-off. We use Bernstein's concentration inequality along with enhanced Lipschitz bounds for randomized smoothing. Experimental results show a significant improvement in certified accuracy compared to current state-of-the-art methods. Our novel certification procedure allows us to use pre-trained models with randomized smoothing, effectively improving the current certification radius in a zero-shot manner.
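To make the interaction between Monte-Carlo variance and the Lipschitz constant concrete, here is a minimal sketch, not the authors' procedure. It assumes a hypothetical one-dimensional base score `base_score` with values in [0, 1] and a known Lipschitz constant, estimates the smoothed score by Gaussian noise injection, and compares the empirical variance with the quantity $\sigma^2 L^2$, which bounds the true variance by the Gaussian Poincaré inequality. All names and numeric values are illustrative assumptions.

```python
import random

def base_score(x, lipschitz):
    """Hypothetical 1-D base score in [0, 1]; it is `lipschitz`-Lipschitz in x."""
    return min(1.0, max(0.0, 0.5 + lipschitz * x))

def smoothed_estimate(x, sigma, lipschitz, n, rng):
    """Monte-Carlo mean and unbiased sample variance of base_score(x + sigma * delta)."""
    samples = [base_score(x + sigma * rng.gauss(0.0, 1.0), lipschitz) for _ in range(n)]
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    return mean, var

rng = random.Random(0)  # fixed seed so the run is reproducible
sigma, n = 0.25, 10_000
for lipschitz in (0.5, 1.0, 4.0):
    mean, var = smoothed_estimate(0.1, sigma, lipschitz, n, rng)
    poincare_bound = (sigma * lipschitz) ** 2  # bound on the true variance
    print(f"L={lipschitz}: mean={mean:.3f} var={var:.5f} sigma^2*L^2={poincare_bound:.5f}")
```

A smaller Lipschitz constant gives a smaller variance, so fewer samples are wasted on estimation noise, but it also flattens the score, which is the tension the abstract refers to.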

Paper Structure

This paper contains 28 sections, 15 theorems, 64 equations, 6 figures, 11 tables, 2 algorithms.

Key Result

Proposition 1

Let $f$ be a Lipschitz continuous subclassifier for the $\ell_2$-norm, let $\varepsilon > 0$ be a perturbation level, and let $x \in \mathcal{X}$ with label $y \in \mathcal{Y}$. If the margin $M(f(x), y)$ at input $x$ meets the condition $M(f(x), y) > \sqrt{2} L(f) \varepsilon$, then for eve
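The margin condition in Proposition 1 can be evaluated directly. The sketch below is illustrative only: it assumes the usual margin definition (score of the true class minus the best competing score, clipped at zero), which is not spelled out in this excerpt, and it uses hypothetical scores and a hypothetical Lipschitz constant. It reports the supremum of perturbation levels $\varepsilon$ for which $M(f(x), y) > \sqrt{2} L(f) \varepsilon$ holds.

```python
import math

def margin(scores, y):
    """Assumed margin: score of class y minus the best other score, clipped at 0."""
    best_other = max(s for k, s in enumerate(scores) if k != y)
    return max(0.0, scores[y] - best_other)

def satisfies_condition(scores, y, lipschitz, eps):
    """Check the stated condition M(f(x), y) > sqrt(2) * L(f) * eps."""
    return margin(scores, y) > math.sqrt(2.0) * lipschitz * eps

def margin_radius(scores, y, lipschitz):
    """Supremum of eps for which the condition above holds."""
    return margin(scores, y) / (math.sqrt(2.0) * lipschitz)

scores = [2.0, 0.5, -1.0]  # hypothetical outputs f(x)
radius = margin_radius(scores, y=0, lipschitz=1.0)
print(f"margin={margin(scores, 0):.2f} radius={radius:.4f}")
```

Halving the Lipschitz constant doubles this radius for a fixed margin, which is why Lipschitz regularization and margin maximization are treated together.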

Figures (6)

  • Figure 1: First, Tsuzuku et al. (2018) propose a deterministic certificate starting from a Lipschitz base subclassifier, followed by margin calculation and radius binding. Second, Cohen et al. (2019) introduce a base subclassifier to create a smoothed subclassifier. The risk factor $\alpha$ is then estimated using the Clopper-Pearson interval to provide a probabilistic certificate. Third, our method (Lipschitz-Variance-Margin Randomized Smoothing, or LVM-RS) extends a smoothed classifier constructed with a Lipschitz base classifier composed with a map that transforms logits into a probability vector in the simplex. The regularization of the Lipschitz constant is motivated by the Gaussian-Poincaré inequality in Theorem 1. The empirical variance is plugged into the Empirical Bernstein inequality in Proposition 2 to account for the risk factor $\alpha$, in the same spirit as Levine et al. (2019). The pipeline also ends with a probabilistic certificate, as in the approach of Cohen et al. (2019).
  • Figure 2: Comparison between corrected certified radii $R_2(\bar{p})$ produced by Bernstein's and Hoeffding's inequalities, for a random subset of $1000$ images of the ImageNet dataset using RS with a smoothing noise $\sigma=1.0$. We use the ViT-denoiser baseline from Carlini et al. (2023).
  • Figure 3: Comparison of the effect of the choice of simplex map $s$ and associated temperature $t$ on corrected certified radii $R_2(\bar{p})$. The simplex maps considered are $s \in \{\mathrm{sparsemax}, \mathrm{softmax}, \mathrm{hardmax}\}$. The base subclassifier is the one from Carlini et al. (2023), and the corrected certified radii were generated with one image from ImageNet with smoothing noise $\sigma=1.0$. Radii are risk corrected with the Empirical Bernstein inequality for a risk $\alpha=10^{-3}$ and $n=10^4$. By varying the temperature $t$, $\mathrm{softmax}$ and $\mathrm{sparsemax}$ can find a better solution to the variance-margin trade-off than $\mathrm{hardmax}$.
  • Figure 4: Certified accuracies ($CA$ in $\%$) with $R_1$ as a function of the perturbation level $\epsilon$ on CIFAR-10, for different simplex masses $r$. The number of samples is $n=10^4$ and the risk is $\alpha = 10^{-3}$. The case $r=1.0$ corresponds to the regular RS setting.
  • Figure 5: Certified accuracies ($CA$ in $\%$) as a function of the perturbation level $\epsilon$ on CIFAR-10, for different noise levels $\sigma \in \{0.25, 0.5, 1\}$. The number of samples is $n=10^5$ and the risk is $\alpha = 10^{-3}$. Our method is compared to the baseline chosen as in Carlini et al. (2023).
  • ...and 1 more figure
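The three simplex maps compared in Figure 3 can be sketched in a few lines. This is an illustrative implementation, not the authors' code: it assumes the temperature enters as $s(z / t)$, uses the standard sort-based formulation of sparsemax (Euclidean projection onto the simplex), and runs on hypothetical logits.

```python
import math

def softmax(z, t=1.0):
    """Softmax of z / t, shifted by the max logit for numerical stability."""
    m = max(z)
    e = [math.exp((v - m) / t) for v in z]
    total = sum(e)
    return [v / total for v in e]

def sparsemax(z, t=1.0):
    """Euclidean projection of z / t onto the probability simplex."""
    u = [v / t for v in z]
    cumsum, tau = 0.0, 0.0
    for k, v in enumerate(sorted(u, reverse=True), start=1):
        cumsum += v
        if 1.0 + k * v > cumsum:  # index k is still in the support
            tau = (cumsum - 1.0) / k
    return [max(v - tau, 0.0) for v in u]

def hardmax(z):
    """One-hot vector at the largest logit (temperature has no effect)."""
    top = max(range(len(z)), key=lambda k: z[k])
    return [1.0 if k == top else 0.0 for k in range(len(z))]

logits = [2.0, 1.0, 0.1]  # hypothetical logits
for t in (0.5, 1.0, 2.0):
    print(t, [round(p, 3) for p in softmax(logits, t)], sparsemax(logits, t))
print("hardmax", hardmax(logits))
```

Low temperatures push softmax and sparsemax toward the one-hot output of hardmax (large margin, high variance under noise), while higher temperatures spread the mass and reduce variance at the cost of margin.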

Theorems & Definitions (25)

  • Proposition 1: Tsuzuku et al. (2018)
  • Theorem 1: Gaussian Poincaré inequality, Boucheron et al. (2013)
  • Corollary 1
  • Proposition 2: Empirical Bernstein's inequality, Maurer and Pontil (2009)
  • Theorem 2
  • Proposition 3
  • Theorem 3
  • Example
  • Proof
  • Lemma 1
  • ...and 15 more
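To illustrate why Proposition 2 matters for certification, the following sketch compares the one-sided Hoeffding deviation term with the empirical Bernstein deviation term in the form given by Maurer and Pontil (2009), for samples in $[0, 1]$. Whether the paper uses exactly these constants is an assumption, and the sample lists are hypothetical stand-ins for base-classifier outputs under noise.

```python
import math

def hoeffding_shift(n, alpha):
    """One-sided Hoeffding deviation for n samples in [0, 1] at risk alpha."""
    return math.sqrt(math.log(1.0 / alpha) / (2.0 * n))

def empirical_bernstein_shift(samples, alpha):
    """Empirical Bernstein deviation (Maurer and Pontil form) for samples in [0, 1]."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)  # unbiased sample variance
    log_term = math.log(2.0 / alpha)
    return math.sqrt(2.0 * var * log_term / n) + 7.0 * log_term / (3.0 * (n - 1))

n, alpha = 10_000, 1e-3
low_var = [0.88, 0.92] * (n // 2)  # hypothetical low-variance outputs
high_var = [0.0, 1.0] * (n // 2)   # hypothetical hardmax-like outputs

print("Hoeffding:            ", hoeffding_shift(n, alpha))
print("Bernstein (low var):  ", empirical_bernstein_shift(low_var, alpha))
print("Bernstein (high var): ", empirical_bernstein_shift(high_var, alpha))
```

With low empirical variance the Bernstein term is much smaller than the Hoeffding term, so less probability mass is subtracted before computing the radius; with near-maximal variance the advantage disappears.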