Dual Randomized Smoothing: Beyond Global Noise Variance

Chenhao Sun, Yuhao Mao, Martin Vechev

TL;DR

Dual Randomized Smoothing (Dual RS) addresses the global noise-variance limitation in Randomized Smoothing by enabling input-dependent, locally constant variances. It formalizes RS certification under locally constant σ, and introduces a two-component framework: a variance estimator that predicts a per-input σ_c and a diffusion-based RS classifier that uses σ_c, trained with soft labels and consistency regularization. An iterative training scheme plus a routing view allows exploiting expert RS models specialized for different radii, yielding strong accuracy-robustness trade-offs on CIFAR-10 and ImageNet with modest inference overhead. Empirical results show notable improvements across small and large radii, and the approach scales to large datasets while offering a principled mechanism to combine multiple certified models. The work also provides reproducibility resources, including proofs, training/inference protocols, and code.

Abstract

Randomized Smoothing (RS) is a prominent technique for certifying the robustness of neural networks against adversarial perturbations. With RS, achieving high accuracy at small radii requires a small noise variance, while achieving high accuracy at large radii requires a large noise variance. However, the global noise variance used in the standard RS formulation leads to a fundamental limitation: there exists no global noise variance that simultaneously achieves strong performance at both small and large radii. To break through the global variance limitation, we propose a dual RS framework which enables input-dependent noise variances. To achieve that, we first prove that RS remains valid with input-dependent noise variances, provided the variance is locally constant around each input. Building on this result, we introduce two components which form our dual RS framework: (i) a variance estimator first predicts an optimal noise variance for each input, and (ii) this estimated variance is then used by a standard RS classifier. The variance estimator is independently smoothed via RS to ensure local constancy, enabling flexible design. We also introduce training strategies to iteratively optimize the two components. Extensive experiments on CIFAR-10 show that our dual RS method provides strong performance for both small and large radii, which is unattainable with a global noise variance, while incurring only a 60% computational overhead at inference. Moreover, it consistently outperforms prior input-dependent noise approaches across most radii, with particularly large gains at radii 0.5, 0.75, and 1.0, achieving relative improvements of 19%, 24%, and 21%, respectively. On ImageNet, dual RS remains effective across all radii. Additionally, the dual RS framework naturally provides a routing perspective for certified robustness, improving the accuracy-robustness trade-off with off-the-shelf expert RS models.
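
To make the variance trade-off concrete, here is a minimal sketch. It assumes the widely used Gaussian-smoothing certificate $R = \sigma\,\Phi^{-1}(\underline{p_A})$ of Cohen et al. (2019), which this summary does not state explicitly. The function name `certified_radius` and the probability bound `0.999` are illustrative assumptions, not values taken from the paper.

```python
from statistics import NormalDist


def certified_radius(p_a_lower: float, sigma: float) -> float:
    """Assumed standard RS certificate: R = sigma * Phi^{-1}(p_a_lower).

    p_a_lower is a lower confidence bound on the probability that the base
    classifier returns the top class under Gaussian noise with std sigma.
    A return value of 0.0 stands for "abstain".
    """
    if p_a_lower >= 1.0:
        raise ValueError("a lower confidence bound must be strictly below 1")
    if p_a_lower <= 0.5:
        return 0.0  # no majority class can be certified
    return sigma * NormalDist().inv_cdf(p_a_lower)


if __name__ == "__main__":
    # Hypothetical bound on the top-class probability, held fixed here so
    # that only the effect of sigma on the radius is visible.
    p_bound = 0.999
    for sigma in (0.25, 0.5, 1.0):
        print(f"sigma={sigma:.2f} -> radius={certified_radius(p_bound, sigma):.3f}")
```

Under this assumed certificate the radius scales linearly with $\sigma$ for a fixed probability bound, so with the bound above a model smoothed at $\sigma = 0.25$ cannot certify radius 1.0. In practice a larger $\sigma$ also lowers the attainable probability bound, which is the tension the abstract describes.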

Paper Structure

This paper contains 39 sections, 6 theorems, 8 equations, 11 figures, 6 tables.

Key Result

Theorem 4.1

Fix ${\bm{x}}_0 \in {\mathcal{X}}$ and $f_c$. Assume $\sigma({\bm{x}})$ is constant within the $\ell_2$ ball ${\mathbb{B}}({\bm{x}}_0, R_\sigma)$. Then for all ${\bm{x}}$ such that $\|{\bm{x}} - {\bm{x}}_0\|_2 \leq \min(R_\sigma, R({\bm{x}}, \sigma({\bm{x}}_0)))$, we have $g_c({\bm{x}}, \sigma({\bm{

Figures (11)

  • Figure 1: Left: The distribution of the optimal $\sigma$ on the CIFAR-10 test set, where the base model is fixed to the pretrained denoised smoothing model from carlinicertified. The optimal $\sigma$ for each input is defined as the $\sigma$ that maximizes the certified radius under the standard RS certification. Right: The certified radius of five independent samples as a function of $\sigma$.
  • Figure 2: The dual RS framework. First, an RS model $g_e$ smoothed with a global $\sigma_e$ is deployed to estimate $\sigma_c({\bm{x}})$ and return a certified radius for the estimate, $R_{\sigma}$. Second, another RS model is smoothed with $\sigma_c({\bm{x}})$; it performs standard classification and returns a certified radius for the classification, $R_c$. The final prediction is the result of the second stage, with a final certified radius $R_{\text{final}}=\min(R_{\sigma}, R_c)$. The green arrows indicate activated paths during inference.
  • Figure 3: Certified accuracy on CIFAR-10 across radii.
  • Figure 4: Comparison between dual RS built on weak and strong experts, respectively, along with the experts.
  • Figure 5: Comparison of dual RS models with different variance estimators.
  • ...and 6 more figures
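
The two-stage inference described in the Figure 2 caption can be sketched as follows. This is only an illustration of how the two certificates combine via $R_{\text{final}}=\min(R_{\sigma}, R_c)$. The callables `estimator` and `classifier`, and the toy stubs below, are hypothetical stand-ins for the smoothed models $g_e$ and $g_c$ and are not the authors' implementation.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical interfaces for the two smoothed models in Figure 2:
#   estimator(x)         -> (sigma_c, R_sigma)
#   classifier(x, sigma) -> (label, R_c)
Estimator = Callable[[Sequence[float]], Tuple[float, float]]
Classifier = Callable[[Sequence[float], float], Tuple[int, float]]


def dual_rs_predict(
    x: Sequence[float], estimator: Estimator, classifier: Classifier
) -> Tuple[int, float]:
    """Return the stage-two label and the combined radius min(R_sigma, R_c)."""
    sigma_c, r_sigma = estimator(x)       # stage 1: estimate sigma, certify it
    label, r_c = classifier(x, sigma_c)   # stage 2: classify with that sigma
    return label, min(r_sigma, r_c)


# Toy stubs with made-up outputs, used only to exercise the combination rule.
def toy_estimator(x: Sequence[float]) -> Tuple[float, float]:
    return 0.5, 0.75  # (sigma_c, R_sigma)


def toy_classifier(x: Sequence[float], sigma: float) -> Tuple[int, float]:
    return 1, 2.0 * sigma  # (label, R_c); the radius formula is arbitrary


if __name__ == "__main__":
    print(dual_rs_predict([0.0, 0.0], toy_estimator, toy_classifier))
```

Taking the minimum reflects the caption's rule: the final certificate holds only within the region where both the variance estimate and the classification are certified.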

Theorems & Definitions (9)

  • Theorem 4.1: Certification with Locally Constant $\sigma$
  • Theorem 4.2: Probabilistic Guarantee with Confidence Adjustment
  • Lemma B.0
  • Lemma B.0
  • proof
  • Theorem B.1: Certification with Locally Constant $\sigma$
  • proof
  • Theorem B.1: Probabilistic Guarantee with Confidence Adjustment
  • proof