Accelerated Smoothing: A Scalable Approach to Randomized Smoothing

Devansh Bhardwaj; Kshitiz Kaushik; Sarthak Gupta

TL;DR

Accelerated Smoothing replaces Monte Carlo (MC) sampling in randomized smoothing with a surrogate neural network trained to predict the MC class-count distribution, enabling near-equivalent certification at a cost that is $O(1)$ in the number of noise samples $N$. The surrogate is trained with a Jensen-Shannon divergence loss to match the MC-derived normalized class counts $C_c = \frac{1}{N}\sum_{i=1}^N \mathbb{1}[f(x+\epsilon_i)=c]$, so the method stays model-agnostic and independent of the choice of smoothing distribution. On CIFAR-10, the method certifies robust radii approximately $600\times$ faster than the standard MC-based approach while maintaining competitive certified accuracy; conventional MC remains slow when $N$ is very large. The work highlights the practical scalability of certified defenses, discusses limitations in theoretical guarantees and training-time costs, and points to future directions such as uncertainty estimation and extension to other smoothing schemes.
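As a rough illustration of the two quantities named above, the following minimal sketch estimates the normalized class counts by Monte Carlo and scores a candidate prediction against them with the Jensen-Shannon divergence. It is not the authors' implementation: `toy_classifier`, the input vector, the Gaussian noise choice, and the `surrogate_guess` values are all hypothetical stand-ins chosen only to make the example runnable.

```python
import math
import random


def mc_class_frequencies(classify, x, sigma, n_samples, n_classes, rng):
    """Estimate C_c = (1/N) * sum_i 1[f(x + eps_i) = c] under Gaussian noise."""
    counts = [0] * n_classes
    for _ in range(n_samples):
        # One noisy copy of the input and one base-classifier call per sample,
        # so the cost of this loop grows linearly with n_samples.
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        counts[classify(noisy)] += 1
    return [c / n_samples for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log) between discrete distributions."""
    def kl(a, b):
        # Terms with a_i == 0 contribute nothing; b_i > 0 whenever a_i > 0
        # because b is the mixture of the two distributions.
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical base classifier: class 1 if the mean feature is positive.
def toy_classifier(v):
    return 1 if sum(v) / len(v) > 0 else 0


rng = random.Random(0)
target = mc_class_frequencies(toy_classifier, [0.1, 0.2, 0.3], 0.25, 1000, 2, rng)
surrogate_guess = [0.1, 0.9]  # stand-in for a surrogate network's output
loss = js_divergence(target, surrogate_guess)
```

In this sketch the MC estimate needs `n_samples` classifier calls per input, whereas a trained surrogate would produce its prediction in a single forward pass; the divergence between the two is the kind of training signal the summary describes.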

Abstract

Randomized smoothing has emerged as a potent certifiable defense against adversarial attacks by employing smoothing noises from specific distributions to ensure the robustness of a smoothed classifier. However, the Monte Carlo sampling used in this process is compute-intensive, which limits the practicality of randomized smoothing at larger scale. To address this limitation, we propose a novel approach that replaces Monte Carlo sampling with the training of a surrogate neural network. Through extensive experimentation in various settings, we demonstrate the efficacy of our approach in approximating the smoothed classifier with remarkable precision. Furthermore, we demonstrate that our approach significantly accelerates the robust radius certification process, providing a nearly $600\times$ improvement in computation time and overcoming the computational bottlenecks associated with traditional randomized smoothing.

Paper Structure (21 sections, 6 equations, 3 figures, 6 tables, 1 algorithm)

Introduction
Related Works
Preliminaries
Randomized Smoothing
Monte Carlo based algorithm for Evaluation
Methodology
Motivation
Key Idea
Certification via Accelerated Smoothing
Experiments
Implementation Details
Certification Results
Time Complexity Analysis
Error Analysis
Limitations and Future Works
...and 6 more sections

Figures (3)

Figure 1: Plot of certification time vs. number of samples. Certification time with our methodology is $\mathcal{O}(1)$ in the number of samples $N$, whereas Neyman-Pearson-based sampling scales linearly, that is, $\mathcal{O}(N)$.
Figure 2: Left: the original randomized smoothing methodology. Right: our approach, Accelerated Smoothing. The original methodology draws a large number of noise samples $N$ to calculate class counts, which increases computational cost. Our approach instead uses a surrogate model to predict the class counts from the input image, which significantly reduces the time complexity; a much smaller $N$ is still sampled, solely to ensure that our prediction aligns with that of the surrogate model.
Figure 3: Comparison of the certified radius obtained using our method and the methodology of Cohen et al. (2019) for two different sample sizes, $N = 100$ and $N = 100000$. The top section displays the results for $\sigma = 0.25$, while the bottom section displays the results for $\sigma = 0.5$. Our methodology accurately estimates the results obtained with $N = 100000$.
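For readers unfamiliar with the certified radius compared in Figure 3, the sketch below shows the standard $\ell_2$ radius from Cohen et al. (2019) for Gaussian smoothing, $R = \sigma\,\Phi^{-1}(\underline{p_A})$, where $\underline{p_A}$ is a lower confidence bound on the top-class probability. How Accelerated Smoothing derives that bound from the surrogate's predicted counts is not stated in this summary, so the `p_lower` values used here are hypothetical inputs rather than results from the paper.

```python
from statistics import NormalDist


def certified_radius(p_lower, sigma):
    """Return the L2 radius sigma * Phi^{-1}(p_lower), or None to abstain.

    p_lower is assumed to be a lower confidence bound on the probability
    of the predicted class under Gaussian noise with standard deviation sigma.
    """
    if not 0.0 <= p_lower < 1.0:
        raise ValueError("p_lower must lie in [0, 1)")
    if p_lower <= 0.5:
        # No positive radius can be certified; abstain.
        return None
    return sigma * NormalDist().inv_cdf(p_lower)


# Hypothetical bounds, evaluated at the two noise levels shown in Figure 3.
for sigma in (0.25, 0.5):
    for p_lower in (0.6, 0.9, 0.99):
        print(sigma, p_lower, certified_radius(p_lower, sigma))
```

For a fixed bound the radius scales linearly with $\sigma$, which is why the two panels of Figure 3 cover different radius ranges.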
