Your Diffusion Model is Secretly a Certifiably Robust Classifier

Huanran Chen; Yinpeng Dong; Shitong Shao; Zhongkai Hao; Xiao Yang; Hang Su; Jun Zhu

TL;DR

This paper studies the provable robustness of diffusion-based classifiers and shows they enjoy an $O(1)$ Lipschitz constant. It introduces Noised Diffusion Classifiers, including Exact Posterior (EPNDC) and Approximated Posterior (APNDC) variants, to classify Gaussian-corrupted data via ELBO-based log-likelihoods and Bayes' theorem, further enhanced by randomized smoothing to tighten certified radii. APNDC, acting as an ensemble of EPNDC with minimal overhead, achieves state-of-the-art or competitive certified robustness on CIFAR-10 and ImageNet64x64 using a single pre-trained diffusion model without extra data, while two variance-reduction strategies dramatically cut time complexity. The work also derives a rigorous Lipschitz bound for diffusion classifiers and introduces efficient class-selection strategies (Sift-and-Refine) to scale to large class counts, highlighting both theoretical and practical advances in robust diffusion-based classification.

Abstract

Generative learning, recognized for its effective modeling of data distributions, offers inherent advantages in handling out-of-distribution instances, especially for enhancing robustness to adversarial attacks. Among these, diffusion classifiers, utilizing powerful diffusion models, have demonstrated superior empirical robustness. However, a comprehensive theoretical understanding of their robustness is still lacking, raising concerns about their vulnerability to stronger future attacks. In this study, we prove that diffusion classifiers possess $O(1)$ Lipschitzness, and establish their certified robustness, demonstrating their inherent resilience. To achieve non-constant Lipschitzness, thereby obtaining much tighter certified robustness, we generalize diffusion classifiers to classify Gaussian-corrupted data. This involves deriving the evidence lower bounds (ELBOs) for these distributions, approximating the likelihood using the ELBO, and calculating classification probabilities via Bayes' theorem. Experimental results show the superior certified robustness of these Noised Diffusion Classifiers (NDCs). Notably, we achieve over 80% and 70% certified robustness on CIFAR-10 under adversarial perturbations with $\ell_2$ norms less than 0.25 and 0.5, respectively, using a single off-the-shelf diffusion model without any additional data.
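The last step described in the abstract, turning per-class likelihood estimates into classification probabilities via Bayes' theorem, can be illustrated with a minimal sketch. It assumes one ELBO value per class is already available as a stand-in for the log-likelihood and that the class prior is uniform unless stated otherwise. The function name `posterior_from_elbos` and the numeric ELBO values are hypothetical and are not taken from the paper.

```python
import math


def posterior_from_elbos(elbos, log_priors=None):
    """Convert per-class log-likelihood estimates into class probabilities.

    Bayes' theorem gives p(y | x) proportional to p(x | y) * p(y). Here each
    log p(x | y) is replaced by an ELBO estimate (an approximation).
    """
    if log_priors is None:
        # Assumption: uniform prior over classes, so the prior term cancels.
        log_priors = [0.0] * len(elbos)
    logits = [e + lp for e, lp in zip(elbos, log_priors)]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    weights = [math.exp(z - peak) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical ELBO values for three classes (illustrative only).
probs = posterior_from_elbos([-3500.0, -3498.0, -3510.0])
predicted = max(range(len(probs)), key=probs.__getitem__)
```

Because ELBOs of images are large negative numbers, the max-subtraction step matters: exponentiating the raw values directly would underflow to zero for every class.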

Paper Structure (35 sections, 4 theorems, 51 equations, 2 figures, 6 tables, 6 algorithms)

Introduction
Background
Diffusion Models
Diffusion Classifiers
Randomized Smoothing
Methodology
The Lipschitzness of Diffusion Classifiers
Exact Posterior Noised Diffusion Classifier
Approximated Posterior Noised Diffusion Classifier
Time Complexity Reduction
Experiment
Results on CIFAR-10
Results on ImageNet
Discussions
Conclusion
...and 20 more sections

Key Result

Theorem 3.1

The upper bound of the Lipschitz constant of the diffusion classifier is given by: If one can get a lower bound $\underline{p_A}$ for $\text{DC}(\mathbf{x}_0)_y$ and an upper bound $\overline{p_B}$ for $\max_{\hat{y} \neq y}\text{DC}(\mathbf{x}_0)_{\hat{y}}$ (e.g., a probabilistic bound via the Bernstein inequality [maurer2009empiricalbernstein]), the lower bound o
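How a Lipschitz constant and the two probability bounds combine into a certified radius can be sketched with the generic argument: if every output of the classifier is $L$-Lipschitz in the $\ell_2$ norm, a perturbation of norm $r$ lowers the top-class probability by at most $Lr$ and raises any other class by at most $Lr$, so the prediction cannot change while $r < (\underline{p_A} - \overline{p_B}) / (2L)$. The sketch below implements only this generic bound. The function name and the value of `lipschitz_const` are placeholders; the specific constant derived in Theorem 3.1 is not reproduced here.

```python
def lipschitz_certified_radius(p_a_lower, p_b_upper, lipschitz_const):
    """Generic l2 certified radius for a classifier with L-Lipschitz outputs.

    p_a_lower: lower bound on the probability of the predicted class.
    p_b_upper: upper bound on the largest probability of any other class.
    lipschitz_const: assumed Lipschitz constant L of each output (placeholder).
    """
    if lipschitz_const <= 0:
        raise ValueError("lipschitz_const must be positive")
    margin = p_a_lower - p_b_upper
    if margin <= 0:
        # The bounds do not separate the top class; nothing can be certified.
        return 0.0
    return margin / (2.0 * lipschitz_const)


# Illustrative numbers only; L = 1.0 is not the paper's constant.
radius = lipschitz_certified_radius(0.9, 0.1, 1.0)
```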

Figures (2)

Figure 1: Illustration of our theoretical contributions. We derive the Lipschitz constant and the corresponding certified radius for diffusion classifiers [chen2023robust]. Additionally, we introduce two novel evidence lower bounds, which are used to approximate the log-likelihood. These lower bounds are then employed to construct classifiers based on Bayes' theorem. By applying randomized smoothing to these classifiers, we derive their certified robust radii.
Figure 2: (a) The accuracy (%) on the CIFAR-10 dataset with the time complexity reduction techniques of [chen2023robust] and ours. (b, c) The upper envelope of the certified radii of different methods.
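The certified radii mentioned in the captions above come from randomized smoothing. As a point of reference, the standard $\ell_2$ radius from the randomized smoothing literature (Cohen et al., 2019) is $R = \frac{\sigma}{2}\left(\Phi^{-1}(\underline{p_A}) - \Phi^{-1}(\overline{p_B})\right)$, where $\sigma$ is the Gaussian noise level and $\Phi^{-1}$ is the inverse standard normal CDF. The sketch below evaluates that formula; the function name and the input values are illustrative assumptions, and the radii reported in the paper may be computed with additional refinements.

```python
from statistics import NormalDist


def smoothing_certified_radius(p_a_lower, p_b_upper, sigma):
    """Standard l2 certified radius for a Gaussian-smoothed classifier.

    p_a_lower: lower bound on the smoothed probability of the top class.
    p_b_upper: upper bound on the smoothed probability of the runner-up.
    sigma: standard deviation of the Gaussian smoothing noise.
    """
    if not (0.0 < p_a_lower < 1.0 and 0.0 < p_b_upper < 1.0):
        raise ValueError("probability bounds must lie strictly in (0, 1)")
    if p_a_lower <= p_b_upper:
        # No separation between the classes; abstain from certifying.
        return 0.0
    std_normal = NormalDist()  # mean 0, standard deviation 1
    gap = std_normal.inv_cdf(p_a_lower) - std_normal.inv_cdf(p_b_upper)
    return 0.5 * sigma * gap


# Illustrative numbers only.
radius_rs = smoothing_certified_radius(0.9, 0.1, sigma=0.25)
```

Unlike a bound built on a fixed Lipschitz constant, this radius grows without limit as the top-class bound approaches 1, which is one reason smoothing can yield tighter certificates.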

Theorems & Definitions (10)

Theorem 3.1
proof
Theorem 3.2
Remark 3.3
Remark 3.4
Remark 3.5
Lemma A.2
proof
Lemma A.3
Remark A.4
