Your Diffusion Model is Secretly a Certifiably Robust Classifier

Huanran Chen, Yinpeng Dong, Shitong Shao, Zhongkai Hao, Xiao Yang, Hang Su, Jun Zhu

TL;DR

This paper studies the provable robustness of diffusion-based classifiers and shows they enjoy an $O(1)$ Lipschitz constant. It introduces Noised Diffusion Classifiers, including Exact Posterior (EPNDC) and Approximated Posterior (APNDC) variants, to classify Gaussian-corrupted data via ELBO-based log-likelihoods and Bayes' theorem, further enhanced by randomized smoothing to tighten certified radii. APNDC, acting as an ensemble of EPNDC with minimal overhead, achieves state-of-the-art or competitive certified robustness on CIFAR-10 and ImageNet64x64 using a single pre-trained diffusion model without extra data, while two variance-reduction strategies dramatically cut time complexity. The work also derives a rigorous Lipschitz bound for diffusion classifiers and introduces efficient class-selection strategies (Sift-and-Refine) to scale to large class counts, highlighting both theoretical and practical advances in robust diffusion-based classification.

Abstract

Generative learning, recognized for its effective modeling of data distributions, offers inherent advantages in handling out-of-distribution instances, especially for enhancing robustness to adversarial attacks. Among these, diffusion classifiers, utilizing powerful diffusion models, have demonstrated superior empirical robustness. However, a comprehensive theoretical understanding of their robustness is still lacking, raising concerns about their vulnerability to stronger future attacks. In this study, we prove that diffusion classifiers possess $O(1)$ Lipschitzness, and establish their certified robustness, demonstrating their inherent resilience. To achieve non-constant Lipschitzness, thereby obtaining much tighter certified robustness, we generalize diffusion classifiers to classify Gaussian-corrupted data. This involves deriving the evidence lower bounds (ELBOs) for these distributions, approximating the likelihood using the ELBO, and calculating classification probabilities via Bayes' theorem. Experimental results show the superior certified robustness of these Noised Diffusion Classifiers (NDCs). Notably, we achieve over 80% and 70% certified robustness on CIFAR-10 under adversarial perturbations with $\ell_2$ norms less than 0.25 and 0.5, respectively, using a single off-the-shelf diffusion model without any additional data.
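
The last step of the pipeline described in the abstract (per-class ELBO as a stand-in for the log-likelihood, then Bayes' theorem) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `class_posteriors`, the uniform class prior, and the numeric ELBO values are all assumptions made for the example.

```python
import math


def class_posteriors(log_likelihoods, log_priors=None):
    """Bayes' theorem in log space: p(y | x) is proportional to p(x | y) p(y).

    `log_likelihoods[y]` is an estimate of log p(x | y); here it would be
    the per-class ELBO. A uniform prior is ASSUMED when none is given.
    """
    n = len(log_likelihoods)
    if log_priors is None:
        log_priors = [-math.log(n)] * n
    scores = [ll + lp for ll, lp in zip(log_likelihoods, log_priors)]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical ELBO estimates for three classes (made-up numbers).
elbo_estimates = [-3050.0, -3042.5, -3061.2]
probs = class_posteriors(elbo_estimates)
predicted = max(range(len(probs)), key=probs.__getitem__)
print(predicted, probs)
```

Because ELBO values for images are large in magnitude, working in log space and subtracting the maximum score avoids underflow when normalizing.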

Paper Structure

This paper contains 35 sections, 4 theorems, 51 equations, 2 figures, 6 tables, 6 algorithms.

Key Result

Theorem 3.1

The upper bound of the Lipschitz constant of the diffusion classifier is given by: If one can get a lower bound $\underline{p_A}$ for $\text{DC}(\mathbf{x}_0)_y$ and an upper bound $\overline{p_B}$ for $\max_{\hat{y} \neq y}\text{DC}(\mathbf{x}_0)_{\hat{y}}$ (e.g., a probabilistic bound via the Bernstein inequality; Maurer & Pontil, 2009), the lower bound o
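
How a lower bound $\underline{p_A}$, an upper bound $\overline{p_B}$, and a Lipschitz constant combine into a certified radius can be illustrated with the generic Lipschitz margin argument. This sketch is not the paper's exact statement, whose formula is not reproduced in this summary. It assumes every class-probability output is $L$-Lipschitz in the $\ell_2$ norm, so the top class cannot drop below the runner-up while $\|\delta\|_2 < (\underline{p_A} - \overline{p_B}) / (2L)$. The function name and the example numbers are hypothetical.

```python
def lipschitz_certified_radius(p_a_lower, p_b_upper, lipschitz_const):
    """Generic margin-based l2 radius for an L-Lipschitz classifier.

    ASSUMPTION: each class probability changes by at most
    lipschitz_const * ||delta||_2. Then the predicted class keeps at least
    p_a_lower - L * ||delta|| while any other class reaches at most
    p_b_upper + L * ||delta||, so the prediction is stable while
    ||delta|| < (p_a_lower - p_b_upper) / (2 * L).
    """
    if lipschitz_const <= 0:
        raise ValueError("lipschitz_const must be positive")
    margin = p_a_lower - p_b_upper
    if margin <= 0:
        return 0.0  # the bounds do not separate the classes: nothing certified
    return margin / (2.0 * lipschitz_const)


# Made-up bounds and constant, for illustration only.
radius = lipschitz_certified_radius(0.9, 0.1, lipschitz_const=2.0)
print(radius)
```

The radius shrinks as the Lipschitz constant grows, which is why a constant bound alone gives a limited certificate and motivates the Gaussian-corrupted setting.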

Figures (2)

  • Figure 1: Illustration of our theoretical contributions. We derive the Lipschitz constant and the corresponding certified radius for diffusion classifiers (Chen et al., 2023). Additionally, we introduce two novel evidence lower bounds, which are used to approximate the log likelihood. These lower bounds are then employed to construct classifiers based on Bayes' theorem. By applying randomized smoothing to these classifiers, we derive their certified robust radii.
  • Figure 2: (a) The accuracy (%) on the CIFAR-10 dataset with the time complexity reduction techniques of Chen et al. (2023) and ours. (b, c) The upper envelope of certified radii of different methods.
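
The randomized-smoothing step mentioned in the Figure 1 caption can be sketched with the standard $\ell_2$ certificate of Cohen et al. (2019), $R = \frac{\sigma}{2}\left(\Phi^{-1}(\underline{p_A}) - \Phi^{-1}(\overline{p_B})\right)$, where $\Phi^{-1}$ is the standard normal inverse CDF. Whether the paper's radii take exactly this form is not stated in this summary, so treat the code below as an assumption-laden illustration; the function name and inputs are hypothetical.

```python
from statistics import NormalDist


def smoothing_certified_radius(p_a_lower, p_b_upper, sigma):
    """Standard randomized-smoothing l2 radius (Cohen et al., 2019).

    p_a_lower: lower bound on the smoothed probability of the top class.
    p_b_upper: upper bound on the smoothed probability of the runner-up.
    sigma:     standard deviation of the Gaussian smoothing noise.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    for p in (p_a_lower, p_b_upper):
        if not 0.0 < p < 1.0:
            raise ValueError("probability bounds must lie strictly in (0, 1)")
    if p_a_lower <= p_b_upper:
        return 0.0  # abstain: the bounds do not separate the classes
    std_normal = NormalDist()  # mean 0, standard deviation 1
    gap = std_normal.inv_cdf(p_a_lower) - std_normal.inv_cdf(p_b_upper)
    return 0.5 * sigma * gap


# Made-up bounds with sigma = 0.25, for illustration only.
radius = smoothing_certified_radius(0.9, 0.1, sigma=0.25)
print(round(radius, 4))
```

Unlike a fixed Lipschitz bound, this radius grows without limit as the top-class bound approaches 1, which is the sense in which smoothing yields tighter certificates.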

Theorems & Definitions (10)

  • Theorem 3.1
  • proof
  • Theorem 3.2
  • Remark 3.3
  • Remark 3.4
  • Remark 3.5
  • Lemma A.2
  • proof
  • Lemma A.3
  • Remark A.4