What is Adversarial Training for Diffusion Models?

Briglia Maria Rosaria, Mujtaba Hussain Mirza, Giuseppe Lisanti, Iacopo Masi

TL;DR

This paper addresses the robustness of diffusion models (DMs) to noisy, outlier, and adversarial data by introducing adversarial training (AT) specifically for diffusion models (AT-DM). It shows that, unlike classifier AT, which enforces invariance, AT for DMs must enforce equivariance to keep the diffusion process aligned with the data distribution. This is achieved by perturbing diffusion trajectories with time-varying noise and adding a dedicated regularizer. The authors formalize an AT objective, combine it with the standard denoising loss, and demonstrate through synthetic (low- and high-dimensional) and real-world (CIFAR-10, CelebA, LSUN Bedroom) experiments that the resulting model, Robust$_{\text{adv}}$, yields smoother diffusion flows, reduced memorization, and faster sampling, while improving robustness to severe noise and adversarial attacks. This work broadens the applicability of DMs in noisy or adversarial settings and suggests practical benefits for denoising data distributions, with potential use in adversarial purification in real-world deployments.

Abstract

We answer the question in the title, showing that adversarial training (AT) for diffusion models (DMs) fundamentally differs from classifiers: while AT in classifiers enforces output invariance, AT in DMs requires equivariance to keep the diffusion process aligned with the data distribution. AT is a way to enforce smoothness in the diffusion flow, improving robustness to outliers and corrupted data. Unlike prior art, our method makes no assumptions about the noise model and integrates seamlessly into diffusion training by adding random noise, similar to randomized smoothing, or adversarial noise, akin to AT. This enables intrinsic capabilities such as handling noisy data, dealing with extreme variability such as outliers, preventing memorization, and improving robustness. We rigorously evaluate our approach with proof-of-concept datasets with known distributions in low- and high-dimensional space, thereby taking a perfect measure of errors; we further evaluate on standard benchmarks such as CIFAR-10, CelebA and LSUN Bedroom, showing strong performance under severe noise, data corruption, and iterative adversarial attacks.
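
The contrast between invariance and equivariance can be illustrated with a minimal sketch. It assumes the regularizer is a squared distance between the perturbed prediction and a target: the unperturbed prediction for invariance, and the unperturbed prediction shifted by $\boldsymbol{\delta}$ for equivariance, following the mapping ${\mathbf{x}}_{t}{+}\boldsymbol{\delta} \mapsto {\boldsymbol{\epsilon}}_{\theta}({\mathbf{x}}_t,t)+\boldsymbol{\delta}$ in the Figure 1 caption. The function names and the two toy models are hypothetical and are not taken from the paper, whose exact loss may include weights or scalings not shown here.

```python
# Toy illustration (not the authors' implementation): compare an
# invariance penalty with an equivariance penalty on a perturbed input.

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def invariance_penalty(model, x_t, delta, t):
    """Penalize any change in the output when the input is perturbed."""
    clean = model(x_t, t)
    perturbed = model([x + d for x, d in zip(x_t, delta)], t)
    return sq_dist(perturbed, clean)

def equivariance_penalty(model, x_t, delta, t):
    """Penalize outputs that do not shift by delta when the input does."""
    clean = model(x_t, t)
    perturbed = model([x + d for x, d in zip(x_t, delta)], t)
    target = [c + d for c, d in zip(clean, delta)]  # assumed target
    return sq_dist(perturbed, target)

# Hypothetical stand-ins for a denoising network.
def shift_model(x, t):
    """Output moves with the input, so it is equivariant to shifts."""
    return [xi - 1.0 for xi in x]

def const_model(x, t):
    """Output ignores the input, so it is invariant to shifts."""
    return [0.0 for _ in x]

x_t = [0.5, -1.0]
delta = [0.25, -0.5]

eq_shift = equivariance_penalty(shift_model, x_t, delta, t=0)
inv_shift = invariance_penalty(shift_model, x_t, delta, t=0)
eq_const = equivariance_penalty(const_model, x_t, delta, t=0)
inv_const = invariance_penalty(const_model, x_t, delta, t=0)
```

In this toy setting the shift-following model incurs no equivariance penalty but a nonzero invariance penalty, and the constant model shows the opposite pattern. In both cases the nonzero penalty equals $\|\boldsymbol{\delta}\|^2$.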

Paper Structure

This paper contains 37 sections, 37 equations, 29 figures, 2 tables, 3 algorithms.

Figures (29)

  • Figure 1: Inducing smoothness into diffusion trajectories. We train the denoising network to follow the score function, i.e., ${\mathbf{x}}_{t} \mapsto {\mathbf{x}}_{t-1}$ using just ${\boldsymbol{\epsilon}}_{\theta}({\mathbf{x}}_t,t)$, but we also locally perturb the data point as ${\mathbf{x}}_{t}{+}\boldsymbol{\delta}$ inside an $\ell_p$ ball centered on ${\mathbf{x}}_t$ and then impose equivariance: ${\mathbf{x}}_{t}{+}\boldsymbol{\delta} \mapsto {\boldsymbol{\epsilon}}_{\theta}({\mathbf{x}}_t,t)+\boldsymbol{\delta} \triangleq {\mathbf{x}}_{t-1}.$ This amounts to adding an intermediate step in the Markov chain that behaves as an additional denoising step during training, making the model resilient to outliers or noise in the dataset, $p_{\text{noise}}(\mathbf{x}_0)$, that do not belong to $p_{\text{data}}(\mathbf{x}_0)$. The local noising step can be implemented as adversarial (Goodfellow et al., 2014) or as random, akin to randomized smoothing (Cohen et al., 2019). The perturbation strength is adaptive: large in the noise phase, shrinking in the content phase. Separate markers in the figure indicate the forward process and the reverse process.
  • Figure 1: Top: random vs. adversarial noise. Bottom: Robust$_{\text{adv}}$ attains better FID with fewer sampling steps. Results on CIFAR-10.
  • Figure 2: (a) Handling different types of noise. The leftmost panel shows training data with either strong inlier noise (top) or uniform outliers (bottom). The trajectories reveal that DDPM (Ho et al., 2020) struggles with both, while training with invariance (Inv$_\text{adv}$) makes the process diverge. In contrast, ours (Robust$_{\text{adv}}$) is more robust, avoiding diverging trajectories and better reaching the data centroid. (b) Score vector fields: versors represent the score field and the colormap shows its magnitude. (left) Ground truth; (middle) DDPM (Ho et al., 2020); (right) our Robust$_{\text{adv}}$. AT yields smoother, more consistent scores that better match the data shape, while shrinking variability and increasing field intensity. (c) Adversarial perturbation ray. The curves vary with $\omega$, which controls the slope of $\sqrt{1-\alpha_t}~~r(t)$ to shorten the content phase and reduce the curve's steepness in DDPM.
  • Figure 3: (left) On the linearized butterflies dataset, we measure closed-form reconstruction error. From top to bottom: training data, corrupted data, DDPM generations, and Robust$_{\text{adv}}$ results. (right) Metric plots: the first column shows PSNR, the second column the closed-form reconstruction error. First row: clean data; second row: 90% of data corrupted with Gaussian noise ($\sigma = 0.1$). We also include an ablation on invariance regularization with $\lambda \in \{0.3, 0.03\}$. Strong invariance prevents learning the distribution; reducing $\lambda$ helps but is still far below other methods.
  • Figure 4: (top-left) Despite 90% of training data being corrupted with Gaussian noise, Robust$_{\text{adv}}$ generates smooth objects without artifacts, while DDPM retains noise. Noise with $\sigma=0.2$ corresponds to $40\%$ of CIFAR-10's variability ($\sigma_{\text{data}}=0.5$). (top-right) DDPM generates bedrooms that are irregular and unrealistic, propagating the noise, whereas Robust$_{\text{adv}}$ bedrooms are smooth and neat. (bottom) Results on CelebA. DDPM replicates noise, while ours discards it and produces cleaner faces.
  • ...and 24 more figures
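
The corruption setting and the PSNR metric mentioned in the Figure 3 caption can be sketched as follows. This is an illustrative sketch only: the helper names `corrupt` and `psnr` are hypothetical, samples are assumed to be flat lists of values in $[0, 1]$ (hence `peak=1.0`), and the paper's actual data pipeline and closed-form reconstruction error are not reproduced here.

```python
import math
import random

def corrupt(dataset, fraction=0.9, sigma=0.1, seed=0):
    """Add Gaussian noise (std sigma) to a random fraction of the samples.

    Returns the new dataset and the set of corrupted indices.
    """
    rng = random.Random(seed)
    n_corrupt = round(fraction * len(dataset))
    chosen = set(rng.sample(range(len(dataset)), n_corrupt))
    out = []
    for i, sample in enumerate(dataset):
        if i in chosen:
            out.append([v + rng.gauss(0.0, sigma) for v in sample])
        else:
            out.append(list(sample))  # left clean
    return out, chosen

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)

# Ten constant toy "images" of 16 pixels each (assumed data, not from the paper).
clean = [[0.5] * 16 for _ in range(10)]
noisy, corrupted_idx = corrupt(clean, fraction=0.9, sigma=0.1)

# PSNR of every sample against its clean version.
scores = [psnr(c, n) for c, n in zip(clean, noisy)]
```

With these settings nine of the ten samples are corrupted, and the single untouched sample has infinite PSNR because its error is zero. The same ratio logic underlies the Figure 4 remark: $\sigma / \sigma_{\text{data}} = 0.2 / 0.5 = 0.4$.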