Expanding the Role of Diffusion Models for Robust Classifier Training

Pin-Han Huang, Shang-Tse Chen, Hsuan-Tien Lin

TL;DR

The paper investigates whether diffusion models can help robust classification beyond synthetic data generation, by using diffusion representations as an auxiliary learning signal during adversarial training. The proposed Diffusion Representation Alignment (DRA) aligns classifier representations with frozen diffusion representations through a lightweight projection head, yielding consistent gains in clean and robust accuracy when combined with diffusion-generated data (DM-AT+DRA). Empirical results on CIFAR-10, CIFAR-100, and ImageNet show that diffusion representations contribute diverse, partially robust features, while synthetic data lowers representation rank, which suggests that the two mechanisms are complementary. The findings point to a practical path toward more robust classifiers: jointly exploiting diffusion representations and diffusion-generated data.

Abstract

Incorporating diffusion-generated synthetic data into adversarial training (AT) has been shown to substantially improve the training of robust image classifiers. In this work, we extend the role of diffusion models beyond merely generating synthetic data, examining whether their internal representations, which encode meaningful features of the data, can provide additional benefits for robust classifier training. Through systematic experiments, we show that diffusion models offer representations that are both diverse and partially robust, and that explicitly incorporating diffusion representations as an auxiliary learning signal during AT consistently improves robustness across settings. Furthermore, our representation analysis indicates that incorporating diffusion models into AT encourages more disentangled features, while diffusion representations and diffusion-generated synthetic data play complementary roles in shaping representations. Experiments on CIFAR-10, CIFAR-100, and ImageNet validate these findings, demonstrating the effectiveness of jointly leveraging diffusion representations and synthetic data within AT.
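
The idea of using diffusion representations as an auxiliary signal during AT can be sketched as an extra loss term added to the adversarial training loss. The following is a minimal sketch under stated assumptions, not the authors' implementation: the summary only says that a lightweight projection head aligns classifier representations with frozen diffusion representations, so the linear head, the cosine-based alignment loss, and the weight `lam` are all hypothetical choices made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)  # small constant avoids division by zero


def project(features, weight, bias):
    """Hypothetical linear projection head; weight has shape out_dim x in_dim."""
    return [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weight, bias)
    ]


def alignment_loss(classifier_feats, diffusion_feats, weight, bias):
    """Assumed alignment loss: 1 - cos(projected classifier feats, diffusion feats).

    The diffusion features are treated as fixed targets (the diffusion model
    is frozen), so only the classifier and the head would receive gradients.
    """
    projected = project(classifier_feats, weight, bias)
    return 1.0 - cosine_similarity(projected, diffusion_feats)


def total_loss(at_loss, classifier_feats, diffusion_feats, weight, bias, lam=0.5):
    """Adversarial training loss plus the weighted alignment term.

    `lam` is a hypothetical trade-off weight, not a value from the paper.
    """
    return at_loss + lam * alignment_loss(
        classifier_feats, diffusion_feats, weight, bias
    )
```

With an identity projection head, identical features give an alignment loss near 0 and opposite features give a loss near 2, so the auxiliary term only penalizes classifier features that point away from the diffusion features after projection.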

Paper Structure (21 sections, 4 equations, 10 figures, 3 tables)

Figures (10)

  • Figure 1: We plot robust accuracy and representation similarity scores (Huh et al., 2024) for CIFAR-10 $\ell_\infty$-robust models from RobustBench (Croce et al., 2021). Similarity scores are measured with respect to representations extracted from the diffusion model. Implementation details and discussion are in the appendix.
  • Figure 2: Overview of Diffusion Representation Alignment (DRA). We leverage an auxiliary projection head to align classifiers with the extracted diffusion representations.
  • Figure 3: (a) Frequency saliency analysis of the linear-probed diffusion representation, the adversarially trained robust model, and the standard-trained non-robust model. Low frequencies are centered. (b) CIFAR-10 robust accuracy across perturbation budgets for the linear-probed diffusion representation (DR), the adversarially trained robust model (AT), and the standard-trained non-robust model (ST).
  • Figure 4: Alignment and uniformity metrics on CIFAR-10 for the standard-trained model (ST), the adversarially trained model (AT), and the diffusion representations (DR).
  • Figure 5: We plot the alignment and uniformity metrics on CIFAR-10, with clean and robust accuracy ($\ell_\infty$, $\epsilon = 8/255$) shown in parentheses.
  • ...and 5 more figures
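
The alignment and uniformity metrics named in the captions for Figures 4 and 5 can be illustrated with a short sketch. This assumes the metrics follow the commonly used definitions of Wang and Isola (2020) on L2-normalized features; the summary does not state the exact formulation, how positive pairs are formed (for example, clean and perturbed versions of the same image), or the values of `alpha` and `t`, so all of these are assumptions.

```python
import math
from itertools import combinations


def l2_normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def alignment(positive_pairs, alpha=2.0):
    """Mean distance (raised to alpha) between features of positive pairs.

    Lower is better: features of paired inputs sit close together.
    """
    total = 0.0
    for x, y in positive_pairs:
        d2 = squared_distance(l2_normalize(x), l2_normalize(y))
        total += d2 ** (alpha / 2.0)
    return total / len(positive_pairs)


def uniformity(features, t=2.0):
    """Log of the mean Gaussian potential over all distinct feature pairs.

    Lower is better: features spread more evenly over the unit hypersphere.
    """
    z = [l2_normalize(f) for f in features]
    potentials = [
        math.exp(-t * squared_distance(u, v)) for u, v in combinations(z, 2)
    ]
    return math.log(sum(potentials) / len(potentials))
```

Under these definitions, identical paired features give an alignment of 0, while collapsed features (all points identical) give a uniformity of 0, the worst possible value.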