Robust Learning Meets Generative Models: Can Proxy Distributions Improve Adversarial Robustness?

Vikash Sehwag, Saeed Mahloujifar, Tinashe Handina, Sihui Dai, Chong Xiang, Mung Chiang, Prateek Mittal

TL;DR

The paper investigates improving adversarial robustness by using proxy distributions from generative models, providing a theoretical bound on robustness transfer via conditional Wasserstein distance and introducing ARC as a practical proxy-quality metric. It then presents PORT, a robust-training framework that blends real and synthetic data and selects synthetic samples using robust discriminators. Empirical results across five datasets show substantial gains in both empirical and certified robustness, with diffusion-based proxies consistently outperforming GANs and ARC reliably predicting proxy effectiveness. The work also offers mechanisms for adaptive sampling and sample-quality assessment to maximize robustness transfer while maintaining efficiency. Overall, the approach demonstrates that carefully chosen synthetic data can meaningfully enhance robustness without relying on large real-world datasets.

Abstract

While additional training data improves the robustness of deep neural networks against adversarial examples, it presents the challenge of curating a large number of specific real-world samples. We circumvent this challenge by using additional data from proxy distributions learned by advanced generative models. We first seek to formally understand the transfer of robustness from classifiers trained on proxy distributions to the real data distribution. We prove that the difference between the robustness of a classifier on the two distributions is upper bounded by the conditional Wasserstein distance between them. Next, we use proxy distributions to significantly improve the performance of adversarial training on five different datasets. For example, on the CIFAR-10 dataset we improve robust accuracy by up to 7.5% and 6.7% under the $\ell_{\infty}$ and $\ell_2$ threat models, respectively, over baselines that do not use proxy distributions. We also improve certified robust accuracy by 7.6% on the CIFAR-10 dataset. We further demonstrate that different generative models bring disparate improvements in robust training performance. We propose a robust discrimination approach to characterize the impact of individual generative models, and we provide a deeper understanding of why current state-of-the-art diffusion-based generative models are a better choice of proxy distribution than generative adversarial networks.

Paper Structure

This paper contains 29 sections, 7 theorems, 38 equations, 15 figures, 12 tables.

Key Result

Theorem 1

Let $D$ and $\tilde{D}$ be two labeled distributions supported on ${\mathcal{X}}\times {\mathcal{Y}}$ with identical label distributions, i.e., $\forall y^* \in {\mathcal{Y}}, \Pr_{(x,y)\gets D}[y=y^*] = \Pr_{(x,y)\gets \tilde{D}}[y=y^*]$. Then for any classifier $h:{\mathcal{X}}\to {\mathcal{Y}}$
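The statement above breaks off before its conclusion. Going by the abstract, which says the difference in a classifier's robustness on the two distributions is upper bounded by the conditional Wasserstein distance between them, the conclusion presumably has the following shape. The symbols $\mathrm{Rob}$ (average robustness, Definition 1) and $\mathrm{cwd}$ (conditional Wasserstein distance, Definition 2) are names assumed here for illustration; the paper's own notation may differ.

$$
\bigl|\,\mathrm{Rob}(h, D) - \mathrm{Rob}(h, \tilde{D})\,\bigr| \;\le\; \mathrm{cwd}(D, \tilde{D})
$$

Read this way, a classifier that is robust on a proxy distribution $\tilde{D}$ loses at most $\mathrm{cwd}(D, \tilde{D})$ in average robustness when evaluated on the real distribution $D$.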

Figures (15)

  • Figure 1: Original samples.
  • Figure 2: Perturbed samples.
  • Figure 4: Comparing generative models. When used as the proxy distribution in PORT ($\ell_{\infty}$), the DDPM model outperforms the leading generative adversarial network (GAN) on each dataset.
  • Figure 5: Calculating ARC. For every generative model and perturbation budget ($\epsilon$), we first adversarially train a binary classifier on adversarially perturbed synthetic and CIFAR-10 images. Next, we measure its robust discrimination accuracy on the validation set at the $\epsilon$ value used in training. ARC is the area under the robust discrimination accuracy vs. $\epsilon$ curve.
  • Figure 6: Validating the upper bound from Theorem 1. The green line is the upper bound calculated with the Wasserstein-2 distance. Note that Wasserstein-1 (the bound of Theorem 1) is a tighter upper bound, but it has no closed form for normal distributions.
  • ...and 10 more figures
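
The ARC computation described in the Figure 5 caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `arc_score`, the trapezoidal integration rule, and the optional `baseline` parameter are all assumptions. The caption only says that ARC is the area under the robust discrimination accuracy vs. $\epsilon$ curve; whether a chance-level accuracy is subtracted first is not stated here, so `baseline` defaults to 0.

```python
from typing import Sequence


def arc_score(
    epsilons: Sequence[float],
    accuracies: Sequence[float],
    baseline: float = 0.0,
) -> float:
    """Approximate the area under an accuracy-vs-epsilon curve.

    epsilons:   perturbation budgets at which a robust discriminator
                was trained and evaluated.
    accuracies: robust discrimination accuracy measured at each budget.
    baseline:   hypothetical offset subtracted from every accuracy
                before integrating (an assumption; default 0.0 means
                the plain area under the curve).
    """
    if len(epsilons) != len(accuracies):
        raise ValueError("epsilons and accuracies must have equal length")
    if len(epsilons) < 2:
        raise ValueError("need at least two points to integrate")

    # Sort by epsilon so the trapezoids run left to right.
    points = sorted(zip(epsilons, accuracies))

    area = 0.0
    for (e0, a0), (e1, a1) in zip(points, points[1:]):
        # Heights above the baseline, clipped at zero.
        h0 = max(a0 - baseline, 0.0)
        h1 = max(a1 - baseline, 0.0)
        # Trapezoid between two neighbouring budgets.
        area += 0.5 * (h0 + h1) * (e1 - e0)
    return area


if __name__ == "__main__":
    # Made-up numbers for illustration only, not results from the paper.
    eps = [0.0, 0.25, 0.5]
    acc = [1.0, 0.8, 0.6]
    print(arc_score(eps, acc))
```

Under this reading, a proxy distribution whose samples a robust discriminator struggles to tell apart from real images yields a curve that drops quickly as $\epsilon$ grows, and therefore a smaller area.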

Theorems & Definitions (17)

  • Definition 1: Average Robustness
  • Definition 2: Conditional Wasserstein distance
  • Theorem 1: Bounding distribution-shift penalty
  • Theorem 2
  • proof: Sketch of the proof
  • Lemma 3
  • proof
  • proof: Full proof
  • proof
  • Lemma 4
  • ...and 7 more