Improved Diffusion-based Generative Model with Better Adversarial Robustness
Zekun Wang, Mingyang Yi, Shuchen Xue, Zhenguo Li, Ming Liu, Bing Qin, Zhi-Ming Ma
TL;DR
This work tackles a key bottleneck in diffusion-based generative modeling: the distribution mismatch between training, where the model denoises inputs built from ground-truth data, and sampling, where each step takes the model's own previous output as input. By casting the problem in a Distributionally Robust Optimization (DRO) framework, the authors show that robustness to distributional perturbations is equivalent to Adversarial Training (AT) for Diffusion Probabilistic Models (DPMs), and they extend the same reasoning to Consistency Models (CMs). They derive a DRO-based objective that leads to an adversarial noise-prediction formulation with provable error bounds, and they translate it into efficient Free-AT implementations for both DPMs and CMs. Empirically, adversarial training yields substantial improvements in sample quality (lower FID) and robustness on CIFAR-10, ImageNet 64×64, and MS-COCO 512×512, including latent consistency settings, without sacrificing convergence. The results suggest that AT is a practical, scalable way to make diffusion-based generative systems more robust, with applicability to image synthesis and text-to-image tasks.
Abstract
Diffusion Probabilistic Models (DPMs) have achieved significant success in generative tasks. However, their training and sampling processes suffer from distribution mismatch: during denoising, the input data distributions differ between the training and inference stages, potentially leading to inaccurate data generation. To address this, we analyze the training objective of DPMs and theoretically demonstrate that the mismatch can be alleviated through Distributionally Robust Optimization (DRO), which is equivalent to performing robustness-driven Adversarial Training (AT) on DPMs. Furthermore, for the recently proposed Consistency Model (CM), which distills the inference process of the DPM, we prove that its training objective also suffers from the mismatch issue, and that this issue can likewise be mitigated by AT. Based on these insights, we propose to conduct efficient AT on both DPMs and CMs. Finally, extensive empirical studies validate the effectiveness of AT in diffusion-based models. The code is available at https://github.com/kugwzk/AT_Diff.
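To make the idea of adversarial training on a noise-prediction objective concrete, the following is a minimal illustrative sketch, not the authors' implementation (see their repository for that). It assumes a toy linear noise predictor `W @ x` in place of a neural network, an L-infinity perturbation ball of hypothetical radius `radius`, and a single sign-gradient ascent step; all function names here are invented for illustration. It compares the standard noise-prediction loss at a noised input `x_t` with the loss at an adversarially perturbed input `x_t + delta`.

```python
import numpy as np

def forward_noising(x0, eps, alpha_bar):
    """Standard DDPM-style forward step: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps."""
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

def noise_loss(W, x, eps):
    """Squared error of a toy linear noise predictor eps_hat = W @ x (assumption)."""
    r = W @ x - eps
    return float(r @ r)

def adversarial_noise_loss(W, x_t, eps, radius):
    """Perturb x_t with one sign-gradient ascent step inside an L-infinity ball."""
    # Analytic gradient of ||W x - eps||^2 with respect to the input x.
    grad = 2.0 * W.T @ (W @ x_t - eps)
    # Each coordinate of delta has magnitude at most `radius`.
    delta = radius * np.sign(grad)
    return noise_loss(W, x_t + delta, eps), delta

rng = np.random.default_rng(0)
d = 8
W = rng.normal(size=(d, d)) / np.sqrt(d)  # hypothetical stand-in for a network
x0 = rng.normal(size=d)                   # toy "clean" sample
eps = rng.normal(size=d)                  # ground-truth noise
x_t = forward_noising(x0, eps, alpha_bar=0.5)

clean = noise_loss(W, x_t, eps)
adv, delta = adversarial_noise_loss(W, x_t, eps, radius=0.05)
print(f"clean loss: {clean:.4f}, adversarial loss: {adv:.4f}")
```

Because the toy loss is convex in the input, the perturbed loss is never smaller than the clean one. In an actual training loop, the model parameters would be updated to minimize the perturbed loss; a real network would obtain the input gradient by automatic differentiation rather than in closed form.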
