Learning Differentially Private Diffusion Models via Stochastic Adversarial Distillation
Bochao Liu, Pengju Wang, Shiming Ge
TL;DR
Privacy is a major barrier to releasing synthetic data from sensitive domains. DP-SAD introduces a three-component diffusion framework consisting of a private teacher, a private student, and a discriminator, trained through a combination of adversarial and stochastic distillation steps that use the diffusion time steps $T$ to dilute DP noise. The DP guarantee is established with a Gaussian mechanism applied to clipped gradients and is analyzed under Rényi differential privacy. Empirically, DP-SAD outperforms 11 baselines on MNIST, FMNIST, and CelebA in both perceptual metrics and downstream classifier performance, showing that private diffusion-based data generation is feasible with practical training efficiency.
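For reference, the two standard Rényi DP facts that such an analysis typically rests on are stated below. These are general results, not the paper's specific bound, and the symbols $\Delta$ (L2 sensitivity, here the clipping norm), $\sigma$ (noise standard deviation), $\alpha$ (Rényi order), and $\delta$ are notation assumed for this note. A Gaussian mechanism with L2 sensitivity $\Delta$ and noise $\mathcal{N}(0, \sigma^2 I)$ satisfies, for every $\alpha > 1$,

$$
\left(\alpha,\ \frac{\alpha \Delta^2}{2\sigma^2}\right)\text{-RDP},
$$

and any mechanism that is $(\alpha, \varepsilon)$-RDP is also

$$
\left(\varepsilon + \frac{\log(1/\delta)}{\alpha - 1},\ \delta\right)\text{-DP} \quad \text{for all } \delta \in (0, 1).
$$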
Abstract
While the success of deep learning relies on large amounts of training data, data is often limited in privacy-sensitive domains. To address this challenge, generative model learning with differential privacy has emerged as a way to train private generative models for desensitized data generation. However, the quality of the images generated by existing methods is limited by the complexity of modeling the data distribution. We build on the success of diffusion models and introduce DP-SAD, which trains a private diffusion model via stochastic adversarial distillation. Specifically, we first train a diffusion model as a teacher and then train a student by distillation, achieving differential privacy by adding noise to the gradients passed from the other models to the student. For better generation quality, we introduce a discriminator that distinguishes whether an image comes from the teacher or the student, which yields an adversarial training scheme. Extensive experiments and analysis demonstrate the effectiveness of the proposed method.
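The noise-on-gradients step can be illustrated with a minimal sketch of the generic clip-then-add-Gaussian-noise recipe. This is not the authors' implementation: the function names `clip_gradient` and `privatize_gradients`, the parameters `clip_norm` and `noise_multiplier`, and the use of plain Python lists for gradients are all assumptions made for illustration.

```python
import math
import random


def clip_gradient(grad, clip_norm):
    """Rescale grad so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def privatize_gradients(per_sample_grads, clip_norm, noise_multiplier, rng):
    """Clip each per-sample gradient, sum them, add Gaussian noise, and average.

    Hypothetical helper: the noise standard deviation is
    noise_multiplier * clip_norm, the usual Gaussian-mechanism scaling.
    """
    clipped = [clip_gradient(g, clip_norm) for g in per_sample_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    std = noise_multiplier * clip_norm
    noisy = [s + rng.gauss(0.0, std) for s in summed]
    return [v / len(clipped) for v in noisy]


if __name__ == "__main__":
    rng = random.Random(0)
    grads = [[3.0, 4.0], [0.1, -0.2], [-1.0, 0.5]]
    print(privatize_gradients(grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng))
```

Clipping bounds how much any single example can change the summed gradient, which is what lets the Gaussian noise scale be tied to `clip_norm`. In a setup like the one described above, a step of this kind would be applied to the gradients reaching the student.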
