Differentially Private Fine-Tuning of Diffusion Models
Yu-Lin Tsai, Yizhe Li, Zekai Chen, Po-Yu Chen, Chia-Mu Yu, Xuebin Ren, Francois Buet-Golfouse
TL;DR
This work addresses privacy risks in diffusion-model training by developing DP-LoRA, a parameter-efficient method for differentially private (DP) fine-tuning that minimizes the number of trainable parameters while preserving high-quality image synthesis. The approach uses a two-stage pipeline: public pretraining of an autoencoder and latent diffusion model, followed by private fine-tuning with Low-Rank Adaptation (LoRA) modules added to the attention and projection layers and trained with DP-SGD. Comprehensive experiments across MNIST, CIFAR-10, Fashion-MNIST, CelebA, and CelebA-HQ demonstrate state-of-the-art performance in DP synthesis, with strong results in both conditional and unconditional generation and favorable privacy-utility trade-offs. The findings highlight the value of parameter-efficient fine-tuning for scalable, privacy-preserving generative models and offer practical guidance on which components to optimize, achieving robust results under small privacy budgets and enabling rapid adaptation to downstream tasks.
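To make the parameter-efficiency argument concrete, the following is a minimal NumPy sketch of a LoRA-augmented linear layer. It is an illustration under stated assumptions, not the authors' implementation: the class name `LoRALinear`, the rank of 4, the scaling `alpha / rank`, and the 64x64 layer size are all hypothetical choices made for this example.

```python
import numpy as np


class LoRALinear:
    """Frozen dense layer W plus a trainable low-rank update (alpha / rank) * B @ A.

    Hypothetical illustration: names, rank, and scaling are assumptions.
    """

    def __init__(self, weight, rank=4, alpha=4.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight  # frozen, e.g. from public pretraining
        self.A = rng.normal(0.0, 0.01, size=(rank, in_dim))  # trainable
        self.B = np.zeros((out_dim, rank))  # trainable, zero init: update starts at 0
        self.scale = alpha / rank

    def __call__(self, x):
        # x: (batch, in_dim) -> (batch, out_dim)
        return x @ self.weight.T + self.scale * (x @ self.A.T) @ self.B.T

    def num_trainable(self):
        # Only the low-rank factors would receive private gradient updates.
        return self.A.size + self.B.size


rng = np.random.default_rng(1)
W = rng.normal(size=(64, 64))
layer = LoRALinear(W, rank=4)
x = rng.normal(size=(8, 64))
y = layer(x)
```

In this toy setting the frozen weight has 4096 entries while the two low-rank factors have 512 in total, so only a small fraction of the parameters would need per-example clipping and noise during private fine-tuning.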
Abstract
The integration of Differential Privacy (DP) with diffusion models (DMs) presents a promising yet challenging frontier, particularly because the substantial memorization capabilities of DMs pose significant privacy risks. Differential privacy offers a rigorous framework for safeguarding individual data points during model training, with Differentially Private Stochastic Gradient Descent (DP-SGD) being a prominent implementation. Diffusion models decompose image generation into iterative steps, which in theory aligns well with DP's incremental noise addition. Despite this natural fit, the unique architecture of DMs necessitates tailored approaches to effectively balance the privacy-utility trade-off. Recent developments in this field have highlighted the potential for generating high-quality synthetic data by pre-training on public data (e.g., ImageNet) and fine-tuning on private data. However, there is a pronounced gap in research on optimizing the trade-offs involved in DP settings, particularly concerning parameter efficiency and model scalability. Our work addresses this gap by proposing a parameter-efficient fine-tuning strategy optimized for private diffusion models, which minimizes the number of trainable parameters to enhance the privacy-utility trade-off. We empirically demonstrate that our method achieves state-of-the-art performance in DP synthesis, significantly surpassing previous benchmarks on widely studied datasets (e.g., with only 0.47M trainable parameters, it achieves a more than 35% improvement over the previous state-of-the-art with a small privacy budget on the CelebA-64 dataset). Anonymized code is available at https://anonymous.4open.science/r/DP-LORA-F02F.
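For readers unfamiliar with DP-SGD, the update it performs can be sketched in a few lines of NumPy. This is a generic textbook-style illustration, not the training code of this work: the function name `dp_sgd_step`, the toy gradients, and all hyperparameter values are assumptions, and privacy accounting (tracking the spent budget) is omitted.

```python
import numpy as np


def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One DP-SGD update: clip each example's gradient, sum, add Gaussian noise, average.

    Hypothetical helper for illustration; it does no privacy accounting.
    """
    batch_size = per_example_grads.shape[0]
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    # Rescale each gradient so its L2 norm is at most clip_norm.
    factors = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * factors
    # Gaussian noise with standard deviation noise_multiplier * clip_norm.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / batch_size
    return params - lr * noisy_mean, clipped


rng = np.random.default_rng(0)
params = np.zeros(3)
grads = np.array([[3.0, 4.0, 0.0],   # norm 5.0, will be clipped to norm 1.0
                  [0.3, 0.4, 0.0]])  # norm 0.5, left unchanged

# noise_multiplier=0.0 disables the noise so the clipping effect is easy to inspect.
no_noise_params, clipped = dp_sgd_step(params, grads, lr=0.1, clip_norm=1.0,
                                       noise_multiplier=0.0, rng=rng)

# A private run would use a positive noise multiplier (1.0 is an arbitrary choice here).
noisy_params, _ = dp_sgd_step(params, grads, lr=0.1, clip_norm=1.0,
                              noise_multiplier=1.0, rng=rng)
```

Because the noise is added once per parameter, its total magnitude grows with the number of trainable parameters, which is one intuition for why shrinking the trainable set can improve the privacy-utility trade-off.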
