Efficient Differentially Private Fine-Tuning of Diffusion Models
Jing Liu, Andrew Lowy, Toshiaki Koike-Akino, Kieran Parsons, Ye Wang
TL;DR
The paper tackles the high resource cost of differentially private fine-tuning of diffusion models by introducing DP-LoDA, a parameter-efficient approach that attaches Low-Dimensional Adaptation (LoDA) adapters to convolutional layers while keeping the base diffusion model frozen. The adapters are trained with DP-SGD on private data, and the fine-tuned model produces class-conditioned synthetic samples that are used to train downstream classifiers. Experiments on CIFAR-10 and MNIST show that DP-LoDA outperforms standard DP-SGD and is competitive with fully fine-tuned DP diffusion methods, with larger gains when private data is limited. The work demonstrates that privacy-preserving fine-tuning of diffusion models can be both computationally efficient and practically useful for downstream tasks under $(\varepsilon, \delta)$ privacy budgets, and it suggests future exploration with Latent Diffusion Models and broader impact analyses.
Abstract
Recent developments in Diffusion Models (DMs) enable the generation of astonishingly high-quality synthetic samples. Recent work showed that synthetic samples generated by a diffusion model that is pre-trained on public data and fully fine-tuned with differential privacy on private data can be used to train a downstream classifier while achieving a good privacy-utility tradeoff. However, fully fine-tuning such large diffusion models with DP-SGD can be very resource-demanding in terms of memory usage and computation. In this work, we investigate Parameter-Efficient Fine-Tuning (PEFT) of diffusion models using Low-Dimensional Adaptation (LoDA) with Differential Privacy. We evaluate the proposed method on the MNIST and CIFAR-10 datasets and demonstrate that such efficient fine-tuning can also generate useful synthetic samples for training downstream classifiers, with guaranteed privacy protection of the fine-tuning data. Our source code will be made available on GitHub.
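To make the idea of training only a small adapter with DP-SGD more concrete, the following is a minimal NumPy sketch and not the authors' implementation. It makes several assumptions: a frozen linear layer `W` stands in for a convolutional layer of the diffusion model, the adapter is a hypothetical low-dimensional bottleneck of the form `B @ tanh(A @ x)`, a squared-error loss replaces the diffusion training objective, and all function and variable names are illustrative. The only part taken directly from standard DP-SGD is the aggregation step: clip each per-sample gradient, sum, add Gaussian noise, and average. No privacy accounting is performed, so the sketch does not compute $(\varepsilon, \delta)$.

```python
import numpy as np

def adapter_forward(x, W, A, B):
    # Frozen base layer plus a trainable low-dimensional adapter branch (assumed form).
    h = np.tanh(A @ x)
    return W @ x + B @ h, h

def per_sample_grad(x, t, W, A, B):
    # Gradient of 0.5 * ||y - t||^2 with respect to the adapter parameters A and B only.
    y, h = adapter_forward(x, W, A, B)
    e = y - t
    grad_B = np.outer(e, h)
    dz = (B.T @ e) * (1.0 - h ** 2)  # backprop through tanh
    grad_A = np.outer(dz, x)
    return np.concatenate([grad_A.ravel(), grad_B.ravel()])

def dp_aggregate(grads, clip_norm, noise_multiplier, rng):
    # DP-SGD aggregation: clip each row to L2 norm <= clip_norm, sum,
    # add Gaussian noise with std noise_multiplier * clip_norm, then average.
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = grads * scale
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=grads.shape[1])
    return (clipped.sum(axis=0) + noise) / grads.shape[0]

def dp_sgd_step(X, T, W, A, B, lr, clip_norm, noise_multiplier, rng):
    # One noisy update of the adapter; the base weights W are never modified.
    grads = np.stack([per_sample_grad(x, t, W, A, B) for x, t in zip(X, T)])
    g = dp_aggregate(grads, clip_norm, noise_multiplier, rng)
    g_A = g[:A.size].reshape(A.shape)
    g_B = g[A.size:].reshape(B.shape)
    return A - lr * g_A, B - lr * g_B

# Toy setup: 8x8 frozen layer, bottleneck dimension 2, batch of 16 samples.
rng = np.random.default_rng(0)
d_in, d_out, r, n = 8, 8, 2, 16
W = rng.normal(size=(d_out, d_in))        # frozen "pre-trained" weights
W_before = W.copy()
A = rng.normal(scale=0.1, size=(r, d_in)) # trainable down-projection
B = np.zeros((d_out, r))                  # trainable up-projection, zero-initialized
X = rng.normal(size=(n, d_in))
T = rng.normal(size=(n, d_out))

A_new, B_new = dp_sgd_step(X, T, W, A, B, lr=0.1, clip_norm=1.0,
                           noise_multiplier=1.0, rng=rng)
```

In this toy setup the adapter has `A.size + B.size` trainable parameters, fewer than the `W.size` parameters of the frozen layer, so the per-sample gradients that DP-SGD must clip and perturb are correspondingly smaller. This is the kind of memory and computation saving that motivates parameter-efficient fine-tuning, although the actual savings depend on the real architecture and adapter design.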
