Privacy-Preserving Low-Rank Adaptation against Membership Inference Attacks for Latent Diffusion Models
Zihao Luo, Xilie Xu, Feng Liu, Yun Sing Koh, Di Wang, Jingfeng Zhang
TL;DR
The paper tackles privacy leakage from membership inference (MI) attacks on LoRA-adapted latent diffusion models (LDMs) by introducing MP-LoRA, a min-max defense, and its stabilized successor SMP-LoRA. MP-LoRA adapts the LDM by minimizing the sum of the adaptation loss and the MI gain of a proxy attacker, but its optimization is unstable because the local smoothness of the objective is unconstrained. SMP-LoRA addresses this by placing the MI gain in the denominator of the objective, so that the local smoothness is bounded by the gradient norm, which improves convergence and privacy protection. Empirical results on Pokemon, CelebA, and other datasets show that SMP-LoRA reduces MI attack performance to near-random levels while preserving high-quality image generation. Ablations demonstrate robustness to hyperparameters and show that the approach extends to full fine-tuning and DreamBooth. The work provides a practical, theoretically grounded approach to privacy-preserving LDM personalization, with publicly available code.
Abstract
Low-rank adaptation (LoRA) is an efficient strategy for adapting latent diffusion models (LDMs) on a private dataset to generate specific images by minimizing the adaptation loss. However, LoRA-adapted LDMs are vulnerable to membership inference (MI) attacks, which can judge whether a particular data point belongs to the private dataset and thus lead to privacy leakage. To defend against MI attacks, we first propose a straightforward solution: Membership-Privacy-preserving LoRA (MP-LoRA). MP-LoRA is formulated as a min-max optimization problem in which a proxy attack model is trained by maximizing its MI gain, while the LDM is adapted by minimizing the sum of the adaptation loss and the MI gain of the proxy attack model. However, we empirically find that MP-LoRA suffers from unstable optimization, and our theoretical analysis suggests that a potential reason is its unconstrained local smoothness, which impedes privacy-preserving adaptation. To mitigate this issue, we further propose Stable Membership-Privacy-preserving LoRA (SMP-LoRA), which adapts the LDM by minimizing the ratio of the adaptation loss to the MI gain. Moreover, we theoretically prove that the local smoothness of SMP-LoRA can be constrained by the gradient norm, leading to improved convergence. Our experimental results corroborate that SMP-LoRA can indeed defend against MI attacks while generating high-quality images. Our code is available at https://github.com/WilliamLUO0/StablePrivateLoRA.
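
The two objectives described above can be sketched in symbols. The notation here is an assumption made for illustration and does not appear in the abstract: $\theta$ denotes the LoRA parameters of the LDM, $\omega$ the parameters of the proxy attack model, $\mathcal{L}_{\mathrm{ada}}(\theta)$ the adaptation loss, and $G(\omega;\theta)$ the MI gain of the proxy attack model. The abstract states only that SMP-LoRA minimizes a ratio of the adaptation loss to the MI gain, so the denominator is written as a placeholder function $D(\cdot)$ of the MI gain rather than a specific expression.

```latex
% Sketch under assumed notation; D(.) is a hypothetical placeholder for the
% MI-gain-dependent denominator, whose exact form is not given in the abstract.
\begin{align}
  % Proxy attack model: trained to maximize its MI gain.
  \omega^{\star}(\theta) &\in \arg\max_{\omega} \; G(\omega;\theta) \\
  % MP-LoRA: adapt the LDM by minimizing adaptation loss plus MI gain.
  \text{MP-LoRA:}\quad
  &\min_{\theta} \; \mathcal{L}_{\mathrm{ada}}(\theta)
     + G\bigl(\omega^{\star}(\theta);\theta\bigr) \\
  % SMP-LoRA: adapt the LDM by minimizing a ratio with the MI gain in the denominator.
  \text{SMP-LoRA:}\quad
  &\min_{\theta} \;
     \frac{\mathcal{L}_{\mathrm{ada}}(\theta)}
          {D\bigl(G(\omega^{\star}(\theta);\theta)\bigr)}
\end{align}
```

In this reading, both methods share the same inner maximization over the proxy attack model and differ only in how the MI gain enters the outer minimization: additively for MP-LoRA, and through the denominator for SMP-LoRA.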
