Privacy-Preserving Low-Rank Adaptation against Membership Inference Attacks for Latent Diffusion Models

Zihao Luo, Xilie Xu, Feng Liu, Yun Sing Koh, Di Wang, Jingfeng Zhang

TL;DR

The paper tackles privacy leakage from membership inference (MI) attacks on LoRA-adapted latent diffusion models (LDMs) by introducing MP-LoRA, a min–max defense, and its stabilized successor SMP-LoRA. MP-LoRA combines an adaptation loss with MI gain from a proxy attacker, but suffers from unstable optimization due to unconstrained local smoothness. SMP-LoRA fixes this by placing the MI gain in the denominator of the objective, yielding a gradient-norm–bounded smoothness that improves convergence and privacy protection. Empirical results on Pokemon, CelebA, and other datasets show SMP-LoRA achieves near-random MI attack performance while preserving high-quality image generation, and ablations demonstrate robustness to hyperparameters and extension to full fine-tuning and DreamBooth. The work provides a practical, theoretically grounded approach to privacy-preserving LDM personalization with publicly available code.

Abstract

Low-rank adaptation (LoRA) is an efficient strategy for adapting latent diffusion models (LDMs) on a private dataset to generate specific images by minimizing the adaptation loss. However, LoRA-adapted LDMs are vulnerable to membership inference (MI) attacks that can judge whether a particular data point belongs to the private dataset, thus leading to privacy leakage. To defend against MI attacks, we first propose a straightforward solution: Membership-Privacy-preserving LoRA (MP-LoRA). MP-LoRA is formulated as a min-max optimization problem where a proxy attack model is trained by maximizing its MI gain while the LDM is adapted by minimizing the sum of the adaptation loss and the MI gain of the proxy attack model. However, we empirically find that MP-LoRA suffers from unstable optimization, and our theoretical analysis attributes this to unconstrained local smoothness, which impedes the privacy-preserving adaptation. To mitigate this issue, we further propose Stable Membership-Privacy-preserving LoRA (SMP-LoRA), which adapts the LDM by minimizing the ratio of the adaptation loss to the MI gain. In addition, we theoretically prove that the local smoothness of SMP-LoRA can be constrained by the gradient norm, leading to improved convergence. Our experimental results corroborate that SMP-LoRA can indeed defend against MI attacks and generate high-quality images. Our code is available at https://github.com/WilliamLUO0/StablePrivateLoRA.
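
To make the difference between the two objectives concrete, the following is a minimal sketch of the scalar quantity the LDM would minimize in each case. It is not the authors' implementation. The function names, the weight `lam`, the constant `delta`, and in particular the denominator `1 - lam * mi_gain + delta` are assumptions made for illustration: the text above only says that MP-LoRA minimizes a sum and that SMP-LoRA minimizes a ratio with the MI gain in the denominator.

```python
def mp_lora_objective(adapt_loss: float, mi_gain: float, lam: float = 1.0) -> float:
    """Sum-form objective: adaptation loss plus the (weighted) MI gain.

    `lam` is a hypothetical trade-off weight.
    """
    return adapt_loss + lam * mi_gain


def smp_lora_objective(
    adapt_loss: float, mi_gain: float, lam: float = 1.0, delta: float = 1e-3
) -> float:
    """Ratio-form objective: adaptation loss divided by a term that shrinks
    as the proxy attacker's MI gain grows.

    ASSUMPTION: the denominator `1 - lam * mi_gain + delta` is chosen here so
    that a stronger attacker (larger MI gain) makes the objective larger; the
    summary above does not state the exact form.
    """
    denom = 1.0 - lam * mi_gain + delta
    if denom <= 0.0:
        raise ValueError("denominator must stay positive for the ratio to be meaningful")
    return adapt_loss / denom


# Toy numbers only: the same adaptation loss, a weaker and a stronger attacker.
weak_attacker = smp_lora_objective(0.5, mi_gain=-1.0, delta=0.0)
strong_attacker = smp_lora_objective(0.5, mi_gain=-0.5, delta=0.0)
```

In both forms the proxy attack model would be updated in an alternating step to maximize its MI gain, while the LoRA parameters are updated to minimize the value returned here.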

Paper Structure

This paper contains 46 sections, 3 theorems, 19 equations, 4 figures, 11 tables, 2 algorithms.

Key Result

Lemma 1

Let $f$ be a second-order differentiable function and $(L_0, L_1)$-smooth. If the local smoothness, quantified by the Hessian norm (the norm of the Hessian matrix), is positively correlated with the gradient norm (i.e., $L_1 > 0$), then the gradient norm upper bounds the local smoothness, facilitating ...
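
For reference, the relaxed smoothness condition that Lemma 1 builds on (Definition 1, from Zhang2019gradient) is commonly stated as below. This is a restatement from that line of work rather than from the text on this page, and the choice of the operator norm for the Hessian is an assumption.

```latex
% (L_0, L_1)-smoothness: the Hessian norm is bounded by an affine
% function of the gradient norm.
\[
  \left\| \nabla^2 f(x) \right\| \;\le\; L_0 + L_1 \left\| \nabla f(x) \right\|
  \qquad \text{for all } x .
\]
```

With $L_1 = 0$ this reduces to the usual bound $\|\nabla^2 f(x)\| \le L_0$, so the case $L_1 > 0$ is the one in which a small gradient norm also implies a small local smoothness constant.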

Figures (4)

  • Figure 1: Figure 1(a) shows the trajectory of the training loss during the adaptation process via LoRA, MP-LoRA, and SMP-LoRA on the Pokemon dataset. Figure 1(b) displays the mean and standard deviation of the gradient norms and Hessian norms for MP-LoRA and SMP-LoRA throughout the training iterations. It also presents the Pearson correlation coefficients (PCC) and p-values assessing their correlation. Note that each epoch contains 433 training iterations. Figures 1(c) and 1(d) show the generated images and a comparison of evaluation metrics, including FID score and MI attack success rate (ASR). MP-LoRA preserves membership privacy but compromises image generation capability. In contrast, SMP-LoRA preserves membership privacy while maintaining the quality of the generated images, demonstrating its effectiveness in defending against MI attacks without significant loss of functionality. Additional generated images are visualized in the appendix.
  • Figure 2: Figure 2(a) shows the mean and standard deviation of the gradient scales obtained from the training loss throughout the training iterations of LoRA, MP-LoRA, and SMP-LoRA on the Pokemon dataset. Figures 2(b) and 2(c) display the gradient scales obtained from the adaptation loss and the MI gain for MP-LoRA and SMP-LoRA, respectively. Compared with MP-LoRA, SMP-LoRA has more stable gradients and a controlled gradient scale.
  • Figure 3: ROC curves for SMP-LoRA with different $\lambda$ on the Pokemon, CelebA_Small, and CelebA_Large datasets.
  • Figure 4: Generated results on the CelebA_Small and CelebA_Large datasets. Each column of three images is generated using the same text prompt.
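
The Figure 1 caption reports Pearson correlation coefficients between gradient norms and Hessian norms. A minimal sketch of that statistic follows, using only the standard library. The function name and the sample norm values are hypothetical and are not taken from the paper; the p-value mentioned in the caption is not computed here.

```python
import math
from typing import Sequence


def pearson_correlation(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Unnormalized covariance and variances; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-iteration norms, for illustration only.
grad_norms = [0.5, 1.0, 1.5, 2.0]
hessian_norms = [1.2, 2.1, 3.2, 3.9]
pcc = pearson_correlation(grad_norms, hessian_norms)
```

A coefficient close to 1 on real training traces would be consistent with the Hessian norm growing with the gradient norm, which is the situation Lemma 1 addresses.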

Theorems & Definitions (6)

  • Definition 1: Relaxed Smoothness Condition from Zhang2019gradient
  • Lemma 1: Zhang2019gradient
  • Proposition 1
  • proof
  • Proposition 2
  • proof