Risks When Sharing LoRA Fine-Tuned Diffusion Model Weights

Dixi Yao

TL;DR

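This paper asks whether the private images used to fine-tune a diffusion model with LoRA are leaked when the model weights are shared. It designs and builds a variational network autoencoder that takes model weights as input and outputs reconstructions of the private images, trained with a paradigm based on timestep embedding. The answer is surprising: an adversary can generate images containing the same identities as the private images.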

Abstract

With the emerging trend in generative models and convenient public access to diffusion models pre-trained on large datasets, users can fine-tune these models to generate images of personal faces or items in new contexts described by natural language. Parameter-efficient fine-tuning (PEFT) such as Low-Rank Adaptation (LoRA) has become the most common way to save memory and computation on the user end during fine-tuning. However, a natural question is whether the private images used for fine-tuning will be leaked to adversaries when the model weights are shared. In this paper, we study the issue of privacy leakage of a fine-tuned diffusion model in a practical setting, where adversaries only have access to model weights, rather than the prompts or images used for fine-tuning. We design and build a variational network autoencoder that takes model weights as input and outputs the reconstruction of private images. To improve the efficiency of training such an autoencoder, we propose a training paradigm with the help of timestep embedding. The results give a surprising answer to this research question: an adversary can generate images containing the same identities as the private images. Furthermore, we demonstrate that no existing defense method, including differential privacy-based methods, can preserve the privacy of the private data used for fine-tuning a diffusion model without compromising the utility of the fine-tuned model.
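For readers unfamiliar with LoRA, the following is a minimal sketch of the standard low-rank update, W' = W + scale * (B A), which is what a user would share after fine-tuning. It is an illustration under that assumption only, not the paper's code; the function names, the `scale` parameter, and the layer shape passed to `param_counts` are hypothetical.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def lora_adapt(W, A, B, scale=1.0):
    """Return W + scale * (B @ A), the adapted weight under standard LoRA.

    W: frozen pre-trained weight, shape (d_out, d_in)
    B: trainable factor, shape (d_out, rank)
    A: trainable factor, shape (rank, d_in)
    """
    delta = matmul(B, A)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(W, delta)
    ]


def param_counts(d_out, d_in, rank):
    """Return (full fine-tuning parameters, LoRA parameters) for one layer."""
    return d_out * d_in, rank * (d_in + d_out)


# Tiny rank-1 example: only A and B would be trained and shared.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
B = [[1.0], [2.0]]
A = [[0.5, 0.0, -1.0]]
W_adapted = lora_adapt(W, A, B)

# Hypothetical layer shape, chosen only to show the size gap.
full_params, lora_params = param_counts(d_out=320, d_in=768, rank=4)
```

The factors `A` and `B` are far smaller than `W`, which is why sharing them is convenient; the threat model in the abstract assumes these weights are all the adversary sees.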

Paper Structure (46 sections, 1 theorem, 2 equations, 14 figures, 5 tables, 3 algorithms)

Introduction
Background and Related Work
Text-conditioned Diffusion Model
Fine-Tuning Diffusion Model With Dreambooth and LoRA
Memorization in Diffusion Model
Data Reconstruction from Model Weights
Defense Against Data Reconstruction
Motivation and Threat Model
Memorization During Fine-Tuning
Privacy Risks
Definition of Privacy Leakage.
Assumptions in the Previous Literature.
Our Assumptions.
Threat Model
Quantify the Threats
...and 31 more sections

Key Result

Theorem 6.1

A randomized mechanism $\mathcal{M}$ mapping data from $\mathcal{X}$ to $\mathcal{Y}$ is $(\epsilon,\delta)$-differentially private if, for any two datasets $x, x'\in \mathcal{X}$ that differ in at most one entry, and for any output set $y\subseteq\mathcal{Y}$, it holds that
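The inequality itself is missing from this excerpt. Assuming the theorem restates the standard definition of $(\epsilon,\delta)$-differential privacy, the condition would read:

$$
\Pr[\mathcal{M}(x)\in y] \le e^{\epsilon}\,\Pr[\mathcal{M}(x')\in y] + \delta
$$

Here a smaller $\epsilon$ means the output distributions on neighboring datasets are harder to tell apart, and $\delta$ bounds the probability with which that guarantee may fail.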

Figures (14)

Figure 1: The private images used for fine-tuning Stable Diffusion v1.4, covering different types of images.
Figure 2: The overall framework of fine-tuning a diffusion model and our attack method.
Figure 3: The structure of the neural network encoder.
Figure 4: The structure of the matrix encoder, where the corresponding LoRA matrix has dimensions $W\times H$. The kernel size of the convolution layer is 3.
Figure 5: Images generated by Stable Diffusion v1.4 fine-tuned on images of Elon Musk, and reconstructions by different attack methods.
...and 9 more figures

Theorems & Definitions (1)

Theorem 6.1
