Enhancing Privacy in ControlNet and Stable Diffusion via Split Learning

Dixi Yao

TL;DR

A privacy-preserving activation function and a method to prevent private text prompts from leaving clients, both tailored for image generation with diffusion models, are proposed.

Abstract

With the emerging trend of large generative models, ControlNet is introduced to enable users to fine-tune pre-trained models with their own data for various use cases. A natural question arises: how can we train ControlNet models while ensuring users' data privacy across distributed devices? Exploring different distributed training schemes, we find conventional federated learning and split learning unsuitable. Instead, we propose a new distributed learning structure that eliminates the need for the server to send gradients back. Through a comprehensive evaluation of existing threats, we discover that in the context of training ControlNet with split learning, most existing attacks are ineffective, except for two mentioned in previous literature. To counter these threats, we leverage the properties of diffusion models and design a new timestep sampling policy during forward processes. We further propose a privacy-preserving activation function and a method to prevent private text prompts from leaving clients, tailored for image generation with diffusion models. Our experimental results demonstrate that our algorithms and systems greatly enhance the efficiency of distributed training for ControlNet while ensuring users' data privacy without compromising image generation quality.

Paper Structure (38 sections, 1 theorem, 9 equations, 11 figures, 4 tables)

This paper contains 38 sections, 1 theorem, 9 equations, 11 figures, 4 tables.

Key Result

Theorem 6.1

($(\epsilon,\Delta)$-$\rm{LDP}$ timestep sampling mechanism in diffusion model) With a given privacy budget $\epsilon$, we can have a sampling process in diffusion model which is $(\epsilon,\Delta)$-$\rm{LDP}$. The value of $\epsilon$ is set by a timestep ranging in $[t_s,t_{\max}]$ and scheduling p

Figures (11)

  • Figure 1: Examples of images output from the decoder corresponding to latent representation at different timesteps during forward and sampling process.
  • Figure 2: The image on the right is generated from the ControlNet with a condition image on the left. The condition image is an image of depth maps. The text prompt is: Stormtrooper's lecture.
  • Figure 3: The image on the right is generated from the ControlNet trained by FedAvg using the condition image on the left. The text prompt is: A skier poses for a shot on the night time slopes.
  • Figure 4: The deployment of ControlNet across the clients and the server under split learning framework and our proposed privacy-preserving framework during training. Function $f$ is $z_t=f(z_0,t,\hat{n})=\sqrt{\alpha_t}z_0+\sqrt{1-\alpha_t}\hat{n}$.
  • Figure 5: The illustration of different inversion attacks.
  • ...and 6 more figures
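
The function $f$ in the Figure 4 caption, $z_t=\sqrt{\alpha_t}z_0+\sqrt{1-\alpha_t}\hat{n}$, can be illustrated with a minimal sketch. Several things here are assumptions rather than details taken from the paper: $\alpha_t$ is read as the cumulative signal-retention coefficient, the schedule is a linear beta schedule in the style of DDPM, and the names `make_alpha_schedule` and `forward_noise` are hypothetical. The latent is a toy list of floats, not a real Stable Diffusion latent.

```python
import math
import random


def make_alpha_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Return cumulative coefficients alpha_t for t = 0..num_steps-1.

    Assumption: a linear beta schedule; the caption does not state
    which schedule the paper uses.
    """
    alphas = []
    running = 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alphas.append(running)
    return alphas


def forward_noise(z0, t, noise, alphas):
    """Apply z_t = sqrt(alpha_t) * z_0 + sqrt(1 - alpha_t) * noise elementwise."""
    a = alphas[t]
    signal, sigma = math.sqrt(a), math.sqrt(1.0 - a)
    return [signal * z + sigma * n for z, n in zip(z0, noise)]


rng = random.Random(0)
alphas = make_alpha_schedule()
z0 = [1.0, -0.5, 0.25, 2.0]  # toy latent, for illustration only
noise = [rng.gauss(0.0, 1.0) for _ in z0]

z_early = forward_noise(z0, 10, noise, alphas)   # still close to z0
z_late = forward_noise(z0, 999, noise, alphas)   # dominated by the noise term
```

Under this assumed schedule, $\alpha_t$ shrinks as $t$ grows, so the latent at a late timestep carries little of $z_0$. This is the property that a timestep range such as $[t_s,t_{\max}]$ in Theorem 6.1 would plausibly rely on, though the exact mechanism is not recoverable from the text here.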

Theorems & Definitions (3)

  • Definition 3.1
  • Definition 3.2
  • Theorem 6.1