Real-time Identity Defenses against Malicious Personalization of Diffusion Models

Hanzhong Guo, Shen Nie, Chao Du, Tianyu Pang, Hao Sun, Chongxuan Li

TL;DR

The paper tackles the social and security risks of identity replication by personalized diffusion models and introduces RID, a real-time defender that generates protective perturbations in a single forward pass. RID relies on Adv-SDS and a regularization term, trained on a large multi-person face dataset, to disrupt model personalization while preserving image quality. An ensemble extension (RID-Ensemble) further enhances robustness against black-box attackers and post-processing. With edge-friendly inference speeds and competitive protection, RID offers a practical solution for safeguarding portrait rights in real-world, real-time scenarios.

Abstract

Personalized generative diffusion models, capable of synthesizing highly realistic images based on a few reference portraits, may pose substantial social, ethical, and legal risks via identity replication. Existing defense mechanisms rely on computationally intensive adversarial perturbations tailored to individual images, rendering them impractical for real-world deployment. This study introduces the Real-time Identity Defender (RID), a neural network designed to generate adversarial perturbations through a single forward pass, bypassing the need for image-specific optimization. RID achieves unprecedented efficiency, with defense times as low as 0.12 seconds on a single NVIDIA A100 80G GPU (4,400 times faster than leading methods) and 1.1 seconds per image on a standard Intel i9 CPU, making it suitable for edge devices such as smartphones. Despite its efficiency, RID achieves promising protection performance across visual and quantitative benchmarks, effectively mitigating identity replication risks. Our analysis reveals that RID's perturbations mimic the efficacy of traditional defenses while exhibiting properties distinct from natural noise, such as Gaussian perturbations. To enhance robustness, we extend RID into an ensemble framework that integrates multiple pre-trained text-to-image diffusion models, ensuring resilience against black-box attacks and post-processing techniques, including image compression and purification. Our model is envisioned to play a crucial role in safeguarding portrait rights, thereby preventing illegal and unethical uses.
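To make the "single forward pass" idea concrete, the following is a minimal sketch of how a trained defender could be applied at inference time. It is not the authors' implementation: `toy_defender` is a hypothetical stand-in that returns seeded random noise instead of a learned perturbation, the image is simplified to a flat list of pixel values in [0, 1], and the default budget of 8/255 is assumed to act as a per-pixel bound, borrowing one of the perturbation levels the paper reports in its robustness figure.

```python
import random


def toy_defender(pixels):
    """Hypothetical stand-in for a trained defender network (NOT RID itself).

    A real defender would map the image to a learned perturbation in one
    forward pass; here we return seeded Gaussian noise for illustration only.
    """
    rng = random.Random(0)
    return [rng.gauss(0.0, 1.0) for _ in pixels]


def defend(pixels, defender, eps=8 / 255):
    """Apply a bounded perturbation produced by a single defender call.

    Assumption: the budget `eps` bounds each pixel's change (an L-infinity
    style constraint), and pixel values live in [0, 1].
    """
    raw = defender(pixels)  # one forward pass, no per-image optimization
    defended = []
    for value, delta in zip(pixels, raw):
        delta = max(-eps, min(eps, delta))                  # enforce the budget
        defended.append(max(0.0, min(1.0, value + delta)))  # keep a valid pixel
    return defended


pixels = [0.0, 0.25, 0.5, 1.0]
defended = defend(pixels, toy_defender)
max_change = max(abs(a - b) for a, b in zip(defended, pixels))
```

The contrast with optimization-based defenses is that `defend` calls the defender once per image rather than running an iterative gradient loop, which is where the reported speedup would come from.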

Paper Structure

This paper contains 18 sections, 19 equations, 8 figures, 1 table.

Figures (8)

  • Figure 1: Overview of personalized T2I diffusion models and our real-time identity defender (RID). In all panels, the flame icon indicates trainable parameters. a, Personalized T2I models learn personal identity efficiently by fine-tuning a pre-trained T2I model (e.g., Stable Diffusion) on a few portraits. b, Personalized T2I models can generate high-fidelity images by combining the learned identity and other concepts following the text prompt. c, Existing defense methods optimize an individual adversarial perturbation for each image against personalized T2I techniques. d, RID trains a defender on a face dataset of multiple persons via adversarial score distillation sampling and regularization (detailed in the Method section). e, RID defends a new testing image by generating the corresponding perturbation through an efficient forward pass. f, RID successfully prevents personalized T2I techniques from learning personal identities in terms of visual perception. All facial images used in this figure are sourced from publicly available datasets and are permitted for academic purposes.
  • Figure 1: Comparison of training pipelines: existing optimization-based methods vs. our real-time identity defender (RID). a, Optimization-based methods individually optimize perturbations for each image via continuous gradient ascent, resulting in significant computational overhead and prolonged defense time. b, RID's training framework employs a DiT network that learns to generate image-specific perturbations. This process is guided by two key loss functions: (1) the adversarial score distillation loss (Adv-SDS) incorporates pre-trained model priors to increase the diffusion robustness of defended images, and (2) the regression loss matches RID-generated perturbations with those from precomputed optimization-based methods for ten percent of the data. All facial images used in this figure are sourced from publicly available datasets and are permitted for academic purposes.
  • Figure 2: RID defends the identities against malicious image generation efficiently and effectively under various metrics. a, The RID-defended image closely resembles the clean image (left column). Personalized diffusion models trained on clean images accurately retain identity across three prompts (top row), while models trained on RID-defended images produce distorted outputs with reduced identifiable features (bottom row). b, RID achieves significantly faster defense speeds, with processing times of 1.1 seconds on an Intel i9 CPU and 0.12 seconds on an NVIDIA A100 80G GPU, compared to optimization-based methods such as Anti-DB and AdvDM, which require over 500 seconds on the same GPU. c, Quantitative evaluation across three metrics (FID, ISM, and BRISQUE) shows that RID provides comparable protection to optimization-based methods and performs significantly better than the baseline without any defense. Arrows next to each metric denote better defense performance (i.e., lower visual quality). d, Although the visual patterns of RID-defended samples differ from those of existing methods, RID achieves comparable defense performance qualitatively, consistent with the quantitative results depicted in c. All facial images used in this figure are sourced from publicly available datasets and are permitted for academic purposes.
  • Figure 2: The ablation study of training the RID with different losses. a, We show the defended images for all loss functions. The regularization term effectively mitigates the grid-like artifacts introduced by Adv-SDS in the protected images. b, We show the generated samples from the personalized diffusion models fine-tuned on images defended by the loss functions. The combined loss achieves the best qualitative protection performance: the identity is completely obscured in the generated images. c, Quantitative comparisons of diffusion losses on defended images further confirm that the combined loss consistently delivers the strongest protection. All facial images used in this figure are sourced from publicly available datasets and are permitted for academic purposes.
  • Figure 3: Robustness of RID across perturbation levels, pre-trained diffusion models, and personalization techniques. Under the FID, ISM, and BRISQUE metrics, RID demonstrates effective defense performance compared to the baseline on clean images, across three perturbation levels (6/255, 8/255, and 12/255), popular pre-trained diffusion models (SD v2.1, v1.5, and v1.4), and representative personalization techniques (DB and LoRA+TI). In all panels, the $x$-axis represents the perturbation scale, with $0$ indicating baseline results on clean images. Arrows at the end of each row indicate better defense performance (i.e., lower visual quality) under the corresponding metric.
  • ...and 3 more figures