DLADiff: A Dual-Layer Defense Framework against Fine-Tuning and Zero-Shot Customization of Diffusion Models
Jun Jia, Hongyi Miao, Yingjie Zhou, Linhan Cao, Yanwei Jiang, Wangqiu Zhou, Dandan Zhu, Hua Yang, Wei Sun, Xiongkuo Min, Guangtao Zhai
TL;DR
This work addresses the privacy risks posed by diffusion-model customization, where unauthorized users can customize a model from only a few images, either by fine-tuning or with zero-shot methods. It proposes DLADiff, a dual-layer defense. The first layer combines Dual-Surrogate Models (DSUR) with Alternating Dynamic Fine-Tuning (ADFT) to thwart fine-tuning; the second layer adds a perturbation, optimized with PGD against multiple identity encoders, to resist zero-shot generation. Empirical results show that DLADiff outperforms state-of-the-art defenses in both scenarios: unauthorized models produce lower-quality outputs, and identity leakage is significantly reduced across diverse encoders and adapters. The framework offers a practical, generalizable approach to protecting personal portraits in diffusion-model applications, with potential impact on privacy-preserving image synthesis and the responsible deployment of generative AI.
Abstract
With the rapid advancement of diffusion models, a variety of fine-tuning methods have been developed that generate high-fidelity images closely resembling the target content from only 3 to 5 training images. More recently, zero-shot generation methods have emerged that can produce highly realistic outputs from a single reference image without altering model weights. However, these advances have also introduced significant risks to facial privacy: with just a few images of a person, or even one, malicious actors can exploit diffusion-model customization to create synthetic identities nearly identical to the original. Although research has begun to focus on defending against diffusion-model customization, most existing defenses target fine-tuning approaches and neglect zero-shot generation. To address this gap, this paper proposes Dual-Layer Anti-Diffusion (DLADiff), which defends against both fine-tuning and zero-shot methods through a dual-layer protective mechanism. The first layer protects against unauthorized fine-tuning by leveraging the proposed Dual-Surrogate Models (DSUR) mechanism and Alternating Dynamic Fine-Tuning (ADFT), which integrates adversarial training with prior knowledge derived from pre-fine-tuned models. The second layer, though simple in design, is highly effective at preventing image generation through zero-shot methods. Extensive experimental results demonstrate that our method significantly outperforms existing approaches in defending against fine-tuning of diffusion models and achieves unprecedented performance in protecting against zero-shot generation.
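The TL;DR describes the second layer as a PGD perturbation computed against multiple identity encoders. The sketch below illustrates that general idea only; it is not the paper's implementation. It assumes toy linear "encoders" (random matrices standing in for real face-recognition networks), a mean-cosine-similarity objective, and an L-infinity budget of 8/255. The function names `pgd_identity_perturbation` and `mean_similarity`, the step size, and the iteration count are all hypothetical choices made for illustration.

```python
import numpy as np


def mean_similarity(x_a, x_b, encoders):
    """Mean cosine similarity between the embeddings of x_a and x_b
    over all (toy, linear) encoders."""
    sims = []
    for W in encoders:
        a, b = W @ x_a, W @ x_b
        sims.append(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    return float(np.mean(sims))


def pgd_identity_perturbation(x, encoders, eps=8 / 255, alpha=2 / 255,
                              steps=20, seed=0):
    """Projected gradient descent on the mean cosine similarity between
    the embeddings of the perturbed and clean image (assumed objective).

    x        : flattened image with values in [0, 1]
    encoders : list of matrices acting as stand-in identity encoders
    eps      : L-infinity perturbation budget
    """
    rng = np.random.default_rng(seed)
    targets = [W @ x for W in encoders]
    # Random start: at delta = 0 the cosine gradient is exactly zero.
    delta = rng.uniform(-eps, eps, size=x.shape)
    for _ in range(steps):
        grad = np.zeros_like(x)
        for W, t in zip(encoders, targets):
            e = W @ (x + delta)
            ne, nt = np.linalg.norm(e), np.linalg.norm(t)
            # Gradient of cos(e, t) with respect to e, then chain rule via W.
            g_e = t / (ne * nt) - (e @ t) * e / (ne ** 3 * nt)
            grad += W.T @ g_e
        grad /= len(encoders)
        # Signed step that lowers similarity, then project to the budget
        # and to the valid pixel range.
        delta = np.clip(delta - alpha * np.sign(grad), -eps, eps)
        delta = np.clip(x + delta, 0.0, 1.0) - x
    return np.clip(x + delta, 0.0, 1.0)
```

Averaging the gradient over several encoders is what makes the perturbation target more than one identity representation at once; with real networks, the analytic gradient here would be replaced by automatic differentiation.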
