MetaCloak: Preventing Unauthorized Subject-driven Text-to-image Diffusion-based Synthesis via Meta-learning

Yixin Liu; Chenrui Fan; Yutong Dai; Xun Chen; Pan Zhou; Lichao Sun

TL;DR

MetaCloak tackles the risk of unauthorized subject-driven diffusion synthesis by introducing a robust poisoning framework trained via meta-learning over a pool of surrogates, enhancing transferability across models and resilience to data transformations. It advances a transformation-robust denoising-maximization objective, using Expectation over Transformations to produce perturbations that induce semantic distortion in personalized DreamBooth generation. Empirical results on VGGFace2 and CelebA-HQ demonstrate superior protection against standalone and online DreamBooth training, including Replicate, with notable gains in SDS and IMS metrics and partial robustness to purification defenses. The approach offers a practical, scalable mechanism to protect personal imagery from misuse in diffusion-based personalization, with potential for further refinements in stealth and efficiency.

Abstract

Text-to-image diffusion models allow seamless generation of personalized images from scant reference photos. Yet, in the wrong hands, these tools can fabricate misleading or harmful content, endangering individuals. To address this problem, existing poisoning-based approaches perturb user images imperceptibly to render them "unlearnable" for malicious uses. We identify two limitations of these defenses: i) they are sub-optimal because they rely on hand-crafted heuristics to solve the intractable bi-level optimization, and ii) they lack robustness against simple data transformations such as Gaussian filtering. To address these challenges, we propose MetaCloak, which solves the bi-level poisoning problem with a meta-learning framework and an additional transformation sampling process to craft transferable and robust perturbations. Specifically, we employ a pool of surrogate diffusion models to craft transferable and model-agnostic perturbations. Furthermore, by incorporating an additional transformation process, we design a simple denoising-error maximization loss that is sufficient to cause transformation-robust semantic distortion and degradation in personalized generation. Extensive experiments on the VGGFace2 and CelebA-HQ datasets show that MetaCloak outperforms existing approaches. Notably, MetaCloak can successfully fool online training services such as Replicate in a black-box manner, demonstrating its effectiveness in real-world scenarios. Our code is available at https://github.com/liuyixin-louis/MetaCloak.
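
The loop described in the abstract (a pool of surrogates, sampled transformations, and denoising-error maximization) can be illustrated with a toy sketch. This is not the authors' implementation, which is available at the repository above. Everything here is an assumption made for illustration: the "image" is a 1-D vector, each surrogate is a linear noise predictor `W`, the transformations are circular moving-average blurs, and the names `blur_matrix`, `noisy`, `craft_perturbation`, and all hyperparameter values are hypothetical.

```python
import numpy as np

A, B = 0.8, 0.6  # assumed fixed noise-schedule coefficients (A**2 + B**2 == 1)

def blur_matrix(n, k):
    """Toy transformation: circular moving-average blur of odd width k."""
    T = np.zeros((n, n))
    for i in range(n):
        for j in range(-(k // 2), k // 2 + 1):
            T[i, (i + j) % n] += 1.0 / k
    return T

def noisy(x, eps):
    """Forward-noised sample x_t = A*x + B*eps."""
    return A * x + B * eps

def craft_perturbation(x, surrogates, radius=0.1, step=0.01, rounds=30,
                       inner_steps=3, lr=0.01, n_samples=4, seed=0):
    """Alternate surrogate training (inner) with perturbation ascent (outer)."""
    rng = np.random.default_rng(seed)
    n = x.size
    transforms = [np.eye(n), blur_matrix(n, 3), blur_matrix(n, 5)]
    pool = [W.copy() for W in surrogates]  # leave the caller's arrays untouched
    delta = np.zeros(n)
    for _ in range(rounds):
        grad = np.zeros(n)
        for W in pool:
            # Inner step: the surrogate learns from the currently protected
            # sample by gradient descent on ||W x_t - eps||^2.
            for _ in range(inner_steps):
                eps = rng.standard_normal(n)
                x_t = noisy(x + delta, eps)
                res = W @ x_t - eps
                W -= lr * 2.0 * np.outer(res, x_t)
            # Outer step: accumulate the gradient of the denoising error with
            # respect to delta, averaged over sampled transformations (EOT).
            for _ in range(n_samples):
                T = transforms[rng.integers(len(transforms))]
                eps = rng.standard_normal(n)
                res = W @ noisy(T @ (x + delta), eps) - eps
                grad += 2.0 * A * (T.T @ (W.T @ res))
        # Signed gradient ascent, projected back onto the L-infinity ball.
        delta = np.clip(delta + step * np.sign(grad), -radius, radius)
    return delta
```

A call such as `craft_perturbation(x, [W1, W2], radius=0.05)` returns a perturbation with the same shape as `x` and entries bounded by `radius`. Averaging the gradient over blurred and unblurred copies is what pushes the perturbation toward components that survive filtering; in this toy setting that is only a qualitative analogy to the robustness reported in the paper.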

Paper Structure (26 sections, 10 equations, 16 figures, 9 tables, 1 algorithm)

Introduction
Related Works
Preliminary
Problem Statement
Method
Learning to Learn Transferable and Model-agnostic Perturbation
Transformation-robust Semantic Distortion with Denoising-error Maximization
Experiments
Setup
Effectiveness of MetaCloak
Resistance against Adversarial Purification
Conclusion
Experiment Details
Hardware and DreamBooth Training Details
Baseline Methods and Metrics
...and 11 more sections

Figures (16)

Figure 1: Images protected by existing methods fail to fool personalized text-to-image approaches after data transformations are applied. In contrast, MetaCloak remains robust under such transformations.
Figure 2: Visualization of the transformation robustness of different methods. The first row shows samples generated by DreamBooth trained on poisons with no transformation defenses. The second row shows the robustness of each method under transformation with a Gaussian kernel size of 7. Our method performs robustly under transformation defenses, while other methods fail to preserve the perturbation.
Figure 3: Results under online training-as-a-service settings with the Full and LoRA DreamBooth fine-tuning settings on Replicate.
Figure 4: Visualization of images generated by DreamBooth trained on a protected instance from CelebA-HQ with different diffusion models, including SD v1-5, SD v2-1-base, and SD v2-1.
Figure 5: Results of MetaCloak with different protection radii and ratios. The dashed line marks the previous SoTA results under $r=32/255$ and the protection ratio of 100%, respectively.
...and 11 more figures
