StyleGuard: Preventing Text-to-Image-Model-based Style Mimicry Attacks by Style Perturbations

Yanjie Li; Wenxuan Zhang; Xinqi Lyu; Yihao Liu; Bin Xiao

TL;DR

StyleGuard tackles unauthorized style mimicry by text-to-image diffusion models. It perturbs style-related features in the latent space and adds an upscale loss so that the perturbation withstands diffusion-based purification. The method formulates a bilevel objective and solves it with an alternating, approximate optimization over ensembles of encoders and purifiers, which gives model-agnostic transferability. It shows strong defense performance against DreamBooth and Textual Inversion on WikiArt and CelebA, remains robust to DiffPure and Noise Upscaling, and transfers well across models and fine-tuning schemes. The approach has practical implications for protecting artists’ intellectual property in real-world diffusion-based artwork generation systems.

Abstract

Text-to-image diffusion models are now widely used for style mimicry and personalized customization through methods such as DreamBooth and Textual Inversion, raising concerns about intellectual property protection and the generation of deceptive content. Prior work, such as Glaze and Anti-DreamBooth, protects images from these attacks by adding adversarial noise. However, purification-based methods such as DiffPure and Noise Upscaling have successfully defeated these defenses, exposing their vulnerability. Existing methods also transfer poorly across models, which makes them less effective against unknown text-to-image models. To address these issues, we propose StyleGuard, a novel anti-mimicry method. StyleGuard introduces a style loss that optimizes style-related features in the latent space so that they deviate from those of the original image, which improves model-agnostic transferability. Additionally, to help the perturbation survive diffusion-based purification, we design an upscale loss that involves ensemble purifiers and upscalers during training. Extensive experiments on the WikiArt and CelebA datasets demonstrate that StyleGuard outperforms existing methods in robustness against various transformations and purifications, and that it effectively counters style mimicry across various models. StyleGuard is also effective against different style mimicry methods, including DreamBooth and Textual Inversion. The code is available at https://github.com/PolyLiYJ/StyleGuard.
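
To make the idea of a latent-space style loss concrete, the following is a minimal sketch and not the StyleGuard implementation. It assumes that "style-related features" can be approximated by per-channel mean and standard deviation of a latent, a statistic borrowed from common style-transfer practice; the abstract does not specify the exact features. The names `channel_stats` and `style_distance` are hypothetical, and latents are plain nested lists rather than encoder outputs.

```python
import math


def channel_stats(latent):
    """Return a (mean, std) pair for each channel of a latent.

    `latent` is a list of channels, each a flat list of floats.
    Assumption: these statistics stand in for "style-related features".
    """
    stats = []
    for channel in latent:
        mean = sum(channel) / len(channel)
        var = sum((v - mean) ** 2 for v in channel) / len(channel)
        stats.append((mean, math.sqrt(var)))
    return stats


def style_distance(latent_a, latent_b):
    """Sum of squared differences between per-channel means and stds."""
    total = 0.0
    pairs = zip(channel_stats(latent_a), channel_stats(latent_b))
    for (mean_a, std_a), (mean_b, std_b) in pairs:
        total += (mean_a - mean_b) ** 2 + (std_a - std_b) ** 2
    return total


# Toy latents with two channels of four values each.
original = [[0.0, 1.0, 2.0, 3.0], [1.0, 1.0, 1.0, 1.0]]
# Same values in a different spatial order: channel statistics are unchanged.
shuffled = [[3.0, 0.0, 2.0, 1.0], [1.0, 1.0, 1.0, 1.0]]
# Doubled values: channel statistics shift.
rescaled = [[0.0, 2.0, 4.0, 6.0], [2.0, 2.0, 2.0, 2.0]]

unchanged_style = style_distance(original, shuffled)
shifted_style = style_distance(original, rescaled)
```

Under this assumption, a protective perturbation would be optimized to increase `style_distance` between the latent of the perturbed image and that of the original while keeping the pixel-space change small. Rearranging content without changing channel statistics leaves the distance at zero, whereas rescaling the channels raises it.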
