StyleGuard: Preventing Text-to-Image-Model-based Style Mimicry Attacks by Style Perturbations

Yanjie Li, Wenxuan Zhang, Xinqi Lyu, Yihao Liu, Bin Xiao

TL;DR

StyleGuard tackles unauthorized style mimicry in text-to-image diffusion by perturbing latent-style features and introducing an upscale loss to counter diffusion-based purifications. It formulates a bilevel objective and uses an alternating, approximate optimization with ensembles of encoders and purifiers to achieve model-agnostic transferability. The method demonstrates strong defense performance against DreamBooth and Textual Inversion on WikiArt and CelebA, including robustness to DiffPure and Noise Upscaling, and shows favorable transferability across models and fine-tuning schemes. This approach has practical implications for protecting artists’ intellectual property in real-world diffusion-based artwork generation systems.
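The alternating, approximate optimization mentioned above can be illustrated with a toy example. The sketch below is not the paper's algorithm. It assumes a scalar linear "model" in place of a diffusion model, a squared-error loss in place of the denoising, style, and upscale losses, and an L-infinity budget `eta` on the perturbation. All names (`alternating_protect`, `eta`, `step`, `lr`) are hypothetical.

```python
def alternating_protect(x, y, eta=0.1, step=0.02, rounds=50, lr=0.1):
    """Toy bilevel loop (illustrative assumption, not the StyleGuard code).

    The 'model' is a scalar weight w fit to (x + delta, y); the perturbation
    delta is updated to increase the loss of the current model.
    """
    delta = [0.0] * len(x)
    w = 0.0
    for _ in range(rounds):
        # Inner (approximate) step: one gradient-descent step on the model.
        xp = [xi + di for xi, di in zip(x, delta)]
        grad_w = sum(2 * (w * a - b) * a for a, b in zip(xp, y)) / len(x)
        w -= lr * grad_w
        # Outer step: sign-gradient ascent on delta, then clip to the budget.
        for i, (a, b) in enumerate(zip(xp, y)):
            g = 2 * (w * a - b) * w  # d/d(delta_i) of (w*(x_i+delta_i) - y_i)^2
            sign = (g > 0) - (g < 0)
            delta[i] = max(-eta, min(eta, delta[i] + step * sign))
    return delta, w


delta, w = alternating_protect([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Taking a single inner step per round, rather than training the model to convergence, is what makes the scheme approximate: the perturbation is always optimized against a partially trained model.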

Abstract

Recently, text-to-image diffusion models have been widely used for style mimicry and personalized customization through methods such as DreamBooth and Textual Inversion. This has raised concerns about intellectual property protection and the generation of deceptive content. Recent studies, such as Glaze and Anti-DreamBooth, have proposed using adversarial noise to protect images from these attacks. However, recent purification-based methods, such as DiffPure and Noise Upscaling, have successfully broken these defenses, exposing their vulnerabilities. Moreover, existing methods show limited transferability across models, making them less effective against unknown text-to-image models. To address these issues, we propose a novel anti-mimicry method, StyleGuard. We propose a novel style loss that optimizes the style-related features in the latent space to make them deviate from those of the original image, which improves model-agnostic transferability. Additionally, to enhance the perturbation's ability to bypass diffusion-based purification, we design a novel upscale loss that involves ensemble purifiers and upscalers during training. Extensive experiments on the WikiArt and CelebA datasets demonstrate that StyleGuard outperforms existing methods in robustness against various transformations and purifications, effectively countering style mimicry in various models. Moreover, StyleGuard is effective against different style mimicry methods, including DreamBooth and Textual Inversion. The code is available at https://github.com/PolyLiYJ/StyleGuard.
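To make the idea of a latent-space style loss concrete, here is a minimal sketch. It assumes that "style-related features" are summarized by channel-wise mean and standard deviation of an encoder's feature map, a common choice in style transfer; the abstract does not confirm that StyleGuard uses exactly these statistics. The encoders are hypothetical callables standing in for real image encoders, and all function names are illustrative.

```python
import math


def channel_stats(feat):
    """feat: list of channels, each a list of floats. Returns (means, stds)."""
    means = [sum(c) / len(c) for c in feat]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
        for c, m in zip(feat, means)
    ]
    return means, stds


def style_distance(feat_a, feat_b):
    """Squared distance between channel-wise means and stds (assumed style proxy)."""
    mu_a, sd_a = channel_stats(feat_a)
    mu_b, sd_b = channel_stats(feat_b)
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_a, mu_b))
    std_term = sum((a - b) ** 2 for a, b in zip(sd_a, sd_b))
    return mean_term + std_term


def ensemble_style_loss(encoders, image, protected):
    """Average the style distance over several encoders (hypothetical callables)."""
    losses = [style_distance(enc(image), enc(protected)) for enc in encoders]
    return sum(losses) / len(losses)


# Two stand-in "encoders": identity and a channel-wise scaling by 2.
encoders = [lambda x: x, lambda x: [[2 * v for v in c] for c in x]]
loss = ensemble_style_loss(encoders, [[0.0, 2.0]], [[1.0, 3.0]])
```

A protection method in this spirit would increase such a loss with respect to the perturbation, so that the protected image's style statistics move away from the original's under every encoder in the ensemble. Averaging over encoders is one simple way to encourage transfer to unseen models.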

Paper Structure

This paper contains 42 sections, 9 equations, 11 figures, 5 tables, 1 algorithm.

Figures (11)

  • Figure 1: A comparison of the defensive performance of different methods under purification transformations. Previous methods, including Mist, Anti-DreamBooth, and MetaCloak, fail to defend against DiffPure and Noise Upscale, while our proposed method successfully resists the style mimicry attack under various transformations for different customization methods.
  • Figure 2: The pipeline of StyleGuard. We alternately update the diffusion model and the protected images. Ensemble image encoders and purifiers are included to compute the style loss and upscale loss, which improves cross-model transferability and robustness to purifications.
  • Figure 3: Visualization of the effects of different loss functions. Using only the denoise loss and the style loss does not defend well against Noise Upscale, as shown in (3). With the upscaler loss added, the image quality decreases significantly even after Noise Upscale, as shown in (4).
  • Figure 4: Mimicry success rates as judged by human evaluators. We asked users to compare images generated from clean and protected training images using the question: "Based on the image style and quality, which image better fits the reference samples?" A lower mimicry success rate indicates that the perturbation noise has a stronger effect on image quality.
  • Figure 5: Visualization of Glaze and Mist. For Glaze, we use paintings from Van Gogh as the reference and paintings from Rembrandt as the targets. The results indicate that Glaze produces images that are a mixture of the target and reference styles. For Mist, we use a periodic image as the target, according to the original paper.
  • ...and 6 more figures