Table of Contents
Fetching ...

FreezeAsGuard: Mitigating Illegal Adaptation of Diffusion Models via Selective Tensor Freezing

Kai Huang, Haoming Wang, Wei Gao

TL;DR

The approach is that the model publisher selectively freezes tensors in pre-trained diffusion models that are critical to illegal model adaptations, to mitigate the fine-tuned model's representation power in illegal adaptations, but minimize the impact on other legal adaptations.

Abstract

Text-to-image diffusion models can be fine-tuned in custom domains to adapt to specific user preferences, but such adaptability has also been utilized for illegal purposes, such as forging public figures' portraits, duplicating copyrighted artworks and generating explicit contents. Existing work focused on detecting the illegally generated contents, but cannot prevent or mitigate illegal adaptations of diffusion models. Other schemes of model unlearning and reinitialization, similarly, cannot prevent users from relearning the knowledge of illegal model adaptation with custom data. In this paper, we present FreezeAsGuard, a new technique that addresses these limitations and enables irreversible mitigation of illegal adaptations of diffusion models. Our approach is that the model publisher selectively freezes tensors in pre-trained diffusion models that are critical to illegal model adaptations, to mitigate the fine-tuned model's representation power in illegal adaptations, but minimize the impact on other legal adaptations. Experiment results in multiple text-to-image application domains show that FreezeAsGuard provides 37% stronger power in mitigating illegal model adaptations compared to competitive baselines, while incurring less than 5% impact on legal model adaptations. The source code is available at: https://github.com/pittisl/FreezeAsGuard.

FreezeAsGuard: Mitigating Illegal Adaptation of Diffusion Models via Selective Tensor Freezing

TL;DR

The approach is that the model publisher selectively freezes tensors in pre-trained diffusion models that are critical to illegal model adaptations, to mitigate the fine-tuned model's representation power in illegal adaptations, but minimize the impact on other legal adaptations.

Abstract

Text-to-image diffusion models can be fine-tuned in custom domains to adapt to specific user preferences, but such adaptability has also been utilized for illegal purposes, such as forging public figures' portraits, duplicating copyrighted artworks and generating explicit contents. Existing work focused on detecting the illegally generated contents, but cannot prevent or mitigate illegal adaptations of diffusion models. Other schemes of model unlearning and reinitialization, similarly, cannot prevent users from relearning the knowledge of illegal model adaptation with custom data. In this paper, we present FreezeAsGuard, a new technique that addresses these limitations and enables irreversible mitigation of illegal adaptations of diffusion models. Our approach is that the model publisher selectively freezes tensors in pre-trained diffusion models that are critical to illegal model adaptations, to mitigate the fine-tuned model's representation power in illegal adaptations, but minimize the impact on other legal adaptations. Experiment results in multiple text-to-image application domains show that FreezeAsGuard provides 37% stronger power in mitigating illegal model adaptations compared to competitive baselines, while incurring less than 5% impact on legal model adaptations. The source code is available at: https://github.com/pittisl/FreezeAsGuard.
Paper Structure (33 sections, 10 equations, 24 figures, 8 tables, 1 algorithm)

This paper contains 33 sections, 10 equations, 24 figures, 8 tables, 1 algorithm.

Figures (24)

  • Figure 1: Existing work vs. FreezeAsGuard in mitigating malicious adaptation of diffusion models
  • Figure 2: FreezeAsGuard ensures that portraits (left) and artworks (right) generated by diffusion models in illegal classes cannot be recognizable as target objects, even if the model has been fine-tuned with data samples in illegal classes. In contrast, unlearning schemes (UCE [gandikota2024unified] and IMMA [zheng2023imma]) cannot prevent the unlearned knowledge of illegal classes from being relearned in fine-tuning.
  • Figure 3: Mask learning and fine-tuning as a bilevel optimization
  • Figure 4: Generated images with different model components being frozen, with prompt "a pikachu with a pink dress and a pink bow"
  • Figure 5: Overview of FreezeAsGuard design
  • ...and 19 more figures