FreezeAsGuard: Mitigating Illegal Adaptation of Diffusion Models via Selective Tensor Freezing

Kai Huang; Haoming Wang; Wei Gao

TL;DR

The model publisher selectively freezes the tensors in a pre-trained diffusion model that are critical to illegal adaptations. This limits the fine-tuned model's representation power in illegal domains while minimizing the impact on legal adaptations.

Abstract

Text-to-image diffusion models can be fine-tuned in custom domains to adapt to specific user preferences, but this adaptability has also been exploited for illegal purposes, such as forging public figures' portraits, duplicating copyrighted artworks, and generating explicit content. Existing work focuses on detecting illegally generated content, but cannot prevent or mitigate illegal adaptations of diffusion models. Other schemes based on model unlearning and reinitialization likewise cannot prevent users from relearning the knowledge needed for illegal adaptation with custom data. In this paper, we present FreezeAsGuard, a new technique that addresses these limitations and enables irreversible mitigation of illegal adaptations of diffusion models. In our approach, the model publisher selectively freezes the tensors in pre-trained diffusion models that are critical to illegal adaptations, which limits the fine-tuned model's representation power in illegal adaptations while minimizing the impact on legal adaptations. Experimental results in multiple text-to-image application domains show that FreezeAsGuard provides 37% stronger power in mitigating illegal model adaptations compared to competitive baselines, while incurring less than 5% impact on legal model adaptations. The source code is available at: https://github.com/pittisl/FreezeAsGuard.
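To make the idea of selective tensor freezing concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a binary mask with one entry per named tensor (1 = frozen, 0 = trainable) and applies a single gradient step that skips frozen tensors. The tensor names, the function `fine_tune_step`, and the hand-set mask are all hypothetical; in FreezeAsGuard the mask is learned rather than set by hand, and in a framework such as PyTorch freezing would typically correspond to disabling gradients for the selected parameters.

```python
from typing import Dict, List

Tensor = List[float]  # stand-in for a real framework tensor (assumption)


def fine_tune_step(
    params: Dict[str, Tensor],
    grads: Dict[str, Tensor],
    freeze_mask: Dict[str, int],
    lr: float = 0.5,
) -> Dict[str, Tensor]:
    """Apply one SGD step, leaving every tensor whose mask entry is 1 unchanged."""
    updated: Dict[str, Tensor] = {}
    for name, values in params.items():
        if freeze_mask.get(name, 0) == 1:
            # Frozen tensor: copy the pre-trained values through untouched.
            updated[name] = list(values)
        else:
            # Trainable tensor: ordinary gradient descent update.
            updated[name] = [v - lr * g for v, g in zip(values, grads[name])]
    return updated


# Hypothetical tensor names and a hand-set mask, for illustration only.
params = {"attn.to_q": [1.0, 2.0], "attn.to_k": [3.0, 4.0]}
grads = {"attn.to_q": [1.0, 1.0], "attn.to_k": [1.0, 1.0]}
freeze_mask = {"attn.to_q": 1, "attn.to_k": 0}

new_params = fine_tune_step(params, grads, freeze_mask)
```

In this toy run, `attn.to_q` keeps its pre-trained values regardless of the gradients it receives, while `attn.to_k` is updated as usual. The sketch only shows the effect of a given mask; choosing which tensors to freeze is the part the paper addresses.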

Paper Structure (33 sections, 10 equations, 24 figures, 8 tables, 1 algorithm)

Introduction
Background & Motivation
Fine-Tuning Diffusion Models
Partial Model Fine-tuning
Method
Mask Learning in the Upper-level Loop
Model Fine-tuning in the Lower-level Loop
Towards Efficient Bilevel Optimization
Experiments
Mitigating Forgery of Public Figures' Portraits
Mitigating Duplication of Copyright Artworks
Mitigating Generation of Explicit Contents
Scalability of Mitigation Power
The Learned Selection of Frozen Tensors
Mitigation Power with Different Models
...and 18 more sections

Figures (24)

Figure 1: Existing work vs. FreezeAsGuard in mitigating malicious adaptation of diffusion models
Figure 2: FreezeAsGuard ensures that portraits (left) and artworks (right) generated by diffusion models in illegal classes are not recognizable as the target objects, even if the model has been fine-tuned with data samples from those classes. In contrast, unlearning schemes (UCE [gandikota2024unified] and IMMA [zheng2023imma]) cannot prevent the unlearned knowledge of illegal classes from being relearned during fine-tuning.
Figure 3: Mask learning and fine-tuning as a bilevel optimization
Figure 4: Generated images with different model components being frozen, with prompt "a pikachu with a pink dress and a pink bow"
Figure 5: Overview of FreezeAsGuard design
...and 19 more figures
