Concept Unlearning by Modeling Key Steps of Diffusion Process

Chaoshuo Zhang, Chenhao Lin, Zhengyu Zhao, Le Yang, Qian Wang, Chao Shen

TL;DR

This work tackles the safety challenge of text-to-image diffusion models by enabling effective forgetting of target concepts while preserving generation quality. It introduces Key Step Concept Unlearning (KSCU), which fine-tunes only a subset of late, high-impact denoising steps. KSCU is guided by three modules: a Key Step Table, Prompt Augmentation, and a Key Step Unlearning Optimization tailored for Classifier-Free Guidance models. An acceleration strategy further speeds up training. Empirical results across class, style, nudity, and instance unlearning show that KSCU consistently outperforms state-of-the-art baselines in unlearning effectiveness and generative retainability, achieving high unlearning accuracy (UA), in-domain retain accuracy (IRA), and cross-domain retain accuracy (CRA) along with favorable FID, even under adversarial prompts. The paper also demonstrates targeted substitutions ($c^+$ to $c^-$) that control the replacement concept, increasing the practical utility for safe diffusion deployment. Overall, the key-step perspective provides a principled, efficient framework for robust concept erasure in large-scale diffusion models with real-world safety implications.

Abstract

Text-to-image diffusion models (T2I DMs) such as Stable Diffusion generate highly realistic images from textual input and are widely used, but their flexibility also makes them prone to misuse for producing harmful or unsafe content. Concept unlearning has been used to prevent T2I DMs from generating such undesirable visual content. However, existing methods struggle to balance unlearning effectiveness against the preservation of generation quality. To address this limitation, we propose Key Step Concept Unlearning (KSCU), which selectively fine-tunes the model only at the denoising steps that are key to the target concept. KSCU is motivated by the observation that different diffusion denoising steps contribute unequally to the final generation. Unlike previous approaches, which treat all denoising steps uniformly, KSCU avoids over-optimizing unnecessary steps, which improves effectiveness, and reduces the number of parameter updates, which improves efficiency. For example, on the I2P dataset, KSCU outperforms ESD by 8.3% in nudity unlearning accuracy while improving FID by 8.4%, and achieves a high overall score of 0.92, substantially surpassing all other SOTA methods.
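The contrast between uniform treatment of denoising steps and key-step selection can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the 50-step trajectory, the choice of 20 key steps, and the rule that the key steps are simply the last positions of the trajectory (steps are indexed in denoising order, so a larger index means a later step). The paper's actual Key Step Table may be constructed differently.

```python
import random


def late_key_steps(num_steps: int, num_key: int) -> list[int]:
    """Hypothetical key-step table: the last `num_key` positions of a
    denoising trajectory indexed 0 (first step) to num_steps - 1 (last)."""
    if not 0 < num_key <= num_steps:
        raise ValueError("num_key must be in the range 1..num_steps")
    return list(range(num_steps - num_key, num_steps))


def sample_step(candidates: list[int], rng: random.Random) -> int:
    """Draw one training step uniformly from the candidate steps."""
    return rng.choice(candidates)


rng = random.Random(0)

# Uniform baseline: every denoising step may be selected for fine-tuning.
all_steps = list(range(50))
uniform_draws = [sample_step(all_steps, rng) for _ in range(1000)]

# Key-step variant: only the assumed late steps are ever selected,
# so the remaining steps never trigger a parameter update.
key_steps = late_key_steps(50, 20)
key_draws = [sample_step(key_steps, rng) for _ in range(1000)]
```

Under these assumptions, the key-step variant never draws a step outside the chosen subset, which is the sense in which fine-tuning effort is concentrated on fewer steps.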

Paper Structure

This paper contains 27 sections, 19 equations, 8 figures, 4 tables, 2 algorithms.

Figures (8)

  • Figure 1: Compared to previous methods (b), KSCU (c) focuses exclusively on the denoising steps that have the most significant impact on concept generation and achieves effective forgetting of the target concept while better preserving the generation of unrelated concepts.
  • Figure 2: The KSCU framework consists of three key modules: Key Step Table, Prompt Augmentation, and Key Step Unlearning Optimization. During training, a denoising step $t$ is selected from the Key Step Table. The unlearn model performs denoising sampling to obtain $z_t$, while the Prompt Augmentation module converts the target concept $c$ into an augmented prompt $c^*$. Using $z_t$, $t$, $c$, and $c^*$, the Key Step Unlearning Optimization is computed via Equation \ref{eq7}, and designated parameters of the unlearn model are updated through backpropagation.
  • Figure 3: Correlation between text and attention map at different steps.
  • Figure 4: Image generation with Stable Diffusion v1-4 using the prompt "naked woman": evolution of nude scores and qualitative results across the denoising process.
  • Figure 5: KSCU's qualitative performance on UnlearnCanvas. Red-bordered images show the generations of the unlearned concept after KSCU, while borderless images depict the results of unrelated concepts. More qualitative results and controlled replacement experiments are provided in Appendix A.
  • ...and 3 more figures
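
The training flow described in the Figure 2 caption can be sketched as a single iteration with its four stages made explicit. This is only a structural sketch under stated assumptions: the function name `kscu_style_iteration`, its parameters, and all stand-in callables are hypothetical, and the loss is a placeholder because the paper's Key Step Unlearning Optimization objective is not reproduced in this summary.

```python
import random
from typing import Callable, Sequence


def kscu_style_iteration(
    key_step_table: Sequence[int],
    concept: str,
    sample_latent: Callable[[int, str], list],
    augment_prompt: Callable[[str], str],
    unlearning_loss: Callable[[list, int, str, str], float],
    apply_update: Callable[[float], None],
    rng: random.Random,
) -> tuple[int, float]:
    """One iteration following the caption's order of operations."""
    t = rng.choice(key_step_table)                   # 1. select step t from the table
    z_t = sample_latent(t, concept)                  # 2. denoise to obtain z_t
    c_star = augment_prompt(concept)                 # 3. augment c into c*
    loss = unlearning_loss(z_t, t, concept, c_star)  # 4. placeholder objective
    apply_update(loss)                               # 5. update designated parameters
    return t, loss


# Stand-in components (all hypothetical) so the sketch runs without a model.
updates: list[float] = []
t, loss = kscu_style_iteration(
    key_step_table=[30, 31, 32],
    concept="target concept",
    sample_latent=lambda t, c: [0.0] * 4,          # fake latent
    augment_prompt=lambda c: f"a photo of {c}",    # placeholder augmentation
    unlearning_loss=lambda z, t, c, c_star: float(len(z)),
    apply_update=updates.append,                   # records instead of backpropagating
    rng=random.Random(0),
)
```

Passing the components in as callables keeps the sketch independent of any particular diffusion library; in a real setup they would wrap the unlearn model's sampler, the Prompt Augmentation module, and an optimizer step.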