ACE: Concept Editing in Diffusion Models without Performance Degradation

Ruipeng Wang, Junfeng Fang, Jiaqi Li, Hao Chen, Jie Shi, Kun Wang, Xiang Wang

TL;DR

ACE addresses unsafe content and bias in diffusion-based text-to-image generation by introducing a three-step cross null-space projection framework. It constrains parameter perturbations to erase unsafe concepts while preserving normal representations and preventing residual unsafe signals from influencing outputs through cross-attention. The approach yields substantial gains in semantic fidelity and image quality with minimal runtime overhead, outperforming strong baselines across multiple datasets and concept counts. This enables safer, more reliable T2I generation with broad practical applicability.

Abstract

Diffusion-based text-to-image models have demonstrated remarkable capabilities in generating realistic images, but they raise societal and ethical concerns, such as the creation of unsafe content. While concept editing has been proposed to address these issues, existing methods often struggle to balance the removal of unsafe concepts with maintaining the model's general generative capabilities. In this work, we propose ACE, a new editing method that enhances concept editing in diffusion models. ACE introduces a novel cross null-space projection approach to precisely erase unsafe concepts while maintaining the model's ability to generate high-quality, semantically consistent images. Extensive experiments demonstrate that ACE significantly outperforms advanced baselines, improving semantic consistency by 24.56% and image generation quality by 34.82% on average with only 1% of the time cost. These results highlight the practical utility of concept editing by mitigating its potential risks, paving the way for broader applications in the field. Code is available at https://github.com/littlelittlenine/ACE-zero.git
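
To make the idea of a null-space projection concrete, the following is a minimal sketch of the generic linear-algebra building block, not ACE's three-step cross null-space procedure, which the abstract does not spell out. It assumes a single linear layer with weight `W`, a matrix `K0` whose columns are embeddings of concepts to preserve, and an arbitrary candidate update `delta`; all of these names, shapes, and the helper `null_space_projector` are hypothetical and chosen only for illustration.

```python
import numpy as np


def null_space_projector(K0: np.ndarray, rel_tol: float = 1e-8) -> np.ndarray:
    """Return a symmetric projector P such that P @ K0 is (numerically) zero.

    K0 has shape (d_in, n): each column is one embedding to preserve.
    P projects onto the orthogonal complement of the column space of K0.
    """
    # K0 @ K0.T is symmetric positive semi-definite, so the left singular
    # vectors with (near-)zero singular values span the directions that
    # are orthogonal to every preserved embedding.
    U, S, _ = np.linalg.svd(K0 @ K0.T)
    null_vectors = U[:, S <= rel_tol * S.max()]
    return null_vectors @ null_vectors.T


# Toy dimensions (assumed for illustration only).
rng = np.random.default_rng(0)
d_in, d_out, n_preserved = 8, 4, 3

W = rng.normal(size=(d_out, d_in))        # original layer weight
K0 = rng.normal(size=(d_in, n_preserved))  # embeddings to preserve
delta = rng.normal(size=(d_out, d_in))     # any candidate weight update

P = null_space_projector(K0)
W_edited = W + delta @ P                   # projected update

# Outputs for preserved embeddings are unchanged by the projected update.
preserved_ok = np.allclose(W_edited @ K0, W @ K0)

# A generic embedding outside span(K0) can still be changed by the update.
k_new = rng.normal(size=(d_in,))
changed = not np.allclose(W_edited @ k_new, W @ k_new)
```

Because `P @ K0` vanishes, any update of the form `delta @ P` leaves the layer's outputs on the preserved embeddings intact, which is the sense in which a projected edit avoids degrading those representations. How ACE combines such projections across its steps is described in the paper itself.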

Paper Structure

This paper contains 20 sections, 15 equations, 7 figures, 4 tables.

Figures (7)

  • Figure 1: Images generated by the original and edited Stable Diffusion (SD) v2.1. Red text denotes unsafe concepts, while blue text in input prompts indicates concepts that are prone to being overlooked by diffusion models edited using baseline methods. The scores are LPIPS scores ($\downarrow$) [UCE], which quantify the discrepancy between the originally generated images and the images generated after editing. Implementation details are given in Section 5.
  • Figure 2: Comparison of current concept editing methods (a) and our ACE (b). Best viewed in color.
  • Figure 3: Performance of diffusion models after being edited by the baseline methods w.r.t. various metrics such as CLIP ($\uparrow$), FID ($\downarrow$), LPIPSc ($\downarrow$) and LPIPS ($\downarrow$). Specifically, LPIPSc measures the similarity between images generated by the edited model and the original generated images. Best viewed in color.
  • Figure 4: Case study on the generation of images by diffusion models after the erasure of copyright infringement using different methods. Best viewed in color.
  • Figure 5: Case study on the generation of images by diffusion models after the erasure of nude concepts using different editing methods. Best viewed in color.
  • ...and 2 more figures