Neighbor-Aware Localized Concept Erasure in Text-to-Image Diffusion Models

Zhuan Shi, Alireza Dehghanpour Farashah, Rik de Vries, Golnoosh Farnadi

Abstract

Concept erasure in text-to-image diffusion models seeks to remove undesired concepts while preserving overall generative capability. Localized erasure methods aim to restrict edits to the spatial region occupied by the target concept. However, we observe that suppressing a concept can unintentionally weaken semantically related neighbor concepts, reducing fidelity in fine-grained domains. We propose Neighbor-Aware Localized Concept Erasure (NLCE), a training-free framework designed to better preserve neighboring concepts while removing target concepts. It operates in three stages: (1) a spectrally weighted embedding modulation that attenuates target concept directions while stabilizing neighbor concept representations, (2) an attention-guided spatial gate that identifies regions exhibiting residual concept activation, and (3) a spatially gated hard erasure that eliminates remaining traces only where necessary. This neighbor-aware pipeline enables localized concept removal while maintaining the surrounding concept neighborhood structure. Experiments on fine-grained datasets (Oxford Flowers, Stanford Dogs) show that our method effectively removes target concepts while better preserving closely related categories. Additional results on celebrity identity, explicit content, and artistic style demonstrate robustness and generalization to broader erasure scenarios.
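To make the three-stage description concrete, the following is a minimal NumPy sketch of one possible reading of the pipeline. It is not the authors' implementation: the abstract does not give the formulas, so the singular-value weighting, the neighbor-subspace restoration, the min-max threshold gate, and every function and parameter name here (`modulate_embedding`, `spatial_gate`, `gated_erase`, `tau`) are assumptions chosen for illustration.

```python
import numpy as np


def orthonormal_basis(rows):
    """Return an orthonormal basis (as rows) and singular values for span(rows)."""
    _, s, vt = np.linalg.svd(np.atleast_2d(rows), full_matrices=False)
    keep = s > 1e-8
    return vt[keep], s[keep]


def modulate_embedding(e, target_embs, neighbor_embs):
    """Stage 1 (assumed form): attenuate target directions, keep neighbor content.

    Each target direction is attenuated in proportion to its singular value
    (the assumed 'spectral' weight). Any part of the removed component that
    lies in the neighbor subspace is added back, so the projection of the
    embedding onto the neighbor subspace is unchanged.
    """
    t_basis, s = orthonormal_basis(target_embs)
    weights = s / s.max()                      # dominant direction gets weight 1
    removed = (weights * (t_basis @ e)) @ t_basis
    n_basis, _ = orthonormal_basis(neighbor_embs)
    restored = (n_basis @ removed) @ n_basis   # projection onto neighbor subspace
    return e - removed + restored


def spatial_gate(attn_map, tau=0.5):
    """Stage 2 (assumed form): boolean mask where normalized attention exceeds tau."""
    lo, hi = attn_map.min(), attn_map.max()
    norm = (attn_map - lo) / (hi - lo + 1e-8)
    return norm > tau


def gated_erase(features, gate, direction):
    """Stage 3 (assumed form): remove the target direction only inside the gate.

    features: (H, W, d) array, gate: (H, W) boolean mask, direction: (d,) vector.
    """
    u = direction / np.linalg.norm(direction)
    proj = (features @ u)[..., None] * u
    return np.where(gate[..., None], features - proj, features)


# Toy usage with hand-picked vectors (hypothetical data, not from the paper).
target = np.array([[1.0, 0.0, 0.0, 0.0]])
neighbor = np.array([[0.0, 1.0, 0.0, 0.0]])
prompt_emb = np.array([2.0, 3.0, 1.0, 0.0])
new_emb = modulate_embedding(prompt_emb, target, neighbor)

attn = np.array([[0.0, 0.1], [0.9, 0.2]])
gate = spatial_gate(attn)
feats = np.ones((2, 2, 4))
cleaned = gated_erase(feats, gate, target[0])
```

In this toy setup the target and neighbor directions are orthogonal, so stage 1 zeroes the target component and leaves the neighbor component untouched. When the two subspaces overlap, the restoration step trades some erasure strength for neighbor preservation, which is the trade-off the abstract highlights.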

Paper Structure

This paper contains 41 sections, 13 equations, 19 figures, 11 tables.

Figures (19)

  • Figure 1: Erasure Effectiveness and Neighbor Retention: GLoCE vs. NLCE (Ours)
  • Figure 2: Overview of NLCE. The method proceeds in three stages: (1) representation-space modulation to suppress the target subspace while reinforcing neighbors; (2) attention-guided gating to localize residual concept activations; (3) gated feature clean-up to irreversibly remove remaining traces. For multi-concept erasure, we activate operators per concept based on prompt tokens or embedding similarity.
  • Figure 2: Quantitative comparison of Celebrity Erasure. NLCE provides a balanced trade-off between effective target-identity removal and preservation of remaining celebrities.
  • Figure 3: Qualitative comparison of Oxford Flowers and Stanford Dogs Erasure. Top: Oxford Flowers — NLCE effectively removes the target concept 'Alpine Sea Holly' while preserving neighbor concepts such as 'Rose' and 'Spring Crocus'. Bottom: Stanford Dogs — NLCE successfully eliminates the target concept 'Bluetick' while maintaining visually similar breeds.
  • Figure 4: LPIPS on Non-Target Regions for Each Celebrity.
  • ...and 14 more figures