GrOCE: Graph-Guided Online Concept Erasure for Text-to-Image Diffusion Models
Ning Han, Zhenyu Ge, Feng Han, Yuhua Sun, Chengqing Li, Jingjing Chen
TL;DR
GrOCE tackles the challenge of precise, adaptable concept erasure in text-to-image diffusion models by reframing erasure as graph-guided inference in a dynamic semantic space. It introduces three components (Dynamic Topological Graph Construction, Adaptive Cluster Identification, and Selective Edge Severing) to identify and remove entire concept clusters without retraining. The method achieves state-of-the-art results on CS and FID across single, multi-target, and art-style erasure tasks, with real-time performance on a single NVIDIA A100 GPU. This graph-based, training-free approach provides interpretable and scalable safety tooling for evolving content risks and copyright concerns in diffusion-based generation.
Abstract
Concept erasure aims to remove harmful, inappropriate, or copyrighted content from text-to-image diffusion models while preserving non-target semantics. However, existing methods either rely on costly fine-tuning or apply coarse semantic separation, which often degrades unrelated concepts and adapts poorly to evolving concept sets. To address these limitations, we propose Graph-Guided Online Concept Erasure (GrOCE), a training-free framework that performs precise and adaptive concept removal through graph-based semantic reasoning. GrOCE models concepts and their interrelations as a dynamic semantic graph, enabling principled reasoning over dependencies and fine-grained isolation of undesired content. It comprises three components: (1) Dynamic Topological Graph Construction for incremental graph building, (2) Adaptive Cluster Identification for multi-hop traversal with similarity-decay scoring, and (3) Selective Edge Severing for targeted edge removal while preserving global semantics. Extensive experiments demonstrate that GrOCE achieves state-of-the-art performance on the Concept Similarity (CS) and Fréchet Inception Distance (FID) metrics, offering efficient, accurate, and stable concept erasure without retraining.
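As an illustration only, the three components named in the abstract can be pictured with a toy graph. The sketch below is not the authors' implementation: the class `ConceptGraph`, its methods, the cosine-similarity edge rule, the multiplicative decay score, the thresholds, and the 2-D vectors (standing in for text-encoder embeddings) are all assumptions made for the example. In particular, "severing" is read here as cutting every edge that crosses the boundary of the identified cluster, which is only one plausible interpretation.

```python
import math
from collections import deque


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


class ConceptGraph:
    """Hypothetical semantic graph: nodes are concepts, edges carry similarity."""

    def __init__(self, edge_threshold=0.5):
        self.edge_threshold = edge_threshold  # assumed cutoff for creating an edge
        self.embeddings = {}                  # concept name -> vector
        self.adj = {}                         # concept name -> {neighbour: similarity}

    def add_concept(self, name, embedding):
        """Incremental construction: link the new node to similar existing nodes."""
        self.adj[name] = {}
        for other, other_emb in self.embeddings.items():
            sim = cosine(embedding, other_emb)
            if sim >= self.edge_threshold:
                self.adj[name][other] = sim
                self.adj[other][name] = sim
        self.embeddings[name] = embedding

    def find_cluster(self, target, decay=0.8, min_score=0.6, max_hops=3):
        """Multi-hop traversal; a path's score is multiplied by (similarity * decay) per hop."""
        scores = {target: 1.0}
        queue = deque([(target, 0)])
        while queue:
            node, hops = queue.popleft()
            if hops == max_hops:
                continue
            for nbr, sim in self.adj[node].items():
                score = scores[node] * sim * decay
                # Keep a neighbour only if its score clears the cutoff and improves on any earlier score.
                if score >= min_score and score > scores.get(nbr, 0.0):
                    scores[nbr] = score
                    queue.append((nbr, hops + 1))
        return scores

    def sever_edges(self, cluster):
        """Remove only edges that cross the cluster boundary; other edges are untouched."""
        removed = []
        for node in cluster:
            for nbr in list(self.adj[node]):
                if nbr not in cluster:
                    del self.adj[node][nbr]
                    del self.adj[nbr][node]
                    removed.append((node, nbr))
        return removed


# Toy usage with made-up 2-D vectors (assumption: real embeddings would come from a text encoder).
graph = ConceptGraph(edge_threshold=0.5)
graph.add_concept("nudity", [1.0, 0.0])
graph.add_concept("naked", [0.95, 0.1])
graph.add_concept("beach", [0.6, 0.8])
graph.add_concept("tree", [0.0, 1.0])

cluster = graph.find_cluster("nudity")
removed = graph.sever_edges(cluster)
```

With these toy vectors the cluster around "nudity" contains only "nudity" and "naked"; the two edges linking them to "beach" are cut, while the unrelated "beach"-"tree" edge survives, which mirrors the stated goal of preserving non-target semantics.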
