CASteer: Steering Diffusion Models for Controllable Generation
Tatiana Gaintseva, Andreea-Maria Oncescu, Chengcheng Ma, Ziquan Liu, Martin Benning, Gregory Slabaugh, Jiankang Deng, Ismail Elezi
TL;DR
This work introduces CASteer, a training-free framework for controllable concept erasure in diffusion models. It builds concept-specific steering vectors from paired positive/negative prompts and applies them to cross-attention (CA) outputs during inference, suppressing unwanted concepts while preserving overall image quality. By targeting CA layers and using a projection-based suppression scheme, CASteer achieves state-of-the-art erasure of both abstract and concrete concepts across diverse backbones (SD-1.4, SDXL, SANA) and their distilled variants, and extends to multiple concepts and implicit prompts. The approach offers a practical, scalable tool for safety and content control in generative imaging; its theoretical grounding is left to future work.
Abstract
Diffusion models have transformed image generation, yet controlling their outputs to reliably erase undesired concepts remains challenging. Existing approaches usually require task-specific training and struggle to generalize across both concrete (e.g., objects) and abstract (e.g., styles) concepts. We propose CASteer (Cross-Attention Steering), a training-free framework for concept erasure in diffusion models that uses steering vectors to influence hidden representations dynamically. CASteer precomputes concept-specific steering vectors by averaging neural activations from images generated for each target concept. During inference, it applies these vectors to suppress undesired concepts only when they appear, so that unrelated regions remain unaffected. This selective activation enables precise, context-aware erasure of harmful or unwanted content across a wide range of visual concepts, without model retraining and without degrading overall image quality. CASteer outperforms state-of-the-art concept erasure techniques while preserving unrelated content and minimizing unintended effects. Pseudocode is provided in the supplementary material.
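The two steps described above (precomputing a steering vector from averaged activations, then suppressing the concept only where it appears) can be illustrated with a minimal NumPy sketch. This is not the authors' pseudocode, which is in the supplementary. The function names, the mean-difference construction, the unit normalization, the clamp to positive projections, and the `beta` strength parameter are all assumptions made for illustration. Plain arrays stand in for cross-attention outputs of shape (tokens, dim).

```python
import numpy as np


def compute_steering_vector(pos_acts: np.ndarray, neg_acts: np.ndarray) -> np.ndarray:
    """Hypothetical helper: unit-norm difference of mean activations.

    pos_acts: (n_pos, dim) activations collected with the concept present.
    neg_acts: (n_neg, dim) activations collected with the concept absent.
    """
    diff = pos_acts.mean(axis=0) - neg_acts.mean(axis=0)
    norm = np.linalg.norm(diff)
    if norm == 0.0:
        raise ValueError("positive and negative mean activations are identical")
    return diff / norm


def suppress_concept(ca_out: np.ndarray, steer: np.ndarray, beta: float = 1.0) -> np.ndarray:
    """Hypothetical helper: projection-based suppression on (tokens, dim) outputs.

    Each token's component along `steer` is removed, but only when that
    component is positive (assumed here to mean "the concept is present").
    Tokens with a non-positive projection are returned unchanged.
    """
    coeff = ca_out @ steer                 # per-token projection, shape (tokens,)
    coeff = np.clip(coeff, 0.0, None)      # steer only where the concept appears
    return ca_out - beta * coeff[:, None] * steer[None, :]
```

In an actual diffusion pipeline, one such vector would presumably be stored per cross-attention layer (and possibly per denoising step), and the suppression would be applied inside the attention forward pass; those integration details are outside what this sketch covers.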
