Implicit Priors Editing in Stable Diffusion via Targeted Token Adjustment
Feng He, Chao Zhang, Zhixue Zhao
TL;DR
The paper addresses implicit priors in text-to-image prompts by introducing Embedit, a method that updates only the Word Token Embedding (WTE) of a target token to shift priors while leaving unrelated concepts and all model weights intact. By minimizing the distance between the last hidden states of the CLIP text encoder for the original and modified prompts, Embedit achieves precise edits without side effects on prompts that omit the edited token, at extremely low parameter cost (768 dimensions for SD 1.4, 2048 for SD XL), and supports sequential editing without model collapse. Probing shows that WTEs encode perceptual priors (e.g., color), which justifies targeted WTE edits. Empirical results demonstrate state-of-the-art performance on object editing and gender-bias mitigation across model scales, with robust generalization and clear degradation only in complex multi-word or rare-concept cases. The approach offers practical, reversible edits with substantial efficiency advantages over prior methods, enabling rapid, controlled updates to implicit knowledge embedded in diffusion-based T2I systems. Ethical considerations emphasize responsible use and transparency, including public code and supplementary materials to aid reproducibility.
Abstract
Implicit assumptions and priors are often necessary in text-to-image generation tasks, especially when textual prompts lack sufficient context. However, these assumptions can sometimes reflect outdated concepts, inaccuracies, or societal bias embedded in the training data. We present Embedding-only Editing (Embedit), a method designed to efficiently adjust implicit assumptions and priors in the model without affecting its interpretation of unrelated objects or its overall performance. Given a "source" prompt (e.g., "rose") that elicits an implicit assumption (e.g., a rose is red) and a "destination" prompt that specifies the desired attribute (e.g., "blue rose"), Embedit fine-tunes only the word token embedding (WTE) of the target object ("rose") to optimize the last hidden state of the text encoder in Stable Diffusion, a SOTA text-to-image model. This targeted adjustment prevents unintended effects on other objects in the model's knowledge base, as the WTEs for unrelated objects and the model weights remain unchanged. Consequently, when a prompt does not contain the edited object, all representations and model outputs are identical to those of the original, unedited model. Our method is highly efficient, modifying only 768 parameters for Stable Diffusion 1.4 and 2048 for XL in a single edit, matching the WTE dimension of each respective model. This minimal scope, combined with rapid execution, makes Embedit highly practical for real-world applications. Additionally, changes are easily reversible by restoring the original WTE layers. Our experimental results demonstrate that Embedit consistently outperforms previous methods across various models, tasks, and editing scenarios (both single and sequential multiple edits), achieving an improvement of at least 6.01 percentage points (from 87.17% to 93.18%).
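The editing objective described above can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the frozen "text encoder" is a hypothetical stand-in (mean pooling followed by a fixed linear map `W`) rather than CLIP, the distance is assumed to be squared Euclidean, and the names `encode`, `edit_embedding`, `DIM`, and the toy vocabulary are all introduced here for illustration. What the sketch does preserve is the structure of the method: only the embedding row of the target token is optimized so that the encoding of the source prompt moves toward the encoding of the destination prompt, while every other embedding and the encoder itself stay untouched.

```python
import random

DIM = 4  # toy embedding width (assumption); the abstract cites 768 for SD 1.4 and 2048 for XL


def encode(prompt, wte, W):
    """Toy frozen encoder (assumption, not CLIP): mean-pool token embeddings, then apply W."""
    tokens = prompt.split()
    pooled = [sum(wte[t][j] for t in tokens) / len(tokens) for j in range(DIM)]
    return [sum(W[i][j] * pooled[j] for j in range(DIM)) for i in range(DIM)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def edit_embedding(wte, W, source, destination, target, lr=0.1, steps=300):
    """Return a copy of wte in which only the row for `target` has been tuned."""
    # The goal is fixed up front, computed with the ORIGINAL embeddings.
    goal = encode(destination, wte, W)
    edited = {tok: list(vec) for tok, vec in wte.items()}
    src_tokens = source.split()
    # Weight of the target token inside the mean pool of the source prompt.
    share = src_tokens.count(target) / len(src_tokens)
    for _ in range(steps):
        residual = [h - g for h, g in zip(encode(source, edited, W), goal)]
        for j in range(DIM):
            # Analytic gradient of ||encode(source) - goal||^2 w.r.t. the target row.
            grad_j = 2.0 * share * sum(W[i][j] * residual[i] for i in range(DIM))
            edited[target][j] -= lr * grad_j
    return edited


rng = random.Random(0)
vocab = ["rose", "blue", "red", "tulip"]
wte = {tok: [rng.uniform(-1, 1) for _ in range(DIM)] for tok in vocab}
W = [[1.0 if i == j else 0.1 for j in range(DIM)] for i in range(DIM)]  # frozen weights

edited = edit_embedding(wte, W, source="rose", destination="blue rose", target="rose")

gap_before = sq_dist(encode("rose", wte, W), encode("blue rose", wte, W))
gap_after = sq_dist(encode("rose", edited, W), encode("blue rose", wte, W))
changed = [tok for tok in vocab if edited[tok] != wte[tok]]
print("distance before:", gap_before)
print("distance after: ", gap_after)
print("rows changed:   ", changed)
```

In this toy setting, a prompt that does not contain the edited token (for example "red tulip") encodes identically before and after the edit, and the edit is undone by copying the original row back, e.g. `edited["rose"] = list(wte["rose"])`. A real application would replace `encode` with the Stable Diffusion text encoder and use an autograd-based optimizer in place of the hand-written gradient.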
