ACCORD: Alleviating Concept Coupling through Dependence Regularization for Text-to-Image Diffusion Personalization
Shizhan Liu, Hao Zheng, Hang Yu, Jianguo Li
TL;DR
The paper tackles concept coupling in text-to-image personalization by formulating it as unintended dependencies between a personalization target and other concepts. It introduces ACCORD, which splits the problem into two computable dependence discrepancies and provides two plug-and-play losses, Denoising Decouple Loss and Prior Decouple Loss, to minimize them without extra data. The approach is validated on subject, style, and zero-shot face personalization, where it improves text control and personalization fidelity over strong baselines, and ablation studies support these results. Its plug-and-play nature and theoretical grounding offer a practical path to more faithful personalized generation with diffusion models.
Abstract
Image personalization has garnered attention for its ability to customize Text-to-Image generation using only a few reference images. However, a key challenge in image personalization is concept coupling, where the limited number of reference images leads the model to form unwanted associations between the personalization target and other concepts. Current methods attempt to tackle this issue indirectly, leading to a suboptimal balance between text control and personalization fidelity. In this paper, we take a direct approach to the concept coupling problem through statistical analysis, revealing that it stems from two distinct sources of dependence discrepancies. We therefore propose two complementary plug-and-play loss functions, Denoising Decouple Loss and Prior Decouple Loss, each designed to minimize one type of dependence discrepancy. Extensive experiments demonstrate that our approach achieves a superior trade-off between text control and personalization fidelity.
