ConceptGuard: Continual Personalized Text-to-Image Generation with Forgetting and Confusion Mitigation
Zirun Guo, Tao Jin
TL;DR
The paper tackles continual diffusion-based text-to-image customization, where sequentially arriving concepts cause forgetting and concept confusion. It introduces ConceptGuard, a framework that integrates shift embeddings, concept-binding prompts, memory preservation regularization, and a priority queue to adaptively manage past concepts during continual learning. Empirical results show consistent improvements over baselines in both single- and multi-concept generation, with higher alignment scores and lower forgetting metrics, and ablations highlight the value of each component, especially concept-binding prompts. This work enables more robust, scalable, and practical continual personalization for diffusion models in real-world applications.
Abstract
Diffusion customization methods have achieved impressive results with only a minimal number of user-provided images. However, existing approaches customize concepts collectively, whereas real-world applications often require sequential concept integration. This sequential nature can lead to catastrophic forgetting, where previously learned concepts are lost. In this paper, we investigate concept forgetting and concept confusion in continual customization. To tackle these challenges, we present ConceptGuard, a comprehensive approach that combines shift embedding, concept-binding prompts, and memory preservation regularization, supplemented by a priority queue that adaptively updates the importance and occurrence order of different concepts. These strategies dynamically update, unbind, and learn the relationships among previous concepts, thus alleviating concept forgetting and confusion. Through comprehensive experiments, we show that our approach consistently and significantly outperforms all baseline methods in both quantitative and qualitative analyses.
