PureCC: Pure Learning for Text-to-Image Concept Customization

Zhichao Liao; Xiaole Xian; Qingyu Li; Wenyu Qin; Meng Wang; Weicheng Xie; Siyang Song; Pingfa Feng; Long Zeng; Liang Pan

TL;DR

PureCC introduces a decoupled learning objective for concept customization that combines the implicit guidance of the target concept with the original conditional prediction. An adaptive guidance scale $\lambda^\star$ dynamically adjusts the guidance strength of the target concept, balancing customization fidelity against model preservation.

Abstract

Existing concept customization methods have achieved remarkable outcomes in high-fidelity and multi-concept customization. However, they often neglect the influence on the original model's behavior and capabilities when learning new personalized concepts. To address this issue, we propose PureCC. PureCC introduces a novel decoupled learning objective for concept customization, which combines the implicit guidance of the target concept with the original conditional prediction. This separated form enables PureCC to substantially focus on the original model during training. Moreover, based on this objective, PureCC designs a dual-branch training pipeline that includes a frozen extractor providing purified target concept representations as implicit guidance and a trainable flow model producing the original conditional prediction, jointly achieving pure learning for personalized concepts. Furthermore, PureCC introduces a novel adaptive guidance scale $\lambda^\star$ to dynamically adjust the guidance strength of the target concept, balancing customization fidelity and model preservation. Extensive experiments show that PureCC achieves state-of-the-art performance in preserving the original behavior and capabilities while enabling high-fidelity concept customization. The code is available at https://github.com/lzc-sg/PureCC.
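To make the idea of a decoupled objective more concrete, the sketch below builds a training target from two separate terms: the original conditional prediction and a scaled guidance term for the target concept. The abstract does not state the formula, so everything here is an assumption made for illustration: the additive combination, the names `decoupled_target` and `mse`, and the use of plain Python lists in place of velocity tensors are all hypothetical and are not taken from the PureCC code.

```python
from typing import List, Sequence


def decoupled_target(
    original_pred: Sequence[float],
    concept_guidance: Sequence[float],
    lam: float,
) -> List[float]:
    """Hypothetical target: original prediction plus lam-scaled guidance.

    Assumption: the two terms are combined additively. The source only says
    they are "combined"; the exact form is not given there.
    """
    if len(original_pred) != len(concept_guidance):
        raise ValueError("both inputs must have the same length")
    return [o + lam * g for o, g in zip(original_pred, concept_guidance)]


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between a prediction and a target."""
    if len(pred) != len(target):
        raise ValueError("both inputs must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


# Toy values standing in for flattened velocity predictions.
original = [1.0, 2.0]
guidance = [0.5, -1.0]

# lam = 0 leaves the original prediction untouched.
unchanged = decoupled_target(original, guidance, lam=0.0)

# A larger lam moves the target further toward the concept.
shifted = decoupled_target(original, guidance, lam=2.0)
```

Under this assumed form, a guidance scale of zero reproduces the original prediction exactly, and increasing the scale moves the target further from it. That is one simple way to picture the trade-off between customization fidelity and model preservation that the adaptive scale $\lambda^\star$ is meant to balance.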

Paper Structure (24 sections, 15 equations, 17 figures, 7 tables, 1 algorithm)

Introduction
Related Work
Preliminary
Methodology
Learning Objective in PureCC
Representation Extractor
Pure Learning Pipeline in PureCC
Adaptive Guidance Scale $\lambda^\star$
Experiments
Experimental Setup
Qualitative Evaluation
Quantitative Evaluation
Ablation Study
Conclusion
Acknowledgement
...and 9 more sections

Figures (17)

Figure 1: We introduce PureCC, a novel concept customization approach. (a) PureCC preserves target-unrelated image elements and the original model's behavior after the personalized concept is inserted. (b) Existing methods such as DreamBooth ruiz2023dreambooth and LoRA hu2021lora fail to follow the prompt 'placed on a bright window' during custom generation. (c) The declining curves indicate that existing methods compromise the original model's prompt adherence (CLIP-T radford2021learning) and its ability to generate high-quality images (HPSv2.1 wu2023human).
Figure 2: Original Distribution Drift. Visualization and KL divergence results demonstrate that existing methods, which adjust pre-trained models to align with the target distribution when learning personalized concepts, cause the original distribution to drift.
Figure 3: Overview of PureCC. (a) We first fine-tune a flow model on the custom set to serve as the representation extractor. (b) During the pure learning stage, the representation extractor remains frozen and provides the target concept representation, which is controlled by our adaptive scale $\lambda^\star$ to implicitly guide the trainable model. The trainable model is initialized from another pre-trained flow model and provides the original conditional prediction using the Base Text as input. The entire pipeline is trained on the custom set using $\mathcal{L}_{PureCC}$ and $\mathcal{L}_{CC}$. (c) The process of using our designed $\mathcal{L}_{PureCC}$ to purely learn the target concept in the velocity flow space.
Figure 4: Motivation of Adaptive Scale $\lambda^\star$. A small $\lambda$ preserves the original model's behavior and capabilities but reduces the fidelity of the target concept. Conversely, when $\lambda$ is excessively large, the personalized concept dominates the learning objective, causing the final distribution to drift away from the original distribution. This degrades the model's generative ability: the underlying prompt is no longer followed, and CLIP-T and HPSv2.1 scores drop.
Figure 5: Qualitative comparison with SOTA methods, including tuning-based methods (DreamBooth ruiz2023dreambooth, DreamBooth + EWC serra2018overcoming, Mix-of-Show gu2023mix, CIFC dong2024continually) and tuning-free methods (DreamO mou2025dreamo, UNO wu2025less).
...and 12 more figures
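The Figure 2 caption reports distribution drift using KL divergence. As a minimal sketch of that quantity, the function below computes KL(p || q) for two discrete distributions. How the paper actually estimates the distributions is not described in this summary, so the histogram inputs, the function name `kl_divergence`, and the `eps` floor are assumptions made for illustration only.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """KL(p || q) in nats for two discrete distributions over the same bins.

    Assumption: p and q are already normalized histograms. Bins where p is
    zero contribute nothing; q is floored at eps to avoid division by zero.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            total += pi * math.log(pi / max(qi, eps))
    return total


# Toy histograms: an "original" distribution and a "drifted" one.
original_hist = [0.5, 0.5]
drifted_hist = [0.9, 0.1]

no_drift = kl_divergence(original_hist, original_hist)
drift = kl_divergence(original_hist, drifted_hist)
```

Identical distributions give a divergence of zero, and the value grows as the second distribution moves away from the first, which is why it is a natural summary of how far a fine-tuned model has drifted from the original one.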
