ConceptSplit: Decoupled Multi-Concept Personalization of Diffusion Models via Token-wise Adaptation and Attention Disentanglement

Habin Lim; Yeongseob Won; Juwon Seo; Gyeong-Moon Park

TL;DR

ConceptSplit addresses the problem of concept mixing in multi-concept diffusion model personalization by decoupling concept adaptation and attention control. It introduces Token-wise Value Adaptation (ToVA), which updates only the value projection for targeted tokens to avoid merging adapters, and Latent Optimization for Disentangled Attention (LODA), a two-stage latent-space approach that separates and then fixes attention to reduce entanglement. The method yields merging-free personalization with preserved token-attention binding and state-of-the-art disentanglement across benchmarks, validated by both quantitative metrics and qualitative analysis. It improves compositional fidelity and reduces interference, offering practical benefits for robust multi-concept synthesis in diffusion-based image generation, with code available online.

Abstract

Multi-concept personalization of text-to-image (T2I) diffusion models, which aims to represent several subjects in a single image, has attracted growing attention in recent years. The main challenge of this task is "concept mixing", where multiple learned concepts interfere with one another or blend undesirably in the output image. To address this issue, we present ConceptSplit, a novel framework that separates individual concepts during both training and inference. Our framework comprises two key components. First, we introduce Token-wise Value Adaptation (ToVA), a merging-free training method that adapts only the value projection in cross-attention. Our empirical analysis shows that modifying the key projection, a common approach in existing methods, can disrupt the attention mechanism and lead to concept mixing. Second, we propose Latent Optimization for Disentangled Attention (LODA), which alleviates attention entanglement during inference by optimizing the input latent. Through extensive qualitative and quantitative experiments, we demonstrate that ConceptSplit achieves robust multi-concept personalization and mitigates unintended concept interference. Code is available at https://github.com/KU-VGI/ConceptSplit
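
The intuition behind adapting only the value projection can be illustrated with a toy single-query cross-attention. In cross-attention the weights are computed from queries and keys alone, so a change applied only to the values of one token leaves the attention map untouched and alters only what that token contributes to the output. The sketch below is not the authors' implementation: the function `cross_attention`, the `value_deltas` argument (a hypothetical stand-in for a per-token value adapter), and all numbers are assumptions chosen for illustration, and plain Python lists are used to keep it dependency-free.

```python
import math


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(query, token_embs, W_k, W_v, value_deltas=None):
    """Single-query cross-attention over text-token embeddings.

    value_deltas (hypothetical) maps a token index to an extra matrix
    that is added to W_v for that token only; keys are never modified.
    """
    value_deltas = value_deltas or {}
    keys = [matvec(W_k, e) for e in token_embs]
    values = []
    for i, e in enumerate(token_embs):
        v = matvec(W_v, e)
        if i in value_deltas:
            # Token-wise adjustment applied to the value of token i only.
            dv = matvec(value_deltas[i], e)
            v = [a + b for a, b in zip(v, dv)]
        values.append(v)
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    out = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, out


# Toy data: three text tokens with 2-dimensional embeddings (made up).
token_embs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W_k = [[0.5, -0.2], [0.1, 0.3]]
W_v = [[1.0, 0.0], [0.0, 1.0]]
query = [0.7, -0.4]

# Adapt only the value of token 1 (the hypothetical "concept" token).
delta = {1: [[0.0, 2.0], [0.0, -1.0]]}

base_w, base_out = cross_attention(query, token_embs, W_k, W_v)
tova_w, tova_out = cross_attention(query, token_embs, W_k, W_v, delta)

print("attention weights unchanged:", base_w == tova_w)
print("output changed:", base_out != tova_out)
```

Because the keys are shared between the two calls, both produce the same attention weights, and the output shifts by exactly the attention weight of token 1 times its value adjustment. Modifying `W_k` instead would change the scores themselves, and with them how attention is distributed across all tokens.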
