LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers
Yusuf Dalva, Hidir Yesiltepe, Pinar Yanardag
TL;DR
The paper tackles multi-concept personalized editing in diffusion models by addressing LoRA cross-talk without retraining. It introduces LoRAShop, a training-free pipeline that extracts region-specific subject priors from Flux rectified-flow transformers and performs per-token residual blending to insert multiple concepts into an image. Subject priors $\hat{M}_{c'}$ are derived from the last double-stream block via $M_{c'} = \operatorname{softmax}(Q_i K_{c'}^{\mathsf T}/\sqrt{d})$, then smoothed and binarized to obtain non-overlapping masks $\hat{M}_u$. These masks guide blending with per-token weights $\alpha_{c'}(p) = \frac{\hat{M}_{c'}(p)}{\sum_u \hat{M}_u(p) + \varepsilon}$. Experiments show improved identity preservation and natural composition for single and multiple concepts. The framework supports editing of both real and generated images and enables rapid creative iteration without training.
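The per-token weighting above can be illustrated with a small sketch. It is not the authors' implementation: the function names, the threshold, the argmax rule for resolving overlaps, the value of $\varepsilon$, and the use of scalar features per token are all assumptions made for illustration. Only the weight formula $\alpha_{c'}(p)$ is taken from the text.

```python
EPS = 1e-8  # epsilon from the weight formula; this value is an assumption


def binarize_disjoint(scores, threshold=0.5):
    """scores[c][p]: smoothed attention score of concept c at token p.

    Returns binary masks in which each token belongs to at most one concept:
    the highest-scoring one, provided it clears the threshold. Both the
    threshold and the argmax rule are illustrative assumptions.
    """
    n_concepts, n_tokens = len(scores), len(scores[0])
    masks = [[0.0] * n_tokens for _ in range(n_concepts)]
    for p in range(n_tokens):
        best = max(range(n_concepts), key=lambda c: scores[c][p])
        if scores[best][p] >= threshold:
            masks[best][p] = 1.0
    return masks


def blend_weights(masks, eps=EPS):
    """alpha_c(p) = M_c(p) / (sum_u M_u(p) + eps), as in the formula above."""
    n_tokens = len(masks[0])
    totals = [sum(m[p] for m in masks) for p in range(n_tokens)]
    return [[m[p] / (totals[p] + eps) for p in range(n_tokens)] for m in masks]


def blend_residuals(base, residuals, weights):
    """Add each concept's LoRA residual to the base feature, token by token.

    base[p] is the base feature at token p (a scalar here for brevity);
    residuals[c][p] is the hypothetical residual of LoRA c at token p.
    """
    out = list(base)
    for res_c, w_c in zip(residuals, weights):
        for p, (r, w) in enumerate(zip(res_c, w_c)):
            out[p] += w * r
    return out


# Toy example: two concepts, four image tokens.
scores = [[0.9, 0.6, 0.1, 0.2],
          [0.2, 0.7, 0.8, 0.1]]
masks = binarize_disjoint(scores)
weights = blend_weights(masks)
base = [1.0, 1.0, 1.0, 1.0]
residuals = [[0.5] * 4, [-0.5] * 4]
out = blend_residuals(base, residuals, weights)
```

In this toy example the first token is assigned to the first concept, the second and third tokens to the second concept, and the last token to neither. The last token therefore keeps its base feature, which is the sense in which blending is confined to the masked regions.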
Abstract
We introduce LoRAShop, the first framework for multi-concept image editing with LoRA models. LoRAShop builds on a key observation about the feature interaction patterns inside Flux-style diffusion transformers: concept-specific transformer features activate spatially coherent regions early in the denoising process. We harness this observation to derive a disentangled latent mask for each concept in a prior forward pass and blend the corresponding LoRA weights only within regions bounding the concepts to be personalized. The resulting edits seamlessly integrate multiple subjects or styles into the original scene while preserving global context, lighting, and fine details. Our experiments demonstrate that LoRAShop delivers better identity preservation compared to baselines. By eliminating retraining and external constraints, LoRAShop turns personalized diffusion models into a practical "photoshop-with-LoRAs" tool and opens new avenues for compositional visual storytelling and rapid creative iteration.
