Modular Customization of Diffusion Models via Blockwise-Parameterized Low-Rank Adaptation

Mingkang Zhu; Xi Chen; Zhongdao Wang; Bei Yu; Hengshuang Zhao; Jiaya Jia

TL;DR

The paper tackles modular, scalable diffusion-model customization to merge multiple user-trained concepts without erasing their identities. It identifies identity interference and identity loss as key obstacles and introduces BlockLoRA, which combines Randomized Output Erasure and Blockwise LoRA Parameterization to enforce niche concept representations and disjoint parameter updates. This enables instant merging of up to 15 concepts with high fidelity, demonstrated through extensive CLIP-based evaluations and qualitative analyses against strong baselines. The approach offers a plug-and-play pathway for multi-concept customization and concept stylization in diffusion models, with practical implications for decentralized concept sharing and collaboration.

Abstract

Recent work on diffusion model customization has shown impressive results in incorporating subject or style concepts from a handful of images. However, modular composition of multiple concepts into a single customized model, which aims to efficiently merge concepts trained in a decentralized manner without affecting their identities, remains unresolved. Modular customization is essential for applications such as concept stylization and multi-concept customization using concepts trained by different users. Existing post-training methods are confined to a fixed set of concepts, and any different combination requires a new round of retraining. In contrast, instant merging methods often cause identity loss and interference among the merged concepts and are usually limited to a small number of concepts. To address these issues, we propose BlockLoRA, an instant merging method designed to efficiently combine multiple concepts while accurately preserving each concept's identity. Based on a careful analysis of the underlying cause of interference, we develop the Randomized Output Erasure technique to minimize interference between different customized models. Additionally, we propose Blockwise LoRA Parameterization to reduce identity loss during instant model merging. Extensive experiments validate the effectiveness of BlockLoRA, which can instantly merge 15 concepts of people, subjects, scenes, and styles with high fidelity.
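
The idea of disjoint parameter updates can be illustrated with a toy example. The sketch below is not the authors' implementation: the block layout, the frozen shared down-projection `A`, the helper `blockwise_up`, and the concept names are all assumptions made for illustration. It only shows why updates confined to separate blocks can be merged by plain summation without one concept overwriting another.

```python
# Toy illustration of merging low-rank updates that occupy disjoint blocks.
# Assumption: each concept trains only its own column block of the
# up-projection B, while a shared down-projection A stays frozen.
# This layout is hypothetical and is not taken from the paper.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def apply(delta, x):
    """Apply a weight update to an input vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in delta]

def blockwise_up(block_values, block_index, num_blocks):
    """Embed a concept's trained block into a full up-projection.

    All columns outside the concept's own block are zero.
    """
    r_block = len(block_values[0])
    start = block_index * r_block
    rows = []
    for row in block_values:
        full = [0.0] * (r_block * num_blocks)
        full[start:start + r_block] = row
        rows.append(full)
    return rows

# Frozen shared down-projection (rank 4, input dim 4). The identity matrix
# is used here only because its rows are trivially orthonormal.
A = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

# Two hypothetical concepts, each owning one rank block of size 2.
B_cat = blockwise_up([[1.0, 2.0], [3.0, 4.0]], block_index=0, num_blocks=2)
B_dog = blockwise_up([[5.0, 6.0], [7.0, 8.0]], block_index=1, num_blocks=2)

delta_cat = matmul(B_cat, A)
delta_dog = matmul(B_dog, A)

# "Instant merging" in this sketch is plain summation, with no retraining.
delta_merged = add(delta_cat, delta_dog)

# An input that lies in the first concept's block sees the same response
# from the merged update as from that concept's update alone.
x_cat = [1.0, 1.0, 0.0, 0.0]
merged_response = apply(delta_merged, x_cat)
solo_response = apply(delta_cat, x_cat)
```

In this toy setting `merged_response` equals `solo_response`, because the second concept's block contributes nothing along the first concept's directions. Real LoRA updates that overlap in the same parameters would not have this property, which is one way to picture the identity loss the abstract describes.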
