MultLFG: Training-free Multi-LoRA composition using Frequency-domain Guidance
Aniket Roy, Maitreya Suin, Ketul Shah, Rama Chellappa
TL;DR
MultLFG addresses training-free multi-LoRA composition with frequency-guided, adaptive fusion in the wavelet domain. It decomposes latent and image representations into multi-scale frequency subbands and, within each subband, applies timestep-aware adaptive weights to the top-k LoRAs, which reduces concept interference and improves compositional fidelity. The approach achieves consistent gains over multiple baselines on the ComposLoRA benchmark, as measured by CLIP-based metrics, GPT-4V evaluations, and human studies. Ablations underline the contribution of frequency guidance and adaptive merging. This work offers a practical, training-free pathway to more controllable and reliable multi-concept image synthesis in diffusion models.
Abstract
Low-Rank Adaptation (LoRA) has gained prominence as a computationally efficient method for fine-tuning generative models, enabling distinct visual concept synthesis with minimal overhead. However, current methods struggle to effectively merge multiple LoRA adapters without training, particularly in complex compositions involving diverse visual elements. We introduce MultLFG, a novel framework for training-free multi-LoRA composition that utilizes frequency-domain guidance to achieve adaptive fusion of multiple LoRAs. Unlike existing methods that uniformly aggregate concept-specific LoRAs, MultLFG employs a fusion strategy that adapts to both the timestep and the frequency subband, selectively activating LoRAs based on their content relevance at specific timesteps and in specific frequency bands. This frequency-sensitive guidance not only improves spatial coherence but also provides finer control over multi-LoRA composition, leading to more accurate and consistent results. Experimental evaluations on the ComposLoRA benchmark reveal that MultLFG substantially enhances compositional fidelity and image quality across various styles and concept sets, outperforming state-of-the-art baselines in multi-concept generation tasks. Code will be released.
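
The per-subband, top-k fusion idea described above can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes a one-level Haar wavelet, uses subband energy as a placeholder relevance score, and uses a softmax with a `temperature` parameter as a stand-in for the adaptive weighting. The function names `haar_dwt2`, `haar_idwt2`, and `fuse_topk` are hypothetical, and the subband labels follow one arbitrary naming convention.

```python
import numpy as np

def haar_dwt2(x):
    """One-level 2D Haar transform of an (H, W) array with even H and W."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    return {
        "LL": (a + b + c + d) / 2.0,  # low-frequency approximation
        "LH": (a - b + c - d) / 2.0,  # detail along the column direction
        "HL": (a + b - c - d) / 2.0,  # detail along the row direction
        "HH": (a - b - c + d) / 2.0,  # diagonal detail
    }

def haar_idwt2(bands):
    """Inverse of haar_dwt2: rebuild the (2h, 2w) array from four subbands."""
    ll, lh, hl, hh = bands["LL"], bands["LH"], bands["HL"], bands["HH"]
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

def fuse_topk(preds, k=2, temperature=1.0):
    """Fuse per-LoRA predictions subband by subband.

    preds: list of (H, W) arrays, one per LoRA (assumed inputs).
    The relevance score below (mean subband energy) is an assumption,
    not the scoring rule from the paper.
    """
    decomposed = [haar_dwt2(p) for p in preds]
    fused = {}
    for name in ("LL", "LH", "HL", "HH"):
        stack = np.stack([d[name] for d in decomposed])  # (n_loras, h, w)
        scores = np.mean(stack ** 2, axis=(1, 2))        # placeholder relevance
        top = np.argsort(scores)[-k:]                    # indices of the top-k LoRAs
        logits = scores[top] / temperature
        weights = np.exp(logits - logits.max())          # stable softmax
        weights /= weights.sum()
        fused[name] = np.tensordot(weights, stack[top], axes=1)
    return haar_idwt2(fused)
```

In a denoising loop, one could call `fuse_topk` once per step and vary `k` or `temperature` with the timestep to mimic timestep-aware weighting; that schedule is likewise an assumption rather than something specified in the abstract.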
