FreeFuse: Multi-Subject LoRA Fusion via Auto Masking at Test Time

Yaoli Liu; Yao-Xiang Ding; Kun Zhou

TL;DR

FreeFuse tackles multi-subject generation in diffusion-based text-to-image models by deriving context-aware masks from cross-attention maps and applying them to LoRA outputs at test time. The approach requires no training, no modification to the LoRAs, no external segmentation model, and no user-provided region prompts. It mitigates inter-LoRA conflicts through attention-sink handling, self-attention locality, and a superpixel-based ensemble voting strategy. Evaluated against strong baselines, the method improves subject fidelity, prompt adherence, and image quality in scenes with challenging subject interactions. It enables practical, scalable multi-subject generation within standard diffusion workflows.

Abstract

This paper proposes FreeFuse, a novel training-free approach for multi-subject text-to-image generation through automatic fusion of multiple subject LoRAs. In contrast to existing methods that either focus on pre-inference LoRA weight merging or rely on segmentation models and complex techniques like noise blending to isolate LoRA outputs, our key insight is that context-aware dynamic subject masks can be automatically derived from cross-attention layer weights. Mathematical analysis shows that directly applying these masks to LoRA outputs during inference closely approximates the case where each subject LoRA is integrated into the diffusion model and used individually for its masked region. FreeFuse demonstrates superior practicality and efficiency as it requires no additional training, no modification to LoRAs, no auxiliary models, and no user-defined prompt templates or region specifications. Instead, it only requires users to provide the LoRA activation words, allowing seamless integration into standard workflows. Extensive experiments validate that FreeFuse outperforms existing approaches in both generation quality and usability on multi-subject generation tasks. The project page is at https://future-item.github.io/FreeFuse/
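The core idea of masking LoRA outputs can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `subject_masks` and `fuse_lora_outputs`, the per-position argmax assignment, and the toy data layout are all assumptions made for illustration. The attention-sink handling, self-attention locality, and superpixel voting mentioned in the TL;DR are not reproduced here.

```python
from typing import List


def subject_masks(attn: List[List[float]]) -> List[List[int]]:
    """Build one binary mask per subject from cross-attention scores.

    attn[s][p] is the (hypothetical) attention mass that image position p
    places on the activation-word tokens of subject s. Each position is
    assigned to the subject with the highest score (an assumed rule).
    """
    num_subjects = len(attn)
    num_positions = len(attn[0])
    masks = [[0] * num_positions for _ in range(num_subjects)]
    for p in range(num_positions):
        winner = max(range(num_subjects), key=lambda s: attn[s][p])
        masks[winner][p] = 1
    return masks


def fuse_lora_outputs(
    base: List[List[float]],
    lora_deltas: List[List[List[float]]],
    masks: List[List[int]],
) -> List[List[float]]:
    """Add each LoRA's output only at the positions its mask selects.

    base[p][c] is the frozen layer's output at position p, channel c.
    lora_deltas[s][p][c] is the output of subject s's LoRA branch.
    """
    fused = [row[:] for row in base]  # copy so the input stays untouched
    for delta, mask in zip(lora_deltas, masks):
        for p, keep in enumerate(mask):
            if keep:
                for c in range(len(fused[p])):
                    fused[p][c] += delta[p][c]
    return fused


# Toy example: 2 subjects, 4 image positions, 1 channel.
attn = [[0.9, 0.7, 0.2, 0.1], [0.1, 0.3, 0.8, 0.9]]
masks = subject_masks(attn)
base = [[0.0], [0.0], [0.0], [0.0]]
deltas = [[[1.0]] * 4, [[2.0]] * 4]
fused = fuse_lora_outputs(base, deltas, masks)
```

In this toy setup the first two positions receive only the first LoRA's contribution and the last two only the second's, so each LoRA acts as if it were applied alone within its own region.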
