SplitFlux: Learning to Decouple Content and Style from a Single Image
Yitong Yang, Yinglin Wang, Changshuo Wang, Yongjun Zhang, Ziyang Chen, Shuting He
TL;DR
SplitFlux analyzes the Flux diffusion model to identify which blocks control content and which control style, then introduces two components, Rank-Constrained Adaptation (RCA) and Visual-Gated LoRA (VGRA), to disentangle content from style and re-embed the disentangled content into new contexts. RCA constrains low-rank updates at semantic boundary blocks to preserve identity and structure. VGRA uses saliency-guided routing to assign high-rank content updates to foreground regions and low-rank residuals to details, with a complementary loss that promotes diverse representations. Experiments show improved content preservation and competitive stylization across settings, outperforming prior LoRA-based methods in both disentanglement and recontextualization tasks with greater efficiency. Overall, SplitFlux provides a practical, parameter-efficient framework for single-image content-style disentanglement in diffusion-based generation.
Abstract
Disentangling image content and style is essential for customized image generation. Existing SDXL-based methods struggle to achieve high-quality results, while the recently proposed Flux model fails to achieve effective content-style separation due to its underexplored characteristics. To address these challenges, we conduct a systematic analysis of Flux and make two key observations: (1) Single Stream Blocks are essential for image generation; and (2) Early single stream blocks mainly control content, whereas later blocks govern style. Based on these insights, we propose SplitFlux, which disentangles content and style by fine-tuning the single stream blocks via LoRA, enabling the disentangled content to be re-embedded into new contexts. It includes two key components: (1) Rank-Constrained Adaptation. To preserve content identity and structure, we compress the rank and amplify the magnitude of updates within specific blocks, preventing content leakage into style blocks. (2) Visual-Gated LoRA. We split the content LoRA into two branches with different ranks, guided by image saliency. The high-rank branch preserves primary subject information, while the low-rank branch encodes residual details, mitigating content overfitting and enabling seamless re-embedding. Extensive experiments demonstrate that SplitFlux consistently outperforms state-of-the-art methods, achieving superior content preservation and stylization quality across diverse scenarios.
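To make the two-branch idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a soft per-token gate: each token's update is a saliency-weighted blend of a high-rank LoRA branch and a low-rank LoRA branch, multiplied by a `scale` factor that loosely mirrors the "amplify the magnitude" idea. The names `make_lora` and `gated_lora_delta`, the soft blending rule, and the plain-list matrices are all illustrative assumptions; the abstract does not specify the exact gating function or which layers are adapted.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def make_lora(d_in, d_out, rank, rng):
    """Return (A, B) for one LoRA branch.

    A is d_in x rank with small random values and B is rank x d_out with
    zeros, so a freshly created branch contributes no update.
    """
    A = [[rng.gauss(0.0, 0.02) for _ in range(rank)] for _ in range(d_in)]
    B = [[0.0] * d_out for _ in range(rank)]
    return A, B


def gated_lora_delta(tokens, saliency, high, low, scale=1.0):
    """Blend a high-rank and a low-rank LoRA update per token.

    tokens:   n x d_in token features
    saliency: n values in [0, 1]; 1 means foreground (assumed soft gate)
    high/low: (A, B) pairs from make_lora with different ranks
    scale:    multiplier applied to the blended update
    """
    hi = matmul(matmul(tokens, high[0]), high[1])
    lo = matmul(matmul(tokens, low[0]), low[1])
    return [[scale * (s * h + (1.0 - s) * l) for h, l in zip(h_row, l_row)]
            for s, h_row, l_row in zip(saliency, hi, lo)]


if __name__ == "__main__":
    rng = random.Random(0)
    high = make_lora(d_in=4, d_out=4, rank=4, rng=rng)  # subject branch
    low = make_lora(d_in=4, d_out=4, rank=1, rng=rng)   # residual branch
    tokens = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
    # First token is treated as foreground, second as background.
    delta = gated_lora_delta(tokens, [1.0, 0.0], high, low, scale=2.0)
    print(len(delta), len(delta[0]))
```

In this sketch a token with saliency 1 receives only the high-rank update and a token with saliency 0 receives only the low-rank update. A hard mask or a learned gate would be equally plausible readings of "guided by image saliency".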
