DuoLoRA: Cycle-consistent and Rank-disentangled Content-Style Personalization
Aniket Roy, Shubhankar Borse, Shreya Kadambi, Debasmit Das, Shweta Mahajan, Risheek Garrepalli, Hyojin Park, Ankita Nayak, Rama Chellappa, Munawar Hayat, Fatih Porikli
TL;DR
DuoLoRA addresses joint content-style personalization in diffusion models with three components: ZipRank, which merges content and style LoRAs in the rank dimension and enables adaptive-rank merging with far fewer learnable parameters; layer-prior merging; and the Constyle cycle-consistency loss. It provides theoretical guarantees, including $E_{rank} \le E_{out}$ under equal parameter budgets, and uses a nuclear-norm relaxation with a Lagrangian penalty to enforce rank constraints. Empirically, DuoLoRA outperforms state-of-the-art baselines across multiple benchmarks and is validated via user studies, demonstrating practical efficiency and quality in content-style blending. The approach scales to multi-concept stylization and supports recontextualization, with significant improvements in both objective metrics (DINO, CLIP-I, CLIP-T, CSD-s) and human judgments.
Abstract
We tackle the challenge of jointly personalizing content and style from a few examples. A promising approach is to train separate Low-Rank Adapters (LoRA) and merge them effectively, preserving both content and style. Existing methods, such as ZipLoRA, treat content and style as independent entities, merging them by learning masks in LoRA's output dimensions. However, content and style are intertwined, not independent. To address this, we propose DuoLoRA, a content-style personalization framework featuring three key components: (i) rank-dimension mask learning, (ii) effective merging via layer priors, and (iii) Constyle loss, which leverages cycle-consistency in the merging process. First, we introduce ZipRank, which performs content-style merging within the rank dimension, offering adaptive rank flexibility and significantly reducing the number of learnable parameters. Additionally, we incorporate SDXL layer priors to apply implicit rank constraints informed by each layer's content-style bias and adaptive merger initialization, enhancing the integration of content and style. To further refine the merging process, we introduce Constyle loss, which leverages the cycle-consistency between content and style. Our experimental results demonstrate that DuoLoRA outperforms state-of-the-art content-style merging methods across multiple benchmarks.
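To make the contrast between output-dimension and rank-dimension masking concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes each LoRA update has the usual form $\Delta W = BA$ with $B \in \mathbb{R}^{d_{out} \times r}$ and $A \in \mathbb{R}^{r \times d_{in}}$, and that a rank mask acts as $B\,\mathrm{diag}(m)\,A$. The function names, the layer sizes, and the exact form of the output-dimension mask are illustrative assumptions; the layer priors and the Constyle loss are not modeled here.

```python
import numpy as np

def lora_delta(B, A, rank_mask=None):
    """Return B @ diag(rank_mask) @ A; the mask defaults to all ones."""
    r = B.shape[1]
    if rank_mask is None:
        rank_mask = np.ones(r)
    # Scaling column k of B is equivalent to scaling rank component k.
    return (B * rank_mask) @ A

def merge_rank_masked(Bc, Ac, Bs, As, mc, ms):
    """Hypothetical rank-dimension merge: one mask entry per rank component."""
    return lora_delta(Bc, Ac, mc) + lora_delta(Bs, As, ms)

def merge_output_masked(Bc, Ac, Bs, As, mc, ms):
    """Hypothetical output-dimension merge: one mask entry per output row."""
    return mc[:, None] * (Bc @ Ac) + ms[:, None] * (Bs @ As)

# Illustrative sizes only (assumed, not taken from the paper).
rng = np.random.default_rng(0)
d_out, d_in, r = 64, 32, 4
Bc, Bs = rng.normal(size=(d_out, r)), rng.normal(size=(d_out, r))
Ac, As = rng.normal(size=(r, d_in)), rng.normal(size=(r, d_in))

# Learnable mask parameters per layer under each scheme (content + style).
rank_params = 2 * r
output_params = 2 * d_out

merged = merge_rank_masked(Bc, Ac, Bs, As, np.ones(r), np.ones(r))

# Zeroing mask entries switches off rank components, which is one way to
# read "adaptive rank flexibility" in this simplified setting.
pruned = lora_delta(Bc, Ac, np.array([1.0, 1.0, 0.0, 0.0]))
pruned_rank = np.linalg.matrix_rank(pruned)
```

Under these assumptions the rank-dimension scheme learns `2 * r` mask values per layer against `2 * d_out` for the output-dimension scheme, so the saving grows with the ratio of layer width to LoRA rank.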
