Rethinking Inter-LoRA Orthogonality in Adapter Merging: Insights from Orthogonal Monte Carlo Dropout

Andi Zhang, Xuan Ding, Haofan Wang, Steven McDonagh, Samuel Kaski

TL;DR

This paper tackles the challenge of semantically composing multiple LoRA adapters without interference and questions the assumption that inter-LoRA orthogonality yields disentangled semantics. It introduces Orthogonal Monte Carlo Dropout, a dropout-based mechanism that enforces runtime orthogonality with negligible overhead and provides theoretical consistency guarantees, while revealing widespread LoRA redundancy. Through experiments on DreamBooth and community-sourced LoRAs, the authors show that orthogonality does not enhance semantic compositionality or image quality, challenging prior claims and highlighting a need to rethink adapter merging strategies. The work suggests balancing interference reduction with constructive interactions to achieve effective semantic fusion in practice.

Abstract

We propose Orthogonal Monte Carlo Dropout, a mechanism that enforces strict orthogonality when combining sparse semantic vectors without extra time complexity. Low-Rank Adaptation (LoRA), a popular fine-tuning method for large models, typically trains a module to represent a specific concept such as an object or a style. When multiple LoRA modules are merged, for example to generate an object in a particular style, their outputs (semantic vectors) may interfere with each other. Our method guarantees that merged LoRA modules remain orthogonal and thus free from direct interference. However, empirical analysis reveals that such orthogonality does not lead to the semantic disentanglement highlighted in prior work on compositional adaptation. This finding suggests that inter-LoRA orthogonality alone may be insufficient for achieving true semantic compositionality, prompting a re-examination of its role in adapter merging.

Paper Structure

This paper contains 26 sections, 3 theorems, 16 equations, 5 figures, 2 tables, 3 algorithms.

Key Result

Theorem 1

(Consistency) Let the random mask , then $m^{(j)}_i \sim \text{Ber}(1-p_j)$.

Figures (5)

  • Figure 1: Comparison of LoRA merging strategies. In this example, the input is $h$, and the outputs of the LoRA modules are $\Delta W_1 h$, $\Delta W_2 h$, and $\Delta W_3 h$. The masks $z^{(1)}, z^{(2)}, z^{(3)}$ are generated according to different merging strategies; in the figure, dark gray indicates 1 and white indicates 0. For simplicity, we set $w_i = 1$ and $p_i = 1/3$ for $i \in \{1,2,3\}$. (Top Left) Direct Merge simply sums the LoRA updates, which may lead to interference when updates are aligned. (Top Right) Monte Carlo Dropout merge (Algorithm \ref{alg:dropout} and Algorithm \ref{alg:merging}). (Bottom) Orthogonal Monte Carlo Dropout (Algorithm \ref{alg:orthdropout} and Algorithm \ref{alg:merging}), which enforces orthogonality among LoRA contributions and ensures no direct overlap.
  • Figure 2: Empirical study of Monte Carlo Dropout under different sampling dropout rates (0.1–0.9). All LoRAs are trained on the DreamBooth dataset with a training dropout rate of 0.05.
  • Figure 3: Empirical study of Monte Carlo Dropout under different sampling dropout rates (0.1–0.9) using LoRAs downloaded from the community. According to the metadata, the first row (cartoon character Kimoju) was trained with a dropout rate of 0.9, the second row (Keqing from Genshin Impact) was trained without dropout, and the third row (cloud style LoRA) does not disclose its training dropout rate.
  • Figure 4: Examples of DreamBooth LoRA merges. 'Direct' denotes Direct Merge, 'Dropout' denotes Monte Carlo Dropout, and 'Orthogonal' denotes Orthogonal Monte Carlo Dropout.
  • Figure 5: Merging LoRAs from the community. 'Direct' denotes Direct Merge, 'Dropout' denotes Monte Carlo Dropout, and 'Orthogonal' denotes Orthogonal Monte Carlo Dropout.
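
The three merging strategies compared in Figure 1 can be illustrated with a minimal NumPy sketch. This is not the paper's Algorithm listings, which are not reproduced on this page. It rests on several assumptions: the function names are hypothetical, each adapter's mask keeps a coordinate with probability `keep_probs[j]` (set to 1/3 here so that three disjoint masks are possible), the orthogonal variant assigns each coordinate to at most one adapter through a single categorical draw, and kept entries are rescaled by `1 / keep_prob` in the usual inverted-dropout manner.

```python
import numpy as np


def direct_merge(deltas, weights):
    """Direct Merge: weighted sum of the LoRA outputs, no masking."""
    return sum(w * d for w, d in zip(weights, deltas))


def independent_masks(rng, dim, keep_probs):
    """Monte Carlo Dropout: one independent Bernoulli mask per adapter.

    Masks are drawn separately, so two adapters may keep the same coordinate.
    """
    return np.stack([rng.random(dim) < q for q in keep_probs]).astype(float)


def orthogonal_masks(rng, dim, keep_probs):
    """Orthogonal variant (assumed construction): each coordinate is owned by
    at most one adapter, so the masks have disjoint supports.

    Requires sum(keep_probs) <= 1; leftover probability drops the coordinate.
    """
    keep_probs = np.asarray(keep_probs, dtype=float)
    if keep_probs.sum() > 1.0 + 1e-12:
        raise ValueError("keep probabilities must sum to at most 1")
    probs = np.append(keep_probs, max(0.0, 1.0 - keep_probs.sum()))
    probs = probs / probs.sum()
    owner = rng.choice(len(probs), size=dim, p=probs)  # last index means "dropped"
    return np.stack([owner == j for j in range(len(keep_probs))]).astype(float)


def masked_merge(deltas, weights, masks, keep_probs):
    """Masked weighted sum with inverted-dropout rescaling (an assumption)."""
    return sum(
        w * m * d / q for w, m, d, q in zip(weights, masks, deltas, keep_probs)
    )


rng = np.random.default_rng(0)
dim = 12
deltas = [rng.standard_normal(dim) for _ in range(3)]  # stand-ins for Delta W_i h
weights = [1.0, 1.0, 1.0]
keep_probs = [1 / 3, 1 / 3, 1 / 3]

orth = orthogonal_masks(rng, dim, keep_probs)
indep = independent_masks(rng, dim, keep_probs)
masked_outputs = [m * d for m, d in zip(orth, deltas)]

merged_direct = direct_merge(deltas, weights)
merged_orth = masked_merge(deltas, weights, orth, keep_probs)

# Marginal keep rate of each orthogonal mask, estimated on a long vector.
marginals = orthogonal_masks(rng, 200_000, keep_probs).mean(axis=1)
```

Under these assumptions the masked outputs have disjoint supports, so their pairwise inner products are exactly zero, while each mask entry is still marginally Bernoulli with the chosen keep probability. That marginal behavior is the property the Consistency theorem above refers to; the joint construction used here is only one way to obtain it.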

Theorems & Definitions (6)

  • Theorem 1
  • proof
  • Proposition 1
  • proof
  • Theorem 2
  • proof