Uni-LoRA: One Vector is All You Need
Kaiyang Li, Shaobo Han, Qing Su, Wei Li, Zhipeng Cai, Shihao Ji
TL;DR
Uni-LoRA reframes LoRA fine-tuning as a global subspace projection from a compact vector $\theta_d$ into a $D$-dimensional LoRA space via an isometric projection $P$, unifying existing LoRA variants as different choices of $P$. The key advance is a simple, fixed projection that partitions the full parameter set into $d$ groups, enabling global sharing with maximal efficiency while preserving geometry through isometry. The authors prove isometry for the proposed random-partition projection and demonstrate state-of-the-art parameter efficiency across GLUE, mathematical reasoning benchmarks, instruction tuning, and computer vision tasks, with orders-of-magnitude fewer trainable parameters than prior LoRA variants. They also show comparable or better performance with lower computational overhead than VB-LoRA and Fastfood-based approaches, supported by extensive ablations. Overall, Uni-LoRA offers a practical, scalable, and theoretically grounded path to highly parameter-efficient adaptation of large models without architectural changes.
Abstract
Low-Rank Adaptation (LoRA) has become the de facto parameter-efficient fine-tuning (PEFT) method for large language models (LLMs) by constraining weight updates to low-rank matrices. Recent works such as Tied-LoRA, VeRA, and VB-LoRA push efficiency further by introducing additional constraints to reduce the trainable parameter space. In this paper, we show that the parameter space reduction strategies employed by these LoRA variants can be formulated within a unified framework, Uni-LoRA, where the LoRA parameter space, flattened as a high-dimensional vector space $\mathbb{R}^D$, can be reconstructed through a projection from a subspace $\mathbb{R}^d$, with $d \ll D$. We demonstrate that the fundamental difference among various LoRA methods lies in the choice of the projection matrix, $P \in \mathbb{R}^{D \times d}$. Most existing LoRA variants rely on layer-wise or structure-specific projections that limit cross-layer parameter sharing, thereby compromising parameter efficiency. In light of this, we introduce an efficient and theoretically grounded projection matrix that is isometric, enabling global parameter sharing and reducing computational overhead. Furthermore, under the unified view of Uni-LoRA, this design requires only a single trainable vector to reconstruct LoRA parameters for the entire LLM, making Uni-LoRA both a unified framework and a "one-vector-only" solution. Extensive experiments on GLUE, mathematical reasoning, and instruction tuning benchmarks demonstrate that Uni-LoRA achieves state-of-the-art parameter efficiency while outperforming or matching prior approaches in predictive performance. Our code is available at https://github.com/KaiyangLi1992/Uni-LoRA.
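As an illustration of the projection idea, the following is a minimal sketch in plain Python and not the authors' implementation. It assumes that each of the $D$ flattened LoRA coordinates is assigned uniformly at random to one of $d$ groups, and that row $i$ of $P$ has a single nonzero entry $1/\sqrt{n_j}$ in the column $j$ of its group, where $n_j$ is the group size. This normalization is an assumption chosen here so that the columns of $P$ are orthonormal ($P^\top P = I_d$), which makes $\theta_D = P\theta_d$ norm-preserving. The function names `make_partition` and `project` are hypothetical.

```python
import math
import random
from collections import Counter


def make_partition(D, d, seed=0):
    """Assign each of the D flattened LoRA coordinates to one of d groups.

    The assignment is fixed (not trained); only the d-dimensional vector is.
    """
    rng = random.Random(seed)
    return [rng.randrange(d) for _ in range(D)]


def project(theta, groups):
    """Compute P @ theta without materializing the D x d matrix P.

    Assumed construction: row i of P is zero except in column groups[i],
    where it holds 1 / sqrt(n_j), with n_j the number of coordinates in
    group j. Every coordinate in a group therefore shares one scaled value.
    """
    sizes = Counter(groups)
    return [theta[j] / math.sqrt(sizes[j]) for j in groups]


# Toy sizes for illustration only; real models have far larger D.
D, d = 1000, 8
groups = make_partition(D, d)

rng = random.Random(1)
theta = [rng.gauss(0.0, 1.0) for _ in range(d)]  # the single trainable vector

lora_flat = project(theta, groups)  # flattened LoRA parameters, length D

# Isometry check: the projection preserves the Euclidean norm, provided
# every group is non-empty (overwhelmingly likely when D >> d).
norm_in = math.sqrt(sum(t * t for t in theta))
norm_out = math.sqrt(sum(w * w for w in lora_flat))
```

In this sketch the projection costs one lookup and one division per LoRA coordinate, and the only state that has to be stored besides `theta` is the random seed that regenerates `groups`. In an actual model, `lora_flat` would be sliced and reshaped into the per-layer low-rank matrices.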
