The Expressive Power of Low-Rank Adaptation
Yuchen Zeng, Kangwook Lee
TL;DR
This work provides the first theoretical analysis of LoRA's expressive power for frozen pretrained networks, establishing explicit rank thresholds for exact adaptation in fully connected neural networks (FNNs) and Transformer architectures. For FNNs, it shows that LoRA can exactly match a target function once the per-adapter rank reaches a threshold determined by the width of the frozen network and the ratio of the two networks' depths. For Transformer blocks, it shows that applying LoRA to the attention weights suffices when the adapter rank is roughly half the embedding size. The paper also introduces uniform and general model-partition strategies that reduce the required rank, derives approximation bounds for ranks below the threshold, and contrasts LoRA with final-layer tuning. Experiments on synthetic and real data validate the constructions and illustrate practical implications for designing LoRA adapters, including how the proximity between the frozen and target models and the treatment of biases affect expressive power. Overall, the results help explain why LoRA is so effective in practice and provide a theoretical foundation for adapter design choices.
Abstract
Low-Rank Adaptation (LoRA), a parameter-efficient fine-tuning method that applies low-rank updates to weight matrices, has emerged as a prevalent technique for fine-tuning pre-trained models such as large language models and diffusion models. Despite its considerable success in practice, the theoretical underpinnings of LoRA have remained largely unexplored. This paper takes the first step toward bridging this gap by theoretically analyzing the expressive power of LoRA. We prove that, for fully connected neural networks, LoRA can adapt any model $f$ to accurately represent any smaller target model $\overline{f}$ if LoRA-rank $\geq(\text{width of }f) \times \frac{\text{depth of }\overline{f}}{\text{depth of }f}$. We also quantify the approximation error when the LoRA-rank is lower than this threshold. For Transformer networks, we show that any model can be adapted to a target model of the same size with rank-$(\frac{\text{embedding size}}{2})$ LoRA adapters.
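To make the rank conditions above concrete, the following minimal Python sketch evaluates the two thresholds stated in the abstract and illustrates the simplest special case, a single linear layer, where a rank-$r$ update of the form $W + BA$ recovers the target exactly once $r$ reaches the rank of the weight difference. The function names (`fnn_rank_threshold`, `transformer_rank`, `best_rank_r_update`) and the example sizes are illustrative assumptions, not taken from the paper, and the truncated-SVD step is a standard low-rank approximation argument rather than the paper's construction for deep networks.

```python
import numpy as np


def fnn_rank_threshold(width: int, target_depth: int, frozen_depth: int) -> int:
    """Smallest integer rank with rank >= width * target_depth / frozen_depth."""
    # Integer ceiling division avoids floating-point rounding.
    return -(-(width * target_depth) // frozen_depth)


def transformer_rank(embedding_size: int) -> int:
    """Smallest integer rank with rank >= embedding_size / 2."""
    return -(-embedding_size // 2)


def best_rank_r_update(w_frozen: np.ndarray, w_target: np.ndarray, r: int):
    """Return factors (B, A) of the best rank-r update in Frobenius norm.

    Single-linear-layer illustration only (an assumption of this sketch):
    the truncated SVD of the difference W_target - W_frozen gives the
    rank-r matrix B @ A closest to that difference.
    """
    u, s, vt = np.linalg.svd(w_target - w_frozen, full_matrices=False)
    b = u[:, :r] * s[:r]  # scale the leading r left singular vectors
    a = vt[:r, :]         # leading r right singular vectors
    return b, a


# Hypothetical sizes: a frozen network of width 16 and depth 4.
rank_shallow_target = fnn_rank_threshold(16, 1, 4)  # target of depth 1
rank_deeper_target = fnn_rank_threshold(16, 2, 4)   # target of depth 2

# Single linear layer with width 8: a generic difference has full rank 8.
rng = np.random.default_rng(0)
w0 = rng.standard_normal((8, 8))
wt = rng.standard_normal((8, 8))

b_full, a_full = best_rank_r_update(w0, wt, r=8)
err_full = np.linalg.norm(wt - (w0 + b_full @ a_full))

b_low, a_low = best_rank_r_update(w0, wt, r=4)
err_low = np.linalg.norm(wt - (w0 + b_low @ a_low))
```

In this sketch `err_full` is zero up to floating-point error while `err_low` is strictly positive. That agrees with the stated threshold when both networks have depth 1, where the condition reduces to a rank of at least the width. The gain described in the abstract comes from the frozen model being deeper than the target, which lowers the rank required of each adapter.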
