Optimizing Fine-Tuning through Advanced Initialization Strategies for Low-Rank Adaptation
Yongfu Xue
TL;DR
The paper addresses the initialization bottleneck in LoRA-based parameter-efficient fine-tuning by introducing IniLoRA, which optimizes a low-rank decomposition BA to approximate the original weight matrix W0 and fixes the residual during training. It further explores two initialization variants, IniLoRA-α and IniLoRA-β, to expand the initialization design space. Through weight-approximation experiments and extensive NLU/NLG benchmarks, IniLoRA demonstrates consistent improvements over LoRA and other PEFT methods, with the α and β variants often yielding the best results. The work provides practical insights into how initialization strategy and weight-approximation quality affect convergence, scalability, and robustness in PEFT for large language models.
Abstract
The rapid development of parameter-efficient fine-tuning methods has noticeably improved the efficiency of adapting large language models. Among these, LoRA has gained widespread popularity due to its strong balance of effectiveness and parameter efficiency. However, LoRA relies on initializing two low-rank matrices whose product is zero, which limits its ability to effectively activate and leverage the original model weights, creating a potential bottleneck for optimal performance. To address this limitation, we propose IniLoRA, a novel initialization strategy that initializes the low-rank matrices to closely approximate the original model weights. Experimental results indicate that IniLoRA achieves better performance than LoRA across a range of models and tasks. Additionally, we introduce two variants, IniLoRA-α and IniLoRA-β, both leveraging distinct initialization methods to enhance performance further.
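The initialization idea described above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the loss (squared Frobenius error), the plain gradient-descent optimizer, the step size, and the names `fit_low_rank` and `frozen_residual` are all assumptions chosen for illustration, since this summary only states that the product BA is optimized to approximate W0 and that the residual is held fixed during training.

```python
import numpy as np


def fit_low_rank(W0, rank, steps=2000, lr=0.01, seed=0):
    """Fit B (d_out x rank) and A (rank x d_in) so that B @ A approximates W0.

    Assumption: minimizes 0.5 * ||B @ A - W0||_F^2 with plain gradient
    descent. The optimizer and loss used in the paper are not given here.
    """
    rng = np.random.default_rng(seed)
    d_out, d_in = W0.shape
    # Small random start so that neither factor is exactly zero.
    B = rng.normal(scale=0.1, size=(d_out, rank))
    A = rng.normal(scale=0.1, size=(rank, d_in))
    for _ in range(steps):
        E = B @ A - W0          # approximation error
        grad_B = E @ A.T        # gradient of the loss w.r.t. B
        grad_A = B.T @ E        # gradient of the loss w.r.t. A
        B -= lr * grad_B
        A -= lr * grad_A
    return B, A


def frozen_residual(W0, B, A):
    """Residual that stays fixed while B and A are fine-tuned (hypothetical helper)."""
    return W0 - B @ A


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    W0 = rng.normal(size=(16, 12))   # stand-in for a pretrained weight matrix
    B, A = fit_low_rank(W0, rank=4)
    R = frozen_residual(W0, B, A)
    # At the start of fine-tuning the effective weight R + B @ A equals W0,
    # but unlike standard LoRA the product B @ A is not zero.
    print("relative approximation error:", np.linalg.norm(R) / np.linalg.norm(W0))
    print("effective weight matches W0:", np.allclose(R + B @ A, W0))
```

In this sketch the model output is unchanged at initialization, as in standard LoRA, because the frozen residual absorbs whatever BA does not capture. The difference is that the trainable factors start from values that already carry information about W0.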
