LoR2C: Low-Rank Residual Connection Adaptation for Parameter-Efficient Fine-Tuning
Jiancheng Zhao, Xingda Yu, Yuxiang Zhang, Zhen Yang
TL;DR
LoR2C is a parameter-efficient fine-tuning method that adds low-rank residual connections to model layers. Each layer's update is captured by a product $W=BA$ of rank $r\ll d$, where $d$ is the model's hidden dimension, which reduces the number of tunable parameters and alleviates gradient vanishing. Three variants, ShareLoR2C, MergeLoR2C, and InjectLoR2C, trade off parameter count against performance through parameter sharing, dynamic merging, and rank-aware injection, guided by the Shape of Feature Space (SFS) metric. On GLUE with RoBERTa-base and on instruction tuning with LLaMA2-7B, the LoR2C variants achieve competitive or superior results with far fewer trainable parameters than full fine-tuning and many PEFT baselines, including strong performance on the BBH and HEval benchmarks. The work shows practical gains in efficiency and gradient propagation for Transformer fine-tuning under resource constraints, while noting the added complexity and the need for broader scalability validation.
Abstract
In recent years, pretrained large language models have demonstrated outstanding performance across various natural language processing tasks. However, full-parameter fine-tuning requires adjusting all model parameters, leading to immense computational resource demands. Although parameter-efficient fine-tuning methods such as LoRA have significantly reduced the number of trainable parameters, they still suffer from gradient vanishing and leave room for further parameter reduction. To address these issues, this paper proposes a novel parameter-efficient fine-tuning method called LoR2C (Low-Rank Residual Connection Adaptation). LoR2C introduces residual connections with low-rank matrices within the model layers, which not only reduces the number of fine-tuning parameters but also effectively alleviates the gradient vanishing problem. Additionally, this paper presents three optimization variants of LoR2C: ShareLoR2C, MergeLoR2C, and InjectLoR2C. These variants further improve parameter efficiency and model performance through parameter sharing, module merging, and injection mechanisms, respectively. Experimental results on multiple natural language understanding and natural language generation tasks demonstrate that LoR2C and its optimized variants significantly reduce parameter overhead while maintaining or even improving performance, outperforming existing mainstream parameter-efficient fine-tuning methods. Our code is publicly available at https://github.com/Oblivioniss/LoR2C.
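To make the idea of a low-rank residual connection concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes the shortcut $BAx$ is added to the output of a frozen layer, that $A$ has shape $r \times d$ and $B$ has shape $d \times r$, and that $B$ is zero-initialised as in LoRA. The helper names (`matvec`, `make_low_rank_pair`, `layer_with_low_rank_residual`, `trainable_params`) and the stand-in `frozen_layer` are hypothetical.

```python
import random


def matvec(M, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def make_low_rank_pair(d, r, seed=0):
    """Create A (r x d, small random values) and B (d x r, zeros).

    Zero-initialising B follows the LoRA convention and is an assumption
    here; it makes the shortcut contribute nothing before training.
    """
    rng = random.Random(seed)
    A = [[rng.gauss(0.0, 0.02) for _ in range(d)] for _ in range(r)]
    B = [[0.0] * r for _ in range(d)]
    return A, B


def layer_with_low_rank_residual(layer_fn, x, A, B):
    """Return layer_fn(x) + B(Ax): a frozen layer plus a low-rank shortcut."""
    shortcut = matvec(B, matvec(A, x))
    return [h + s for h, s in zip(layer_fn(x), shortcut)]


def trainable_params(d, r):
    """Parameters in the pair (A, B): r*d for A plus d*r for B."""
    return 2 * d * r


def frozen_layer(x):
    """Hypothetical stand-in for a frozen pretrained layer."""
    return [2.0 * v for v in x]
```

For a hidden size of $d = 768$ and rank $r = 4$, `trainable_params(768, 4)` gives 6144 parameters for the shortcut, compared with $768^2 = 589824$ for a dense $d \times d$ update. Because $B$ starts at zero in this sketch, `layer_with_low_rank_residual(frozen_layer, x, A, B)` initially returns the same output as `frozen_layer(x)`.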
