IncreLoRA: Incremental Parameter Allocation Method for Parameter-Efficient Fine-tuning

Feiyu Zhang, Liangzhi Li, Junhao Chen, Zhouqiang Jiang, Bowen Wang, Yiming Qian

TL;DR

This work tackles the high cost of fine-tuning large pre-trained language models by introducing IncreLoRA, an incremental parameter allocation method that adaptively adds trainable, low-rank updates per module based on importance scores. By reconstructing LoRA updates with a scalable, SVD-like form and employing advance learning to initialize new parameters, IncreLoRA achieves higher rank upper bounds without increasing training overhead and enhances stability via restart warmup. Extensive GLUE experiments show strong parameter efficiency, particularly in low-resource settings, with IncreLoRA outperforming or matching higher-budget baselines. Overall, the approach offers a practical, non-pruning PEFT alternative that improves efficiency and performance for downstream tasks.

Abstract

As pre-trained language models (PLMs) grow in size, fine-tuning all of a model's parameters becomes inefficient, especially when there are many downstream tasks, each of which incurs significant training and storage costs. Many parameter-efficient fine-tuning (PEFT) approaches have been proposed; among them, Low-Rank Adaptation (LoRA) is a representative approach that injects trainable rank decomposition matrices into every target module. Yet LoRA ignores the differing importance of parameters across modules. To address this problem, many works prune the parameters of LoRA. However, under limited training conditions, the upper bound on the rank of the pruned parameter matrix is still constrained by the preset values. We therefore propose IncreLoRA, an incremental parameter allocation method that adaptively adds trainable parameters during training based on the importance scores of each module. This approach differs from pruning in that it is not limited by the initial number of trainable parameters, and each parameter matrix has a higher rank upper bound for the same training overhead. We conduct extensive experiments on GLUE to demonstrate the effectiveness of IncreLoRA. The results show that our method achieves higher parameter efficiency, especially in low-resource settings, where it significantly outperforms the baselines. Our code is publicly available.
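
The contrast between incremental allocation and a fixed per-module rank can be illustrated with a toy sketch. This is not the authors' implementation: the function name `allocate_increment`, the starting rank of 1, the top-k selection rule, and the constant importance scores are all assumptions made for illustration, and the abstract does not say how importance scores are computed.

```python
from typing import Dict


def allocate_increment(
    ranks: Dict[str, int],
    scores: Dict[str, float],
    top_k: int,
    max_total_rank: int,
) -> Dict[str, int]:
    """Grow the top_k highest-scoring modules by one rank each.

    Hypothetical helper: stops early once the total rank budget is spent.
    """
    new_ranks = dict(ranks)
    budget_left = max_total_rank - sum(new_ranks.values())
    # Module names ordered from most to least important.
    ordered = sorted(scores, key=scores.get, reverse=True)
    for name in ordered[:top_k]:
        if budget_left <= 0:
            break
        new_ranks[name] += 1
        budget_left -= 1
    return new_ranks


# Toy setup (assumed values): three modules, total rank budget of 6.
ranks = {"query": 1, "key": 1, "value": 1}
scores = {"query": 0.2, "key": 0.1, "value": 0.9}  # held constant for simplicity
for _ in range(5):
    ranks = allocate_increment(ranks, scores, top_k=1, max_total_rank=6)
print(ranks)
```

With the same total budget of 6, a uniform allocation would give each of the three modules rank 2, whereas in this toy run the most important module can reach rank 4. In practice the scores would be re-estimated during training, so the extra rank would not all go to one module.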

Paper Structure (27 sections, 6 equations, 8 figures, 5 tables, 1 algorithm)

This paper contains 27 sections, 6 equations, 8 figures, 5 tables, 1 algorithm.

Figures (8)

  • Figure 1: Fine-tuning results for different methods and parameter budgets on the GLUE benchmark; all experiments are based on DeBERTaV3-base. We compare our method with BitFit, PAdapter, HAdapter, and AdaLoRA. The x-axis represents the number of parameters (M), and the y-axis represents the average score (Avg). Our approach achieves a better trade-off between efficiency and performance.
  • Figure 2: We illustrate the variations in rank of layer.10.attention.self.value_proj when fine-tuning DeBERTaV3-base on MNLI, using IncreLoRA and Pruning LoRA methods respectively.
  • Figure 3: An illustration of the low-rank adapters in our model, where $x$ is the input of each module and $h$ is the output of the module. The update matrix $\Delta W$ can be decomposed into $r$ components and an additional preparatory component, where $\lambda_s$ is fixed to 1e-5 and the remaining parameters are trainable.
  • Figure 4: Fine-tuning performance under different parameter budgets. The x-axis represents the average rank, and the y-axis is the evaluation metric for each dataset. The same learning rate is used across all parameter budgets.
  • Figure 5: Regularization loss with and without advance learning. To save space, the middle part that does not contain critical information is truncated.
  • ...and 3 more figures
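
The decomposition described in the Figure 3 caption can be sketched as a forward pass. This is a minimal illustration under stated assumptions, not the paper's code: it assumes each component is a scaled rank-1 term $\lambda_i u_i v_i^\top$ (consistent with the "SVD-like form" in the TL;DR), and the names `Component`, `delta_w_times_x`, `forward`, `u`, and `v` are hypothetical. Only the fixed value 1e-5 for $\lambda_s$ comes from the caption.

```python
from dataclasses import dataclass
from typing import List, Sequence

LAMBDA_S = 1e-5  # fixed scale of the preparatory component (Figure 3 caption)


@dataclass
class Component:
    lam: float      # scaling factor lambda_i
    u: List[float]  # output-side vector, length d_out (assumed name)
    v: List[float]  # input-side vector, length d_in (assumed name)


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def delta_w_times_x(components: List[Component], prep: Component,
                    x: Sequence[float]) -> List[float]:
    """Compute Delta W x as a sum of scaled rank-1 terms, without forming Delta W."""
    out = [0.0] * len(prep.u)
    for c in list(components) + [prep]:
        coeff = c.lam * dot(c.v, x)  # lambda_i * (v_i . x)
        for j, u_j in enumerate(c.u):
            out[j] += coeff * u_j
    return out


def forward(w0: List[List[float]], components: List[Component],
            prep: Component, x: Sequence[float]) -> List[float]:
    """h = W0 x + Delta W x, with W0 the frozen pre-trained weight."""
    base = [dot(row, x) for row in w0]
    update = delta_w_times_x(components, prep, x)
    return [b + d for b, d in zip(base, update)]


# Toy example: 2x2 identity as W0, one trained component, one preparatory component.
w0 = [[1.0, 0.0], [0.0, 1.0]]
trained = [Component(lam=2.0, u=[1.0, 0.0], v=[0.0, 1.0])]
prep = Component(lam=LAMBDA_S, u=[1.0, 1.0], v=[1.0, 1.0])
h = forward(w0, trained, prep, [1.0, 3.0])
print(h)  # approximately [7.00004, 3.00004]
```

Because $\lambda_s$ is tiny, the preparatory component barely changes the output while its vectors are still being trained, which matches the caption's description of it as a component held in reserve.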