AFLoRA: Adaptive Freezing of Low Rank Adaptation in Parameter Efficient Fine-Tuning of Large Models

Zeyu Liu; Souvik Kundu; Anni Li; Junrui Wan; Lianghao Jiang; Peter Anthony Beerel

TL;DR

AFLoRA (Adaptive Freezing of Low Rank Adaptation) is a novel Parameter-Efficient Fine-Tuning (PEFT) method that incrementally freezes the low-rank projection matrices during fine-tuning to reduce computation and alleviate over-fitting.

Abstract

We present a novel Parameter-Efficient Fine-Tuning (PEFT) method, dubbed Adaptive Freezing of Low Rank Adaptation (AFLoRA). Specifically, for each pre-trained frozen weight tensor, we add a parallel path of trainable low-rank matrices, namely a down-projection and an up-projection matrix, each of which is followed by a feature transformation vector. Based on a novel freezing score, we then incrementally freeze these projection matrices during fine-tuning to reduce computation and alleviate over-fitting. Our experimental results demonstrate that we can achieve state-of-the-art performance, with an average improvement of up to $0.85\%$ on the GLUE benchmark, while yielding up to $9.5\times$ fewer average trainable parameters. In terms of runtime, AFLoRA can yield up to $1.86\times$ improvement compared to similar PEFT alternatives. Besides the practical utility of our approach, we provide insights on the trainability requirements of LoRA paths at different modules and on the freezing schedule for the different projection matrices. Code will be released.
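To make the structure described in the abstract concrete, the following is a minimal NumPy sketch of one adapted layer: a frozen weight plus a parallel low-rank path whose down-projection and up-projection are each followed by a feature transformation vector. It is an illustration under stated assumptions, not the authors' implementation. The class name `AdaptiveLowRankPath`, the attribute names `A`, `B`, `d`, `b`, the initialization values, and the threshold rule in `maybe_freeze` are all hypothetical; the abstract does not define the freezing score, so the sketch simply takes a score as an input.

```python
import numpy as np


class AdaptiveLowRankPath:
    """Frozen weight W0 plus a parallel low-rank path (illustrative only)."""

    def __init__(self, w0, rank, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = w0.shape
        self.w0 = w0  # pre-trained weight, never updated
        # Assumed initialization: small random projections.
        self.A = rng.standard_normal((rank, d_in)) * 0.01   # down-projection
        self.B = rng.standard_normal((d_out, rank)) * 0.01  # up-projection
        self.d = np.ones(rank)    # feature transformation vector after A
        # Assumed zero init so the added path contributes nothing at the start.
        self.b = np.zeros(d_out)  # feature transformation vector after B
        self.trainable = {"A": True, "B": True, "d": True, "b": True}

    def forward(self, x):
        # x has shape (d_in,). The low-rank path runs in parallel with W0.
        h = self.d * (self.A @ x)
        return self.w0 @ x + self.b * (self.B @ h)

    def maybe_freeze(self, name, score, threshold):
        # Hypothetical rule: freeze a projection matrix once its externally
        # computed score reaches the threshold. Frozen matrices stay frozen.
        if name not in ("A", "B"):
            raise ValueError("only the projection matrices are frozen here")
        if score >= threshold:
            self.trainable[name] = False
        return self.trainable[name]

    def num_trainable(self):
        # Count parameters that would still receive gradient updates.
        return sum(getattr(self, n).size
                   for n, is_on in self.trainable.items() if is_on)


layer = AdaptiveLowRankPath(np.eye(4), rank=2)
x = np.arange(4.0)
y = layer.forward(x)
before = layer.num_trainable()
layer.maybe_freeze("A", score=0.9, threshold=0.5)
after = layer.num_trainable()
```

In this sketch, freezing a projection matrix removes it from the trainable count while the two small vectors stay trainable, which is the mechanism the abstract credits for the reduction in trainable parameters and computation.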

Paper Structure (12 sections, 5 equations, 5 figures, 8 tables)

Introduction
Related Works
Motivational Case Study
AFLoRA: Methodology
Experiments
Ablations and Discussions
Conclusions
Limitation
Appendix
Dataset
Hyperparameter configuration
Ablation study on whether to freeze the two projection matrices in the same layer simultaneously

Figures (5)

Figure 1: Schematic comparison of LoRA (Hu et al., 2021), ELoRA (Kopiczko et al., 2024), and AFLoRA, with their associated advantages and disadvantages in terms of various metrics. $r_L$ and $r_V$ represent the rank of the low-rank path used in the LoRA and ELoRA methods, respectively. FT and KU refer to fine-tuned weights and the Kaiming uniform initialization function, respectively.
Figure 2: Performance of ELoRA with two different ranks of the frozen projection matrices.
Figure 3: A comparison of various system performances between LoRA, ELoRA, and AFLoRA.
Figure 4: A comparison of performance outcomes utilizing three distinct freezing score methodologies.
Figure 5: Visualization of freezing iterations for each layer. 'out' and 'inter' refer to the second and the first MLP layer of the FFN, respectively. 'A' and 'B' represent the down-projection and up-projection matrix, respectively. The darker the color, the more iterations the matrix has to go through before freezing.
