Unlocking the Global Synergies in Low-Rank Adapters

Zixi Zhang; Cheng Zhang; Xitong Gao; Robert D. Mullins; George A. Constantinides; Yiren Zhao

TL;DR

HeteroLoRA is a lightweight search algorithm that uses zero-cost proxies to allocate a limited budget of LoRA trainable parameters across a model for better fine-tuned performance.

Abstract

Low-Rank Adaptation (LoRA) has become the de facto parameter-efficient fine-tuning technique for large language models. We present HeteroLoRA, a lightweight search algorithm that uses zero-cost proxies to allocate a limited budget of LoRA trainable parameters across the model for better fine-tuned performance. Beyond allocation in standard LoRA-adapted models, we also demonstrate the efficacy of HeteroLoRA in a more challenging search space that includes both LoRA modules and LoRA-adapted shortcut connections. Experiments show that HeteroLoRA improves model performance under the same parameter budget. For example, on MRPC, we see an improvement of 1.6% in accuracy with a similar trainable-parameter budget. We will open-source our algorithm once the paper is accepted.
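As a rough illustration of the budgeted allocation idea described in the abstract, the sketch below enables the highest-saliency LoRA modules until a trainable-parameter budget is exhausted. This is a minimal sketch under stated assumptions, not the authors' procedure: the greedy top-saliency rule, the function names (`lora_param_count`, `allocate_modules`), the module names, and the saliency scores are all hypothetical, and the scores stand in for whatever zero-cost proxy is actually used.

```python
from typing import Dict, List


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters of one LoRA module.

    A LoRA update is B @ A with A of shape (rank, d_in) and
    B of shape (d_out, rank), so the count is rank * (d_in + d_out).
    """
    return rank * (d_in + d_out)


def allocate_modules(
    saliency: Dict[str, float],
    cost: Dict[str, int],
    budget: int,
) -> List[str]:
    """Greedily enable modules in descending saliency order.

    ASSUMPTION: greedy selection is an illustrative stand-in for the
    search described in the paper, whose exact rule is not given here.
    """
    enabled: List[str] = []
    remaining = budget
    for name in sorted(saliency, key=lambda n: saliency[n], reverse=True):
        if cost[name] <= remaining:
            enabled.append(name)
            remaining -= cost[name]
    return enabled


# Hypothetical saliency scores for four candidate modules (made-up values).
saliency = {
    "layer0.query": 0.2,
    "layer0.value": 0.9,
    "layer1.query": 0.1,
    "layer1.value": 0.7,
}
# Every module adapts a 768x768 projection with rank 8 (assumed sizes).
cost = {name: lora_param_count(768, 768, 8) for name in saliency}
# Budget large enough for exactly two modules.
budget = 2 * lora_param_count(768, 768, 8)

enabled = allocate_modules(saliency, cost, budget)
print(enabled)
```

With these made-up scores, the two value projections are enabled and the query projections are left disabled, because only two modules fit within the budget.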

Paper Structure (27 sections, 12 equations, 6 figures, 10 tables)

Introduction
HeteroLoRA
Saliency Estimation using Zero-Cost Proxies
Static or Dynamic?
Static HeteroLoRA
Dynamic HeteroLoRA
Extending the Search Space with LoRA-Adapted Shortcut Connections
Experiments
Experiment Setup
Determining Proxy and Training Strategy of HeteroLoRA
Verifying Performance Gain of Shortcuts
Dynamic HeteroLoRA with Extended Search Space
Conclusions
Primary LoRA Experiments
Hyperparameters
...and 12 more sections

Figures (6)

Figure 1: An illustration of the HeteroLoRA search space in a Transformer model. Given a fixed number of trainable parameters, HeteroLoRA finds an efficient heterogeneous LoRA configuration for a model on a specific task. Each standard LoRA module and each LoRA-adapted shortcut can be enabled or disabled.
Figure 2: Training pipeline of (a) static HeteroLoRA, where LoRA modules are enabled/disabled at the start of training, and (b) dynamic HeteroLoRA, where LoRA modules are enabled/disabled periodically, e.g., every $1/5$ epoch, during the training.
Figure 3: LoRA-adapted shortcut architecture on two Transformer layers with post-layer-normalisation. Blue blocks are the residual shortcuts $s_\text{res1}$ and $s_\text{res2}$. Green blocks are the $s_\text{in}$ same-layer style "cross-layer" shortcuts. Red blocks are the $s_\text{cut}$ "cut-layer" style cross-layer shortcuts.
Figure 4: Frequency with which the linear projections in each model layer are enabled during dynamic HeteroLoRA training on MRPC, for LoRA and shortcut modules. A noticeable preference for value projections over query projections indicates that the value update generally contributes more to the fine-tuned performance.
Figure 5: Frequency with which the linear projections in each model layer are enabled during dynamic HeteroLoRA training on RTE with (a) $r=8$ and (b) $r=32$, for LoRA and shortcut modules. For each rank value, combined LoRA allocation is compared to separated LoRA allocation.
...and 1 more figures
