One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning

Arnav Chavan; Zhuang Liu; Deepak Gupta; Eric Xing; Zhiqiang Shen

TL;DR

GLoRA introduces Generalized LoRA, a One-for-All PEFT framework that jointly tunes weights and activations through a unified formulation with trainable support tensors. A structural re-parameterization enables zero extra inference cost, while an evolutionary search over a large per-layer supernet yields task-specific adapters without manual hyperparameter tuning. Across VTAB-1K, large language models, few-shot, and domain-generalization benchmarks, GLoRA consistently surpasses prior PEFT methods with fewer trainable parameters. The work demonstrates strong cross-domain applicability and practical efficiency for resource-constrained deployment, supported by analysis of layer-wise adaptation and capacity expansion via VC-dimension arguments.
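The per-layer evolutionary search mentioned above can be illustrated with a toy sketch. Everything named here is an assumption made for illustration: the choice set `CHOICES`, the `COST` table, and `toy_fitness` are hypothetical stand-ins, since this summary does not spell out the search space or the fitness signal. In a real setting the fitness would presumably be the validation accuracy of a sub-network sampled from the trained supernet.

```python
import random

# Hypothetical per-layer options for one support tensor (not taken from the paper).
CHOICES = ["none", "scalar", "vector", "lora"]
# Hypothetical parameter cost per option, used only by the toy fitness below.
COST = {"none": 0, "scalar": 1, "vector": 4, "lora": 8}


def toy_fitness(config):
    """Stand-in for validation accuracy of a sub-network from the supernet.

    Rewards richer adapters in later layers and penalizes total cost.
    """
    gain = sum((i + 1) * COST[c] for i, c in enumerate(config))
    penalty = 2.0 * sum(COST[c] for c in config)
    return gain - penalty


def mutate(config, rng, prob=0.2):
    """Resample each layer's choice independently with probability `prob`."""
    return [rng.choice(CHOICES) if rng.random() < prob else c for c in config]


def crossover(a, b, rng):
    """Take each layer's choice from one of the two parents at random."""
    return [rng.choice(pair) for pair in zip(a, b)]


def evolutionary_search(num_layers, fitness, generations=20, pop_size=16,
                        top_k=4, seed=0):
    rng = random.Random(seed)
    population = [[rng.choice(CHOICES) for _ in range(num_layers)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the best candidates unchanged (elitism), then refill.
        parents = sorted(population, key=fitness, reverse=True)[:top_k]
        children = []
        while len(parents) + len(children) < pop_size:
            if rng.random() < 0.5:
                children.append(mutate(rng.choice(parents), rng))
            else:
                children.append(crossover(*rng.sample(parents, 2), rng))
        population = parents + children
    return max(population, key=fitness)


best = evolutionary_search(num_layers=6, fitness=toy_fitness)
print(best, toy_fitness(best))
```

Because the top candidates are carried over unchanged each generation, the best fitness in the population never decreases. The returned list holds one adapter choice per layer, which is the sense in which the configuration is layer-wise.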

Abstract

We present Generalized LoRA (GLoRA), an advanced approach for universal parameter-efficient fine-tuning tasks. Enhancing Low-Rank Adaptation (LoRA), GLoRA employs a generalized prompt module to optimize pre-trained model weights and adjust intermediate activations, providing more flexibility and capability across diverse tasks and datasets. Moreover, GLoRA facilitates efficient parameter adaptation by employing a scalable, modular, layer-wise structure search that learns an individual adapter for each layer. Originating from a unified mathematical formulation, GLoRA exhibits strong transfer learning, few-shot learning, and domain generalization abilities, as it adapts to new tasks through not only weights but also additional dimensions such as activations. Comprehensive experiments demonstrate that GLoRA outperforms all previous methods on natural, specialized, and structured vision benchmarks, achieving superior accuracy with fewer parameters and computations. Applied to LLaMA-1 and LLaMA-2, the proposed method also shows considerable improvements over the original LoRA in the language domain. Furthermore, our structural re-parameterization design ensures that GLoRA incurs no extra inference cost, making it a practical solution for resource-limited applications. Code and models are available at: https://github.com/Arnav0400/ViT-Slim/tree/master/GLoRA.
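To make the "no extra inference cost" claim concrete, here is a minimal sketch of structural re-parameterization on a single linear layer. It assumes a simplified adapter form: a scalar weight scale, a low-rank weight update, and a bias shift. The names `scale`, `down`, `up`, and `bias_shift` are hypothetical, and this is not the paper's exact formulation, which is not reproduced in this summary. The point is only that adapter terms which are linear in the input can be folded into one weight matrix and one bias vector after training.

```python
def matvec(m, v):
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(a, b):
    """Matrix-matrix product for nested lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def adapted_forward(w0, b0, scale, down, up, bias_shift, x):
    """Training-time view: frozen layer plus separate adapter branches."""
    base = matvec(w0, x)                    # frozen W0 x
    low_rank = matvec(down, matvec(up, x))  # low-rank branch: down (up x)
    return [(1 + scale) * h + l + b + e
            for h, l, b, e in zip(base, low_rank, b0, bias_shift)]


def merge(w0, b0, scale, down, up, bias_shift):
    """Fold every adapter term into a single weight matrix and bias."""
    delta = matmul(down, up)
    w = [[(1 + scale) * w_ij + d_ij for w_ij, d_ij in zip(w_row, d_row)]
         for w_row, d_row in zip(w0, delta)]
    b = [b_i + e_i for b_i, e_i in zip(b0, bias_shift)]
    return w, b


# Illustrative values (assumed): 3 inputs, 2 outputs, rank-1 update.
w0 = [[1.0, 2.0, 0.0], [0.0, -1.0, 3.0]]
b0 = [0.5, -0.5]
scale = 0.5
down, up = [[1.0], [2.0]], [[0.5, 0.0, -1.0]]
bias_shift = [0.25, 0.0]
x = [1.0, 2.0, 3.0]

y_branches = adapted_forward(w0, b0, scale, down, up, bias_shift, x)
w_merged, b_merged = merge(w0, b0, scale, down, up, bias_shift)
y_merged = [h + b for h, b in zip(matvec(w_merged, x), b_merged)]
```

After merging, inference is a single matrix-vector product plus a bias, which is the same cost as the original layer. The two outputs agree because every adapter branch in this sketch is linear in `x`.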

Paper Structure (23 sections, 1 theorem, 15 equations, 6 figures, 7 tables)

Introduction
GLoRA
Previous Solutions with Limitations
A Unified Formulation of One-for-All
Structural Re-parameterization Design and Inference Efficiency Analysis
Evolutionary Search for Optimal Layer-wise Configurations
GLoRA with Higher Capacity
Experiments
Results on VTAB-1K
Results on Large Language Models
Few-shot Learning
Domain Generalization
Analysis and Discussion
Related Work
Conclusion
...and 8 more sections

Key Result

Theorem 1

Suppose $\mathbf d_\mathrm{vc}(\mathcal{H})$ is the VC dimension of any finite hypothesis $\mathcal{H}$. If $\mathcal{H}_\mathrm{i} \subseteq \mathcal{H}_\mathrm{uni}$,

Figures (6)

Figure 1: Schematic representation of a linear layer adapted with GLoRA.
Figure 2: Results on few-shot learning datasets. The baseline methods include Adapter, LoRA, VPT, and NOAH. GLoRA consistently performs better across five datasets and a varying number of training examples per class. More comparisons are provided in the appendix.
Figure 3: Distribution of GLoRA (0.86M) parameters across layer types on VTAB-1K. Q-K-V and Projection are linear layers in MHSA module and FC1 and FC2 are linear layers in MLP module.
Figure 4: Layerwise configuration of support tensors in GLoRA (0.86M) on VTAB-1K dataset.
Figure 5: Visualization of features from the SVHN dataset by t-SNE (van der Maaten and Hinton, 2008).
...and 1 more figures

Theorems & Definitions (1)

Theorem 1
