SuperLoRA: Parameter-Efficient Unified Adaptation of Multi-Layer Attention Modules

Xiangyu Chen, Jing Liu, Ye Wang, Pu Perry Wang, Matthew Brand, Guanghui Wang, Toshiaki Koike-Akino

TL;DR

SuperLoRA is a generalized framework that unifies and extends different LoRA variants, each of which can be realized as a particular hyper-parameter setting. It demonstrates superior performance on transfer learning tasks, especially in extremely few-parameter regimes.

Abstract

Low-rank adaptation (LoRA) and its variants are widely employed in fine-tuning large models, including large language models for natural language processing and diffusion models for computer vision. This paper proposes a generalized framework called SuperLoRA that unifies and extends different LoRA variants, each of which can be realized under a different hyper-parameter setting. By introducing grouping, folding, shuffling, projecting, and tensor factoring, SuperLoRA offers high flexibility compared with other LoRA variants and demonstrates superior performance on transfer learning tasks, especially in extremely few-parameter regimes.
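
To make the parameter-count motivation concrete, the following is a minimal sketch of two generic weight-update parameterizations: a low-rank product as in plain LoRA, and a Kronecker product of the kind LoKr uses. It is not SuperLoRA's grouping, folding, or projection pipeline, which the abstract does not specify. The function names, the 64x64 layer size, the rank, and the initialization scales are all assumptions chosen for illustration.

```python
import numpy as np

def lora_update(d_out, d_in, rank, rng):
    """Low-rank update delta_W = B @ A; returns the update and its parameter count."""
    B = np.zeros((d_out, rank))                    # zero init: the update starts at 0 (assumed convention)
    A = rng.standard_normal((rank, d_in)) * 0.01   # small random init (assumed scale)
    return B @ A, B.size + A.size

def kron_update(shape_c, shape_d, rng):
    """Kronecker update delta_W = kron(C, D); returns the update and its parameter count."""
    C = rng.standard_normal(shape_c) * 0.01
    D = rng.standard_normal(shape_d) * 0.01
    return np.kron(C, D), C.size + D.size

rng = np.random.default_rng(0)
d_out, d_in = 64, 64                               # hypothetical layer size

delta_lora, n_lora = lora_update(d_out, d_in, rank=4, rng=rng)
delta_kron, n_kron = kron_update((8, 8), (8, 8), rng)

print("dense update parameters:", d_out * d_in)            # 4096
print("low-rank update:", delta_lora.shape, n_lora)         # (64, 64) 512
print("Kronecker update:", delta_kron.shape, n_kron)        # (64, 64) 128
```

Both parameterizations produce a full 64x64 update while training far fewer values than the dense 4096, which is the kind of trade-off the few-parameter regime refers to.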

Paper Structure

This paper contains 39 sections, 5 equations, 30 figures, and 1 table.

Figures (30)

  • Figure 1: Schematic of SuperLoRA to fine-tune multi-layer attention modules at once with vectorizing, grouping, projection, folding, and factorization.
  • Figure 2: Hyperparameters and notation.
  • Figure 3: Required number of parameters.
  • Figure 4: Overview of (a) LoRA; (b) LoKr; (c) LoNKr (weight-wise version, ours).
  • Figure 5: Classification on CIFAR100 dataset with SuperLoRA.
  • ...and 25 more figures