SuperLoRA: Parameter-Efficient Unified Adaptation of Multi-Layer Attention Modules
Xiangyu Chen, Jing Liu, Ye Wang, Pu Perry Wang, Matthew Brand, Guanghui Wang, Toshiaki Koike-Akino
TL;DR
A generalized framework called SuperLoRA is proposed that unifies and extends different LoRA variants, which can be realized under different hyper-parameter settings, and demonstrates superior performance for transfer learning tasks especially in the extremely few-parameter regimes.
Abstract
Low-rank adaptation (LoRA) and its variants are widely employed in fine-tuning large models, including large language models for natural language processing and diffusion models for computer vision. This paper proposes a generalized framework called SuperLoRA that unifies and extends different LoRA variants, which can be realized under different hyper-parameter settings. Introducing grouping, folding, shuffling, projecting, and tensor factoring, SuperLoRA offers high flexibility compared with other LoRA variants and demonstrates superior performance for transfer learning tasks especially in the extremely few-parameter regimes.
