EigenLoRAx: Recycling Adapters to Find Principal Subspaces for Resource-Efficient Adaptation and Inference

Prakhar Kaushik, Ankit Vaidya, Shravan Chaudhari, Alan Yuille

TL;DR

EigenLoRAx recycles pretrained LoRA adapters to identify a shared, task-invariant principal subspace and adapts to new tasks by learning only lightweight coefficients over this subspace. It aggregates LoRA weights, extracts the top principal components (PCs), augments the subspace with pseudo-PCs when data is scarce, and keeps the subspace fixed during training, which drastically reduces the number of trainable parameters while maintaining comparable accuracy across vision and language tasks. Theoretical bounds and extensive experiments across ViT, GLUE, and diffusion-based generation validate the existence and practical utility of a shared weight subspace, delivering substantial speedups, memory savings, and improved edge-deployability for large models. This approach promises scalable, equitable access to large foundation models by lowering both compute and memory barriers for adaptation and inference.

Abstract

The rapid growth of large models has raised concerns about their environmental impact and equity in accessibility due to significant computational costs. Low-Rank Adapters (LoRA) offer a lightweight solution for finetuning large models, resulting in an abundance of publicly available adapters tailored to diverse domains. We ask: Can these pretrained adapters be leveraged to further streamline adaptation to new tasks while addressing these challenges? We introduce EigenLoRAx, a parameter-efficient finetuning method that recycles existing adapters to create a principal subspace aligned with their shared domain knowledge, which can be further augmented with orthogonal basis vectors in low-resource scenarios. This enables rapid adaptation to new tasks by learning only lightweight coefficients on the principal components of the subspace, eliminating the need to finetune entire adapters. EigenLoRAx requires significantly fewer parameters and memory, improving efficiency for both training and inference. Our method demonstrates strong performance across diverse domains and tasks, offering a scalable solution for edge-based applications, personalization, and equitable deployment of large models in resource-constrained environments.
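The recycling idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the synthetic adapters, and the choice to skip mean-centering are assumptions made for illustration, and the coefficients are obtained by a closed-form projection rather than by gradient-based training on a new task.

```python
import numpy as np

def extract_principal_components(lora_matrices, k):
    """Stack adapter matrices row-wise and return the top-k right singular
    vectors (orthonormal rows) together with all singular values.
    Hypothetical helper; no mean-centering is applied (an assumption)."""
    stacked = np.concatenate(lora_matrices, axis=0)  # (sum of ranks, dim)
    _, singular_values, vt = np.linalg.svd(stacked, full_matrices=False)
    return vt[:k], singular_values

def project_onto_subspace(adapter, components):
    """Coefficients of each adapter row in the orthonormal basis `components`."""
    return adapter @ components.T  # (rank, k)

# Synthetic setup (assumption): every adapter lies in one shared 4-dim subspace.
rng = np.random.default_rng(0)
dim, rank, shared_dim = 32, 8, 4
shared_basis = np.linalg.qr(rng.standard_normal((dim, shared_dim)))[0].T
adapters = [rng.standard_normal((rank, shared_dim)) @ shared_basis
            for _ in range(10)]

components, singular_values = extract_principal_components(adapters, k=shared_dim)

# A new adapter from the same domain is stored as a small coefficient matrix.
new_adapter = rng.standard_normal((rank, shared_dim)) @ shared_basis
coeffs = project_onto_subspace(new_adapter, components)
reconstruction = coeffs @ components

print("stored values:", coeffs.size, "instead of", new_adapter.size)
print("max reconstruction error:", np.abs(reconstruction - new_adapter).max())
```

In this toy setting the reconstruction is exact up to floating-point error because the new adapter lies entirely in the shared subspace; with real adapters the error would depend on how much of the adapter falls outside the retained components.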

Paper Structure

This paper contains 38 sections, 1 theorem, 16 equations, 13 figures, 15 tables, 1 algorithm.

Key Result

Theorem 3.6

For a task $t_{d+1}$, we assume a hypothesis $h\in\mathcal{H}_{W_{d+1}}$ expressed as $h(W_{d+1},X)=W_{d+1} X_{d+1}+W_0X_{d+1}+b$, where $W_{d+1}$ has rank $m$, $b$ is some constant, and $W_0$ represents the weights of a pretrained foundation model that is frozen during finetuning. We have $h where $\sigma_i$ are singular values of $\hat{W}$, $C$ is some constant such that ${W^*}_{d+1}=C\ha

Figures (13)

  • Figure 1: LoRA uses low-rank matrices for task-specific finetuning. We observe that LoRA adapters share a principal subspace across task domains. By recycling pretrained adapters, we extract task-invariant principal components, enabling efficient representation of both existing and future LoRAs using compact task-specific coefficients. This improves training speed, parameter efficiency, and memory usage. In low-resource settings, where pretrained adapters are scarce, we augment the subspace with randomly initialized components and orthogonalize them via the Gram-Schmidt process, so that they complement the extracted subspace without redundancy.
  • Figure 2: The top 16 components contain the most information from a total of 4000+ components for $\sim 500$ LoRAs ($A$ matrices from layer 1 of the Mistral-7B model, Lots of LoRAs; see the section on Lots of LoRAs).
  • Figure 3: Fast Convergence and Better Initialization. (Left) EigenLoRAx converges faster than LoRA and VeRA. EigenLoRAx achieves a speedup of up to $1.5\times$ over LoRA and up to $2\times$ over PiSSA. This experiment was carried out on the CoLA task of the GLUE benchmark.
  • Figure 4: LoRAs (top) vs. EigenLoRAx (bottom) in Text-to-Image generation. (Left) A single EigenLoRAx analytically reconstructs multiple LoRAs, significantly reducing memory (18$\times$ reduction) and compute costs. (Right) It efficiently learns new tasks with up to 100$\times$ fewer parameters than LoRA, maintaining similar visual quality. See the diffusion appendix for more examples.
  • Figure 5: Failure Case: EigenLoRAx may fail if an important component is missing from the initialized subspace, i.e., the shared subspace is incomplete. This can happen when there are too few initial adapters or when most of the adapters are of poor quality. For example, the model may lose the essential "mosaic" property when generating an image for the prompt "mosaic picture of a dog."
  • ...and 8 more figures
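
The Gram-Schmidt augmentation mentioned in the Figure 1 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's code: the function name, the seed handling, and the tolerance are hypothetical, and the extracted components are assumed to already have orthonormal rows (as they would if taken from an SVD).

```python
import numpy as np

def augment_with_random_components(components, n_extra, seed=0, tol=1e-8):
    """Append `n_extra` random directions to an orthonormal basis, making each
    new direction orthogonal to all existing ones via modified Gram-Schmidt.
    Hypothetical helper; `components` is assumed to have orthonormal rows."""
    rng = np.random.default_rng(seed)
    basis = [np.asarray(row, dtype=float) for row in components]
    dim = basis[0].shape[0]
    if len(basis) + n_extra > dim:
        raise ValueError("cannot have more orthonormal vectors than dimensions")
    added = 0
    while added < n_extra:
        v = rng.standard_normal(dim)
        for b in basis:
            v = v - (v @ b) * b  # remove the part of v already covered by b
        norm = np.linalg.norm(v)
        if norm < tol:           # degenerate draw, sample again
            continue
        basis.append(v / norm)
        added += 1
    return np.stack(basis)

# Toy example: 3 extracted components in a 16-dimensional space, plus 2 random ones.
rng = np.random.default_rng(1)
components = np.linalg.qr(rng.standard_normal((16, 3)))[0].T
augmented = augment_with_random_components(components, n_extra=2)

gram = augmented @ augmented.T
print("max deviation from identity:", np.abs(gram - np.eye(len(augmented))).max())
```

The original components are kept unchanged as the first rows, so coefficients learned over the extracted subspace remain valid; only the appended rows are new.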

Theorems & Definitions (7)

  • Definition 3.1: Task definition for LoRAs
  • Definition 3.2: Set of LoRA weights
  • Definition 3.3: Subspace spanned by LoRAs from a task domain $\mathcal{T}_{d}$
  • Definition 3.4: Shared principal subspace of LoRAs finetuned in domain $\mathcal{T}_d$
  • Definition 3.5: New related task $t_{d+1}$
  • Theorem 3.6
  • Proof