ElaLoRA: Elastic & Learnable Low-Rank Adaptation for Efficient Model Fine-Tuning

Huandong Chang; Zicheng Ma; Mingyuan Ma; Zhenting Qi; Andrew Sabot; Hong Jiang; H. T. Kung

TL;DR

ElaLoRA addresses a limitation of fixed-rank fine-tuning by introducing an adaptive low-rank adaptation framework that both prunes and expands LoRA ranks during training. It builds on an SVD-based parameterization $W = W^{(0)} + P \Lambda Q$, with rank changes guided by gradient-based importance scores $s(w) = \left| w \frac{\partial L}{\partial w} \right|$ that are smoothed with an exponential moving average (EMA), so that capacity is allocated to the most impactful layers. The method combines a three-phase learning schedule (warm-up, dynamic adjustment, stabilization) with a dynamic rank scheduler and is, to the authors' knowledge, the first to support both rank pruning and rank expansion during fine-tuning. Across the GLUE, XSum, and VTAB benchmarks, ElaLoRA outperforms fixed-rank LoRA and AdaLoRA under various parameter budgets, and the analyses show that high-rank allocations align with the most task-relevant components, which makes it a scalable, resource-efficient option for PEFT in constrained environments.
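The importance score and its EMA smoothing can be illustrated with a minimal sketch. It assumes the score is applied elementwise to the diagonal entries of $\Lambda$; the function names and the smoothing factor `beta` are hypothetical and not taken from the paper, whose exact aggregation over $P$, $\Lambda$, and $Q$ may differ.

```python
def importance(weights, grads):
    """Elementwise sensitivity s(w) = |w * dL/dw|."""
    return [abs(w * g) for w, g in zip(weights, grads)]


def ema_update(smoothed, current, beta=0.85):
    """Blend the previous smoothed scores with the current raw scores.

    beta is a hypothetical smoothing factor, not a value from the paper.
    """
    return [beta * s + (1.0 - beta) * c for s, c in zip(smoothed, current)]


# Toy data: three diagonal entries of Lambda and their gradients over two steps.
lam = [0.5, -0.2, 0.0]
grad_steps = [[0.4, 1.0, 3.0], [0.2, -1.0, 3.0]]

smoothed = [0.0, 0.0, 0.0]
for grads in grad_steps:
    smoothed = ema_update(smoothed, importance(lam, grads))

print(smoothed)
```

In this toy run the third entry keeps a score of zero despite its large gradient, because its weight is zero. The smoothing keeps a single noisy step from dominating the ranking.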

Abstract

Low-Rank Adaptation (LoRA) has become a widely adopted technique for fine-tuning large-scale pre-trained models with minimal parameter updates. However, existing methods rely on fixed ranks or focus solely on either rank pruning or expansion, failing to adapt ranks dynamically to match the importance of different layers during training. In this work, we propose ElaLoRA, an adaptive low-rank adaptation framework that dynamically prunes and expands ranks based on gradient-derived importance scores. To the best of our knowledge, ElaLoRA is the first method that enables both rank pruning and expansion during fine-tuning. Experiments across multiple benchmarks demonstrate that ElaLoRA consistently outperforms existing PEFT methods across different parameter budgets. Furthermore, our studies validate that layers receiving higher rank allocations contribute more significantly to model performance, providing theoretical justification for our adaptive strategy. By introducing a principled and adaptive rank allocation mechanism, ElaLoRA offers a scalable and efficient fine-tuning solution, particularly suited for resource-constrained environments.
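One way to picture pruning and expansion under a fixed parameter budget is the sketch below. It is only an illustration: the rule that prunes the `k` globally least important ranks and the rule that grows layers in order of mean importance are assumptions made here, and `reallocate_ranks`, the layer names, and the scores are hypothetical rather than the paper's scheduler.

```python
def reallocate_ranks(layer_scores, k):
    """Return per-layer rank counts after pruning k ranks and adding k back.

    layer_scores maps a layer name to the smoothed importance of each of its
    current ranks (each list is assumed non-empty). Both selection rules are
    illustrative assumptions, not the published algorithm.
    """
    ranks = {name: len(scores) for name, scores in layer_scores.items()}

    # Prune: drop the k least important ranks across all layers.
    flat = sorted(
        (score, name) for name, scores in layer_scores.items() for score in scores
    )
    for _, name in flat[:k]:
        ranks[name] -= 1

    # Expand: hand the freed ranks to layers in order of mean importance.
    by_mean = sorted(
        layer_scores,
        key=lambda name: sum(layer_scores[name]) / len(layer_scores[name]),
        reverse=True,
    )
    for i in range(k):
        ranks[by_mean[i % len(by_mean)]] += 1
    return ranks


scores = {"q": [0.9, 0.8], "k": [0.1, 0.05], "v": [0.5, 0.02]}
new_ranks = reallocate_ranks(scores, k=2)
print(new_ranks)
```

The total number of ranks is unchanged, so the parameter budget stays roughly constant while capacity moves toward the layers with higher scores.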
