ID-LoRA: Efficient Low-Rank Adaptation Inspired by Matrix Interpolative Decomposition

Xindian Ma; Rundong Kong; Peng Zhang; Ruoxiang Huang; Yongyu Jiang

TL;DR

In multi-task scenarios, ID-LoRA surpasses LoRA and its recent variants on both Code and MMLU tasks while requiring only 54% of the trainable parameters of conventional LoRA.

Abstract

LoRA has become a widely used Parameter-Efficient Fine-Tuning (PEFT) technique that lets Large Language Models (LLMs) adapt quickly to new tasks. However, as these models are scaled up, even the latest LoRA variants still introduce considerable overhead in trainable parameters. Conversely, aggressively lowering the rank to curb this overhead markedly degrades performance in complex multi-task settings. We propose ID-LoRA, a novel PEFT framework that breaks this trade-off. Its core innovation lies in extracting and reusing clustered parameter groups from the pretrained weight matrix. These groups are then used to form multiple low-rank components, all of which share a single initialized trainable low-rank matrix. This approach cuts the number of trainable parameters while keeping the model's capacity intact. We evaluate ID-LoRA on five diverse benchmarks: Mathematical Reasoning, Code Generation, MMLU, CommonsenseQA, and Safety Alignment. ID-LoRA outperforms both full fine-tuning and existing PEFT baselines (e.g., LoRA, DoRA, HydraLoRA) while using up to 46% fewer trainable parameters than standard LoRA. In multi-task scenarios, it surpasses LoRA and its recent variants (e.g., DoRA and HydraLoRA) on both Code and MMLU tasks while requiring only 54% of the trainable parameters of conventional LoRA.
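To make the parameter overhead concrete, the following is a minimal sketch of the standard LoRA baseline only, using the shapes given in the Figure 1 caption ($A \in \mathbb{R}^{r\times d}$ randomly initialized, $B \in \mathbb{R}^{d\times r}$ zero-initialized). The function names, the square $d \times d$ weight, and the initialization scale are illustrative assumptions, not taken from the paper, and the sketch does not implement ID-LoRA itself.

```python
import numpy as np


def lora_trainable_params(d: int, r: int) -> int:
    """Entries in A (r x d) plus B (d x r) for one d x d weight matrix."""
    return r * d + d * r


def init_lora(d: int, r: int, seed: int = 0):
    """Random A and zero B, so the initial update B @ A is exactly zero."""
    rng = np.random.default_rng(seed)
    A = rng.normal(scale=0.01, size=(r, d))  # scale is an illustrative choice
    B = np.zeros((d, r))
    return A, B


def lora_forward(x, W0, A, B):
    """h = W0 x + B A x for a batch of row vectors x with shape (n, d)."""
    return x @ W0.T + (x @ A.T) @ B.T


# Small demo with hypothetical sizes; W0 stands in for a frozen pretrained weight.
d, r = 64, 4
W0 = np.random.default_rng(1).normal(size=(d, d))
A, B = init_lora(d, r)
x = np.ones((2, d))
unchanged_at_init = np.allclose(lora_forward(x, W0, A, B), x @ W0.T)
```

Under these assumptions, one $4096 \times 4096$ weight adapted at rank 32 adds $2 \cdot 32 \cdot 4096 = 262{,}144$ trainable entries, and this count grows linearly in both $r$ and $d$. Because $B$ starts at zero, the adapted layer initially reproduces the pretrained layer exactly.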

Paper Structure (26 sections, 4 theorems, 32 equations, 4 figures, 8 tables)

Introduction
Related Work
Preliminaries
Low Rank Adaptation
Matrix Interpolative Decomposition
Method
Parameter Matrix Row Clustering
Rank Boosting
Theoretical Analysis of Clustered Decomposition
Experiments
Experimental Setup
Experiments on Single-Task
Experiments on Multi-Task
Efficiency Analysis
Ablation Analysis
...and 11 more sections

Key Result

Theorem 1

Under Assumptions 1 and 2, the clustering-aware decomposition achieves a reconstruction error bound that is tighter than that of global low-rank decomposition by a margin $\Delta = \sum_{l=1}^k \sum_{i \in \mathcal{C}_l} \|B(A_{l(i)} - A^{\text{global}}) \|_F^2 \geq 0$. The inequality becomes strict ($\Delta > 0$) when tasks exhibit clustering structure.
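The quantity $\Delta$ in Theorem 1 can be evaluated numerically once $B$, the per-cluster matrices $A_l$, a global matrix $A^{\text{global}}$, and the cluster assignment $l(i)$ are fixed. The sketch below does only that. The function name, the toy dimensions, and the choice of $A^{\text{global}}$ as the mean of the $A_l$ are assumptions made for illustration; the paper's own definitions of these objects are not reproduced in this summary.

```python
import numpy as np


def clustering_gap(B, A_clusters, A_global, assignments):
    """Delta = sum over items i of ||B (A_{l(i)} - A_global)||_F^2.

    `assignments[i]` is the cluster index l(i) of item i, so iterating over
    it is equivalent to the double sum over clusters and their members.
    """
    delta = 0.0
    for l in assignments:
        diff = B @ (A_clusters[l] - A_global)
        delta += np.linalg.norm(diff, "fro") ** 2
    return float(delta)


# Toy setup with hypothetical sizes: d = 8, r = 2, k = 3 clusters, 6 items.
rng = np.random.default_rng(0)
d, r, k = 8, 2, 3
B = rng.normal(size=(d, r))
A_clusters = [rng.normal(size=(r, d)) for _ in range(k)]
A_global = np.mean(A_clusters, axis=0)  # assumed stand-in for A^global
assignments = [0, 0, 1, 2, 2, 2]

gap_clustered = clustering_gap(B, A_clusters, A_global, assignments)

# With no clustering structure every A_l coincides with A_global and the gap vanishes.
gap_uniform = clustering_gap(B, [A_global] * k, A_global, assignments)
```

Each term is a squared Frobenius norm, so $\Delta$ is non-negative by construction, and it is zero exactly when $B(A_{l(i)} - A^{\text{global}})$ vanishes for every item.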

Figures (4)

Figure 1: Architectural Comparison and Parameter Efficiency: LoRA versus ID-LoRA. (a) LoRA requires training two low-rank matrices: randomly initialized $A \in \mathbb{R}^{r\times d}$ and zero-initialized $B \in \mathbb{R}^{d\times r}$. (b) ID-LoRA employs parameter clustering and rank boosting to generate multiple low-rank components that share a single $B$, thereby reducing trainable parameters. (c) Trainable parameters: ID-LoRA achieves $\sim 5\times$ compression versus LoRA at rank $32$ (right) and maintains superior scalability across model sizes (left).
Figure 2: A diagram of the ID-LoRA architecture.
Figure 3: The inference time and extra memory overhead of different adaptation methods under the same hyperparameter settings as the multi-task experiments on LLaMA-3-8B, tested on an A800 GPU.
Figure 4: Performance comparison between ID-LoRA and vanilla LoRA on four representative benchmarks (GSM8K, HumanEval, MMLU, and CommonsenseQA) using LLaMA3-8B as the backbone. Both methods vary the rank while keeping the number of trainable parameters approximately equal, and results are reported under few-shot or zero-shot settings. Detailed results are provided in the appendix section "Experimental Details of Parameter-Parity Performance".

Theorems & Definitions (10)

Definition 1: Pivot Sensitivity
Theorem 1: Clustering Reconstruction
Theorem 2: Cluster-Pivot Stability
Definition 2: Pivot Sensitivity
Definition 3: Task Parameter Distance
Definition 4: Cluster Low-Rank Decomposition
Definition 5: Multi-Task Reconstruction
Definition 6: CUR Decomposition
Theorem 3: Clustering Reconstruction
Theorem 4: Cluster-Pivot Stability
