On Catastrophic Forgetting in Low-Rank Decomposition-Based Parameter-Efficient Fine-Tuning

Muhammad Ahmad; Jingjing Zheng; Yankai Cao

TL;DR

This work presents an empirical study showing that forgetting is strongly influenced by the geometry and parameterization of the update subspace, highlighting update subspace design as a key factor in continual learning and offering practical guidance for selecting efficient adaptation strategies in sequential settings.

Abstract

Parameter-efficient fine-tuning (PEFT) based on low-rank decomposition, such as LoRA, has become a standard approach for adapting large pretrained models. However, its behavior in sequential learning, specifically with respect to catastrophic forgetting, remains insufficiently understood. In this work, we present an empirical study showing that forgetting is strongly influenced by the geometry and parameterization of the update subspace. Methods that restrict updates to small, shared matrix subspaces often suffer from task interference, whereas tensor-based decompositions (e.g., LoRETTA) mitigate forgetting by capturing richer structural information within ultra-compact parameter budgets, and structurally aligned parameterizations (e.g., WeGeFT) preserve pretrained representations. Our findings highlight update subspace design as a key factor in continual learning and offer practical guidance for selecting efficient adaptation strategies in sequential settings.

Paper Structure (9 sections, 3 equations, 2 figures, 2 tables)

Introduction
Methods
Low-rank decomposition-based Fine-tuning
Tasks
Metrics
Experimental Results and Analysis
Conclusion
Hyper-parameter Configuration
Related Work

Figures (2)

Figure 1: Performance comparison of different fine-tuning methods on ViT-Base under varying update space dimensions (rank).
Figure 2: Performance comparison of different fine-tuning methods on ViT-Large under varying update space dimensions (rank).
