Gated Integration of Low-Rank Adaptation for Continual Learning of Large Language Models

Yan-Shuo Liang; Jia-Rui Chen; Wu-Jun Li

TL;DR

GainLoRA tackles catastrophic forgetting in continual learning (CL) for large language models by expanding a new LoRA branch for each task and introducing per-task gating modules that control how the new and old branches are integrated. The method imposes orthogonal initialization and updating constraints on the gating modules to minimize the influence of the new branch on previously learned tasks, without requiring old-task data. Empirical results across diverse task sequences and model sizes show that GainLoRA consistently outperforms state-of-the-art LoRA-based CL methods with only modest overhead, and ablations confirm the necessity of the proposed constraints. This approach enables effective sequential task learning for LLMs in inference scenarios where task identifiers are unavailable, with practical implications for scalable continual adaptation of large models.

Abstract

Continual learning (CL), which requires the model to learn multiple tasks sequentially, is crucial for large language models (LLMs). Recently, low-rank adaptation (LoRA), one of the most representative parameter-efficient fine-tuning (PEFT) methods, has gained increasing attention in CL of LLMs. However, most existing LoRA-based CL methods expand a new LoRA branch to learn each new task and force the new and old LoRA branches to influence old tasks equally, potentially leading to forgetting. In this work, we propose a new method, called gated integration of low-rank adaptation (GainLoRA), for CL of LLMs. GainLoRA expands a new LoRA branch for each new task and introduces gating modules to integrate the new and old LoRA branches. Furthermore, GainLoRA leverages the new gating module to minimize the influence of the new LoRA branch on old tasks, effectively mitigating forgetting and improving the model's overall performance. Experimental results on CL benchmarks demonstrate that GainLoRA outperforms existing state-of-the-art methods.
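
The gated integration described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the helper names (`matvec`, `gate`, `forward`), the gate form |2·sigmoid(w·x) − 1|, and the toy weights are all assumptions chosen only to show how a gate whose weights are orthogonal to old-task inputs outputs zero on them, so the new branch leaves old-task outputs unchanged.

```python
import math

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def gate(w, x):
    """Hypothetical gate (an assumption, not taken from the source text).

    Returns a value in [0, 1) that is exactly 0 when w is orthogonal to x.
    """
    s = sum(wi * xi for wi, xi in zip(w, x))
    return abs(2.0 / (1.0 + math.exp(-s)) - 1.0)

def forward(W, branches, x):
    """Frozen weight W plus a gated sum of LoRA branches.

    Each branch is (A, B, w): A is r x d_in, B is d_out x r,
    and w holds the (hypothetical) gate weights for that branch.
    """
    y = matvec(W, x)
    for A, B, w in branches:
        g = gate(w, x)
        delta = matvec(B, matvec(A, x))  # low-rank update B(Ax)
        y = [yi + g * di for yi, di in zip(y, delta)]
    return y

# Toy setup: 2-dimensional inputs, one new rank-1 branch.
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen pretrained weight (identity here)
A = [[1.0, 1.0]]               # r = 1
B = [[1.0], [1.0]]
w_new = [0.0, 1.0]             # gate weights orthogonal to the old-task input

x_old = [1.0, 0.0]             # stands in for an old-task input
x_new = [0.0, 1.0]             # stands in for a new-task input

y_old = forward(W, [(A, B, w_new)], x_old)  # gate is 0: output equals W x_old
y_new = forward(W, [(A, B, w_new)], x_new)  # gate is positive: branch contributes
```

In this toy setting the new branch is silenced on `x_old` purely because the gate weights are orthogonal to it, which mirrors the role the orthogonality constraints play in the description above; how those constraints are initialized and maintained during training is not covered by this sketch.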
