Orthogonal Low-rank Adaptation in Lie Groups for Continual Learning of Large Language Models

Kefan Cao, Shuaicheng Wu

TL;DR

The paper tackles catastrophic forgetting in sequential fine-tuning of large language models (LLMs) by introducing OLieRA, a Lie-group–based continual learning framework that applies multiplicative updates $W \odot \exp(\Delta W)$ and enforces full-subspace orthogonality to preserve parameter geometry. OLieRA builds on low-rank adaptation (LoRA) while embedding updates in a Lie group and conducting them in the corresponding Lie algebra, enabling structure-preserving updates and interpretability via Hadamard-based operations and Taylor approximations of the exponential map. Empirical results show state-of-the-art performance on the Standard CL benchmark and competitive results on long-task sequences, with replay-free training and privacy-friendly inference, while Fisher-information analyses suggest updates that meaningfully interact with sensitive directions rather than avoiding them entirely. Overall, OLieRA provides a principled, efficient approach to continual learning for LLMs, unifying geometric parameter preservation with orthogonality constraints and low-rank updates to mitigate forgetting across many tasks.

Abstract

Large language models (LLMs) suffer from catastrophic forgetting in sequential multi-task learning. Existing parameter regularization methods (e.g., O-LoRA, N-LoRA) mitigate interference via low-rank subspace orthogonality, but their additive updates distort the intrinsic geometry of model parameters. We propose OLieRA, a Lie-group-based fine-tuning framework that preserves parameter geometry through multiplicative updates while enforcing orthogonality across task subspaces. OLieRA achieves state-of-the-art performance on the Standard CL benchmark and remains highly competitive under large task sequences. It further inherits the replay-free and task-ID-free inference properties of O-LoRA, establishing a principled paradigm for continual learning in LLMs.
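
To make the multiplicative update $W \odot \exp(\Delta W)$ from the TL;DR concrete, here is a minimal plain-Python sketch. It rests on assumptions that this summary does not confirm: that $\Delta W$ is a LoRA-style low-rank product $BA$, that the exponential is applied elementwise (matching the Hadamard product), and that the Taylor approximation is a simple truncation of the series. The helper names (`matmul`, `exp_taylor`, `multiplicative_update`) and the toy matrices are hypothetical, chosen only for illustration.

```python
import math

def matmul(B, A):
    """Plain-Python matrix product B @ A for nested lists."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(B))]

def exp_taylor(x, order=1):
    """Truncated Taylor series of exp(x): sum of x^n / n! for n = 0..order."""
    return sum(x ** n / math.factorial(n) for n in range(order + 1))

def multiplicative_update(W, B, A, order=None):
    """Return W ⊙ exp(B @ A), elementwise (assumed form of the update).

    If `order` is given, exp is replaced by its Taylor truncation.
    """
    delta = matmul(B, A)  # assumed low-rank update, Delta W = B A
    f = math.exp if order is None else (lambda x: exp_taylor(x, order))
    return [[w * f(d) for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# Toy frozen weights and a rank-1 update (illustrative values only).
W = [[0.5, -1.0], [2.0, 0.25]]
B = [[0.1], [-0.2]]   # shape 2 x 1
A = [[0.3, -0.4]]     # shape 1 x 2

exact = multiplicative_update(W, B, A)            # uses math.exp
approx = multiplicative_update(W, B, A, order=1)  # uses 1 + x
```

Two properties of this form are easy to check: a zero update leaves $W$ unchanged because $\exp(0) = 1$, and the exact exponential is always positive, so each weight keeps its sign. With the first-order truncation $1 + x$, the sign is only guaranteed to be kept while every entry of $\Delta W$ stays above $-1$.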

Paper Structure

This paper contains 31 sections, 20 equations, 2 figures, 8 tables.

Figures (2)

  • Figure 1: Illustration of the update mechanisms of OLieRA. While LoRA updates parameters, it overlooks the intrinsic parameter structure. In contrast, OLieRA incorporates a Lie group constraint in addition to orthogonality, thereby preserving the original parameter structure.
  • Figure 2: The OLieRA framework for continual learning in language models. First, human expertise is integrated and generalization is enhanced through instruction tuning. Second, building upon a frozen Transformer-based pre-trained language model, we leverage the exponential map of Lie groups to achieve "smooth manifold" parameter updates while preserving the intrinsic structure of the model. Then, we approximate the gradient subspace of each task using LoRA to efficiently control parameter overhead. For each sequentially arriving task, we incrementally learn a new LoRA module under both the Lie group constraint and an orthogonality constraint between the current task's LoRA subspace and all previous task subspaces, thereby reducing inter-task interference and preserving generalization to unseen tasks.
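
The orthogonality constraint described in the Figure 2 caption can be illustrated with a generic penalty: the sum of squared inner products between the direction vectors of the current task's LoRA module and those of every previous module. This is a sketch under assumptions, not the paper's exact loss, which this summary does not state. In particular, treating the rows of each low-rank factor as the subspace directions, and the names `cross_gram` and `orthogonality_penalty`, are hypothetical choices made for the example.

```python
def cross_gram(A_prev, A_cur):
    """Inner products between every row of A_prev and every row of A_cur."""
    return [[sum(p * c for p, c in zip(p_row, c_row)) for c_row in A_cur]
            for p_row in A_prev]

def orthogonality_penalty(prev_modules, A_cur):
    """Sum of squared inner products between the current directions
    and the directions of all previously learned modules.

    The value is zero exactly when the current row space is orthogonal
    to every previous row space.
    """
    return sum(g ** 2
               for A_prev in prev_modules
               for row in cross_gram(A_prev, A_cur)
               for g in row)

# Toy rank-1 factors for two earlier tasks (illustrative values only).
A_task1 = [[1.0, 0.0, 0.0]]
A_task2 = [[0.0, 1.0, 0.0]]

A_new_orth = [[0.0, 0.0, 1.0]]      # orthogonal to both earlier directions
A_new_overlap = [[1.0, 1.0, 0.0]]   # overlaps with both earlier directions

penalty_orth = orthogonality_penalty([A_task1, A_task2], A_new_orth)
penalty_overlap = orthogonality_penalty([A_task1, A_task2], A_new_overlap)
```

In a training loop, a term like this would typically be added to the task loss with a weighting coefficient, so that gradient descent pushes the new module away from directions already used by earlier tasks.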

Theorems & Definitions (1)

  • Definition 1: Non-parametric Conflict