FM-LoRA: Factorized Low-Rank Meta-Prompting for Continual Learning
Xiaobing Yu, Jin Yang, Xiao Wu, Peijie Qiu, Xiaofeng Liu
TL;DR
FM-LoRA addresses continual learning over sequential tasks by combining Factorized Low-Rank Adaptation (F-LoRA), a Dynamic Rank Selector (DRS), and Dynamic Meta-Prompting (DMP) to achieve rehearsal-free, parameter-efficient adaptation. F-LoRA confines updates to a shared low-rank subspace using global bases $A_{shared}, B_{shared}$ and task-specific matrices $M_t, N_t$, which reduces the per-task parameters to $2r^2$ and limits interference between tasks. DRS selects an effective rank $r_t$ for each task from a complexity measure $H(\mathcal{T}_t)$ via a Gumbel-Softmax, matching capacity to task difficulty and similarity. DMP adds a learnable prompt matrix $P$ to stabilize representations across tasks. The authors report that the combination balances stability and plasticity and reaches state-of-the-art performance on ImageNet-R, CIFAR100, CUB200, and DomainNet under class- and domain-incremental settings, especially as the task sequence grows longer. Because the approach limits memory growth and avoids data rehearsal, it is practical for continual learning with large pre-trained Transformers.
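The two mechanisms named above can be illustrated with a minimal, dependency-free sketch. It assumes the task update is composed as $A_{shared} M_t N_t B_{shared}$ with $A_{shared} \in \mathbb{R}^{d \times r}$, $M_t, N_t \in \mathbb{R}^{r \times r}$, and $B_{shared} \in \mathbb{R}^{r \times k}$, which is one reading consistent with the $2r^2$ per-task count. The helper names, the candidate ranks, the logits, and the temperature are hypothetical; in particular, how $H(\mathcal{T}_t)$ is turned into logits is not specified in this summary, so fixed placeholder values are used.

```python
import math
import random

def matmul(X, Y):
    """Plain-Python matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def flora_delta(A_shared, M_t, N_t, B_shared):
    """Compose the task update A_shared @ M_t @ N_t @ B_shared (assumed form)."""
    return matmul(matmul(A_shared, matmul(M_t, N_t)), B_shared)

def gumbel_softmax(logits, tau, rng):
    """Return relaxed one-hot probabilities over the candidates."""
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)           # guard against log(0)
        gumbel = -math.log(-math.log(u))       # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / tau)
    top = max(noisy)                           # subtract max for stability
    exps = [math.exp(v - top) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
d, k, r = 4, 3, 2                              # toy sizes, chosen for illustration

# Shared bases: in this sketch they are simply random and reused across tasks.
A_shared = [[rng.gauss(0, 1) for _ in range(r)] for _ in range(d)]
B_shared = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(r)]

# Task-specific r x r factors: the only per-task parameters in this sketch.
M_t = [[1.0, 0.0], [0.0, 1.0]]
N_t = [[0.5, 0.0], [0.0, 0.5]]
delta_W = flora_delta(A_shared, M_t, N_t, B_shared)   # d x k update

# Hypothetical rank selection: placeholder logits stand in for H(T_t).
candidate_ranks = [2, 4, 8]
logits = [0.1, 0.7, 0.2]
probs = gumbel_softmax(logits, tau=1.0, rng=rng)
r_t = candidate_ranks[probs.index(max(probs))]
```

In a trainable implementation the relaxed probabilities would typically be kept differentiable so that the selector can be learned jointly with the adapters; the hard `max` above is only for readability.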
Abstract
Adapting a pre-trained model continually to sequential tasks with different prediction class labels and domains, and ultimately learning a model that generalizes across diverse tasks, is a long-standing challenge. Continual learning (CL) has emerged as a promising approach to leverage pre-trained models (e.g., Transformers) for sequential tasks. However, many existing CL methods incrementally store additional learned structures, such as Low-Rank Adaptation (LoRA) adapters or prompts, and sometimes even preserve features from previous samples to maintain performance. This leads to unsustainable parameter growth and escalating storage costs as the number of tasks increases. Moreover, current approaches often lack task-similarity awareness, which further hinders the model's ability to adapt effectively to new tasks without interfering with previously acquired knowledge. To address these challenges, we propose FM-LoRA, a novel and efficient low-rank adaptation method that integrates both a dynamic rank selector (DRS) and dynamic meta-prompting (DMP). This framework allocates model capacity more effectively across tasks by leveraging a shared low-rank subspace critical for preserving knowledge, thereby avoiding continual parameter expansion. Extensive experiments on various CL benchmarks, including ImageNet-R, CIFAR100, and CUB200 for class-incremental learning (CIL) and DomainNet for domain-incremental learning (DIL), with a Transformer backbone demonstrate that FM-LoRA effectively mitigates catastrophic forgetting while delivering robust performance across a diverse range of tasks and domains.
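To make the parameter-growth argument concrete, the following back-of-the-envelope sketch compares two storage schemes for a single $d \times k$ weight matrix. It assumes a baseline that stores one independent rank-$r$ LoRA pair per task, costing $r(d + k)$ parameters each, against a shared-subspace scheme that stores the shared bases once and adds $2r^2$ parameters per task. The sizes $d = k = 768$ and $r = 8$ and the function names are illustrative assumptions, not values reported by the authors.

```python
def per_task_lora_total(d, k, r, num_tasks):
    """One independent rank-r LoRA pair (d x r and r x k) stored per task."""
    return num_tasks * r * (d + k)

def shared_subspace_total(d, k, r, num_tasks):
    """Shared bases stored once, plus two r x r matrices per task."""
    return r * (d + k) + num_tasks * 2 * r * r

d, k, r = 768, 768, 8   # assumed sizes for illustration only
for num_tasks in (1, 5, 20):
    print(
        num_tasks,
        per_task_lora_total(d, k, r, num_tasks),
        shared_subspace_total(d, k, r, num_tasks),
    )
```

Under these assumed sizes, 20 tasks cost 245,760 adapter parameters in the per-task scheme and 14,848 in the shared-subspace scheme, since only the $2r^2 = 128$ task-specific parameters grow with the number of tasks.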
