SPARC: Subspace-Aware Prompt Adaptation for Robust Continual Learning in LLMs
Dinithi Jayasuriya, Sina Tayebati, Davide Ettori, Ranganath Krishnan, Amit Ranjan Trivedi
TL;DR
SPARC addresses continual learning in LLMs by tuning soft prompts inside low-dimensional, PCA-derived subspaces. The framework measures subspace overlap with cosine similarity to decide when an existing prompt can be reused, and uses orthogonal initialization to isolate new tasks, all while keeping the base model frozen. Because only the soft prompts in a small subspace are trained, the method supports strong forward and backward transfer and remains compatible with LoRA. Empirical results in domain- and task-incremental settings show robust knowledge retention (up to 97% of prior knowledge retained) and competitive accuracy with minimal parameter updates (0.04% of the model's parameters with prompt tuning alone, 1% when combined with LoRA) on benchmarks such as SuperGLUE.
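The reuse-or-isolate decision described above can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' implementation: it assumes that overlap is scored as the mean cosine of the principal angles between two PCA bases, and that a new task's basis is made orthogonal to the old one by projecting out the old directions. The function names and the `threshold` parameter are hypothetical.

```python
import numpy as np

def pca_basis(X, k):
    """Top-k principal directions of X (n_samples, d) as a (d, k) orthonormal basis."""
    Xc = X - X.mean(axis=0, keepdims=True)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k].T

def subspace_overlap(U_a, U_b):
    """Mean cosine of the principal angles between two orthonormal bases.

    Assumption: this is one plausible cosine-based overlap score, in [0, 1].
    """
    cosines = np.linalg.svd(U_a.T @ U_b, compute_uv=False)
    return float(np.mean(cosines))

def orthogonalize_against(U_new, U_old):
    """Remove the U_old component from U_new, then re-orthonormalize.

    Assumes U_new is not (partly) contained in span(U_old); otherwise the
    residual loses rank and the result is not a valid basis.
    """
    residual = U_new - U_old @ (U_old.T @ U_new)
    Q, _ = np.linalg.qr(residual)
    return Q

def choose_basis(U_old, X_new, k, threshold):
    """Reuse the old subspace if the new task overlaps enough, else isolate it."""
    U_new = pca_basis(X_new, k)
    if subspace_overlap(U_old, U_new) >= threshold:
        return U_old, True
    return orthogonalize_against(U_new, U_old), False
```

Under these assumptions, a call such as `choose_basis(U_old, X_new, k=4, threshold=0.5)` returns the basis to train in together with a flag saying whether the earlier task's subspace was reused; the threshold value is a placeholder, not a number taken from the paper.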
Abstract
We propose SPARC, a lightweight continual learning framework for large language models (LLMs) that enables efficient task adaptation through prompt tuning in a lower-dimensional space. By leveraging principal component analysis (PCA), we identify a compact subspace of the training data. Optimizing prompts in this lower-dimensional space enhances training efficiency, as it focuses updates on the most relevant features while reducing computational overhead. Furthermore, since the model's internal structure remains unaltered, the extensive knowledge gained from pretraining is fully preserved, ensuring that previously learned information is not compromised during adaptation. Our method achieves high knowledge retention in both task-incremental and domain-incremental continual learning setups while fine-tuning only 0.04% of the model's parameters. Additionally, by integrating LoRA, we enhance adaptability to computational constraints, allowing for a tradeoff between accuracy and training cost. Experiments on the SuperGLUE benchmark demonstrate that our PCA-based prompt tuning combined with LoRA maintains full knowledge retention while improving accuracy, utilizing only 1% of the model's parameters. These results establish our approach as a scalable and resource-efficient solution for continual learning in LLMs.
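To make the idea of optimizing prompts in a lower-dimensional space concrete, the following minimal sketch parameterizes a soft prompt by coefficients in a frozen PCA basis. It is a toy illustration under stated assumptions, not the paper's code: the random `embeddings` matrix stands in for embedded training inputs, and the sizes `d_model`, `k`, and `prompt_len` are arbitrary placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, k, prompt_len = 64, 8, 10

# Stand-in for embedded training tokens (assumption: PCA is fit on such vectors).
embeddings = rng.standard_normal((500, d_model))

# Fit PCA once; the mean and basis stay frozen afterwards.
mean = embeddings.mean(axis=0)
_, _, Vt = np.linalg.svd(embeddings - mean, full_matrices=False)
U = Vt[:k].T                      # (d_model, k), orthonormal columns

# Only these coefficients would be trained.
Z = np.zeros((prompt_len, k))

def soft_prompt(Z):
    """Map low-dimensional coefficients to full-size prompt embeddings."""
    return Z @ U.T + mean         # (prompt_len, d_model)

prompt = soft_prompt(Z)
trainable = Z.size                # prompt_len * k
full = prompt_len * d_model       # size of an unconstrained soft prompt
print(f"trainable coefficients: {trainable} vs. full prompt: {full}")
```

In this parameterization a gradient step touches only `Z`, so the number of trained values scales with `k` rather than with the embedding width, and the base model's weights are never modified.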
