Choice of PEFT Technique in Continual Learning: Prompt Tuning is Not All You Need
Martin Wistuba, Prabhu Teja Sivaprasad, Lukas Balles, Giovanni Zappella
TL;DR
The paper questions the widespread use of prompt tuning as the PEFT method in continual learning (CL) with pretrained transformers and demonstrates that LoRA-based variants consistently outperform prompt-based approaches on domain- and class-incremental benchmarks. It introduces drop-in LoRA-based variants of two prominent CL methods, S-Prompts (S-LoRA) and Learning to Prompt (L2L), and shows substantial accuracy gains with minimal inference overhead. Based on experiments across diverse datasets (CORe50, DomainNet, Split CIFAR-100, Tiny ImageNet) and careful ablations, the authors argue that prompt tuning is not inherently suited to continual learning and advocate adopting LoRA for practical CL deployments. The work emphasizes the importance of ablating architectural choices, the potential for improved real-world impact, and the need to explore PEFT techniques beyond prompt tuning in CL.
Abstract
Recent Continual Learning (CL) methods have combined pretrained Transformers with prompt tuning, a parameter-efficient fine-tuning (PEFT) technique. We argue that the choice of prompt tuning in prior works was an undefended and unablated decision, which has been uncritically adopted by subsequent research, but warrants further research to understand its implications. In this paper, we conduct this research and find that the choice of prompt tuning as a PEFT method hurts the overall performance of the CL system. To illustrate this, we replace prompt tuning with LoRA in two state-of-the-art continual learning methods: Learning to Prompt and S-Prompts. These variants consistently achieve higher accuracy across a wide range of domain-incremental and class-incremental benchmarks, while being competitive in inference speed. Our work highlights a crucial argument: unexamined choices can hinder progress in the field, and rigorous ablations, such as of the choice of PEFT method, are required to drive meaningful adoption of CL techniques in real-world applications.
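To make the contrast between the two PEFT techniques concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the standard LoRA formulation (a frozen weight W plus a scaled low-rank update BA) and the basic form of prompt tuning (learnable vectors prepended to the token embeddings). All names (`lora_linear`, `prepend_prompt`, `alpha`) and all dimensions are illustrative assumptions.

```python
import numpy as np


def lora_linear(x, W, A, B, alpha=1.0):
    """Linear layer with a frozen weight W and a trainable low-rank update.

    W: (d_out, d_in) frozen pretrained weight
    A: (r, d_in), B: (d_out, r) trainable low-rank factors
    The update is scaled by alpha / r, as in the standard LoRA formulation.
    """
    r = A.shape[0]
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T


def prepend_prompt(tokens, prompt):
    """Prompt tuning in its basic form: prepend learnable vectors to the input."""
    return np.concatenate([prompt, tokens], axis=0)


rng = np.random.default_rng(0)
d_in, d_out, r = 8, 4, 2  # illustrative sizes only

W = rng.normal(size=(d_out, d_in))  # frozen
A = rng.normal(size=(r, d_in))      # trainable
B = np.zeros((d_out, r))            # trainable, zero-initialised so the update starts at zero
x = rng.normal(size=(3, d_in))      # 3 token embeddings

y_base = x @ W.T
y_lora = lora_linear(x, W, A, B)

# Prompt tuning changes the sequence length instead of the weights.
prompt = rng.normal(size=(2, d_in))  # 2 learnable prompt vectors
x_prompted = prepend_prompt(x, prompt)

# Trainable parameter counts for this single layer.
lora_params = A.size + B.size
full_params = W.size
```

In this sketch the low-rank update can be folded into W after training (W + (alpha / r) · BA), so the adapted layer processes the same number of tokens as the frozen one, whereas prompt tuning lengthens the input sequence.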
