PromptFusion: Decoupling Stability and Plasticity for Continual Learning

Haoran Chen, Zuxuan Wu, Xintong Han, Menglin Jia, Yu-Gang Jiang

TL;DR

PromptFusion tackles the stability-plasticity dilemma in continual learning by assigning stability and plasticity to two separate prompt-tuning modules: a Stabilizer (CoOp) and a Booster (VPT). Their outputs are fused with a learnable weight and a memory-aware mask. An efficiency variant, PromptFusion-Lite, uses Gumbel-Softmax to decide per input whether to run the Booster. The approach achieves state-of-the-art results on class-incremental and domain-incremental benchmarks, including strong gains on Split-Imagenet-R in memory-free settings, and PromptFusion-Lite reduces computation further. The work shows that each module's strengths are dataset-dependent, and that specializing separate modules for stability and plasticity and then combining them can yield robust, efficient continual learning across memory regimes.
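The fusion step mentioned above can be pictured with a small sketch. This summary does not give the actual fusion rule, so everything here is an assumption for illustration: the convex combination controlled by `lam` (standing in for the learnable weight), the binary `class_mask` (a simple stand-in for the memory-aware mask), and the function names themselves.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_logits(stab_logits, boost_logits, lam, class_mask):
    """Blend two per-class score vectors, then mask out excluded classes.

    stab_logits, boost_logits: per-class scores from the two modules.
    lam: scalar in [0, 1]; hypothetical stand-in for the learnable weight.
    class_mask: list of 0/1 flags; hypothetical stand-in for the
        memory-aware mask. At least one class must be kept.
    """
    fused = [(1.0 - lam) * s + lam * b
             for s, b in zip(stab_logits, boost_logits)]
    # Masked classes get -inf so softmax assigns them zero probability.
    masked = [f if keep else float("-inf")
              for f, keep in zip(fused, class_mask)]
    return softmax(masked)


# Toy scores, not taken from the paper.
probs = fuse_logits([2.0, 0.5, 0.1], [0.2, 1.5, 3.0],
                    lam=0.5, class_mask=[1, 1, 0])
```

In a real model `lam` would be a trainable parameter and the scores would be batched tensors; the plain-list version only shows the order of operations (blend, mask, normalize).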

Abstract

Current research on continual learning mainly focuses on relieving catastrophic forgetting, and most of its success comes at the cost of limiting performance on newly incoming tasks. Such a trade-off is referred to as the stability-plasticity dilemma and is a more general and challenging problem for continual learning. However, the inherent conflict between these two concepts makes it seemingly impossible to devise a satisfactory solution to both of them simultaneously. Therefore, we ask, "is it possible to divide them into two separate problems to conquer them independently?" To this end, we propose a prompt-tuning-based method termed PromptFusion to enable the decoupling of stability and plasticity. Specifically, PromptFusion consists of a carefully designed Stabilizer module that deals with catastrophic forgetting and a Booster module that learns new knowledge concurrently. Furthermore, to address the computational overhead brought by the additional architecture, we propose PromptFusion-Lite, which improves PromptFusion by dynamically determining whether to activate both modules for each input image. Extensive experiments show that both PromptFusion and PromptFusion-Lite achieve promising results on popular continual learning datasets for class-incremental and domain-incremental settings. Especially on Split-Imagenet-R, one of the most challenging datasets for class-incremental learning, our method can exceed state-of-the-art prompt-based methods by more than 5% in accuracy, with PromptFusion-Lite using 14.8% fewer computational resources than PromptFusion.
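The per-image decision in PromptFusion-Lite is described in the TL;DR as a Gumbel-Softmax gate. The following is a minimal sketch of that generic technique, not the authors' implementation: the two-way gate layout `[skip, run]`, the temperature `tau`, and the function names are all assumptions made for illustration.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Draw a relaxed one-hot sample over len(logits) options."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise by inverse transform: g = -log(-log(u)).
        # u is clamped away from 0 and 1 to keep both logs finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / tau)
    m = max(noisy)
    exps = [math.exp(v - m) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def use_booster(gate_logits, tau=1.0, rng=random):
    """Hard decision from gate_logits = [skip_score, run_score].

    Returns True when the 'run the Booster' option wins the sample.
    """
    skip_p, run_p = gumbel_softmax(gate_logits, tau, rng)
    return run_p > skip_p


rng = random.Random(0)          # seeded for repeatability
sample = gumbel_softmax([0.3, 0.7], tau=0.5, rng=rng)
decision = use_booster([0.3, 0.7], tau=0.5, rng=rng)
```

The relaxed sample is what makes the gate trainable with gradients in an autodiff framework; the hard comparison is what saves computation, because the second module is skipped whenever the gate returns `False`.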

Paper Structure

This paper contains 14 sections, 9 equations, 9 figures, 5 tables.

Figures (9)

  • Figure 1: Performance on the most recently learned task on the Split-Cifar100 dataset. Here, $R_{T,i}$ is the test accuracy on task $i$ after learning task $T$. As shown, the plasticity of L2P [learningtoprompt] and DualPrompt [wang2022dualprompt] is limited.
  • Figure 2: Stability comparison between CoOp and VPT. Accuracy curves for tasks $T_2$, $T_4$, and $T_5$ using the two modules are presented. All three graphs show a similar trend where performance degradation in CoOp is much smaller than that in VPT. This shows that CoOp is much more robust against forgetting than VPT.
  • Figure 3: KDE analysis on task $T_1$.
  • Figure 4: Plasticity comparison between CoOp and VPT, where they exhibit opposite patterns.
  • Figure 5: Comparison between VPT and VPT with weights from CoOp's visual encoder, i.e., CLIP.
  • ...and 4 more figures
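
The quantity $R_{T,i}$ from the Figure 1 caption can be stored as a lower-triangular accuracy table, from which the two views used in the figures follow directly: accuracy on the most recently learned task (plasticity, Figure 1) and the drop in an earlier task's accuracy over time (stability, Figure 2). The sketch below assumes 0-based task indices; the function names and the toy numbers are hypothetical and are not results from the paper.

```python
def plasticity_curve(R):
    """Accuracy on each task right after it was learned, i.e. R[t][t]."""
    return [R[t][t] for t in range(len(R))]


def degradation(R, i):
    """Drop in accuracy on task i between learning it and the final task."""
    return R[i][i] - R[-1][i]


# R[t][i] = test accuracy on task i after learning task t (i <= t).
# Toy values for illustration only.
R = [
    [0.90],
    [0.85, 0.80],
    [0.70, 0.75, 0.78],
]

recent = plasticity_curve(R)   # diagonal of the table
drop_first = degradation(R, 0)  # how much task 0 was forgotten
```

A method with limited plasticity shows low values on the diagonal, while a method with poor stability shows large values from `degradation`.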