Dynamic Integration of Task-Specific Adapters for Class Incremental Learning

Jiashuo Li, Shaokun Wang, Bo Qian, Yuhang He, Xing Wei, Qiang Wang, Yihong Gong

TL;DR

This paper tackles non-exemplar class incremental learning (NECIL), where models must continually acquire new classes without storing old exemplars, which amplifies forgetting and classifier drift. It introduces Dynamic Integration of task-specific Adapters (DIA), combining Task-Specific Adapter Integration (TSAI) for patch-level compositionality with Patch-Level Model Alignment (PDL and PFR) to preserve feature consistency and realign decision boundaries. Empirically, DIA achieves state-of-the-art results across four NECIL benchmarks with substantial reductions in computation and parameter cost, demonstrating robust knowledge retention and efficient adaptation to new tasks. The work offers a practical approach for privacy-preserving continual learning with scalable patch-level adaptation and principled alignment mechanisms.

Abstract

Non-Exemplar Class Incremental Learning (NECIL) enables models to continuously acquire new classes without retraining from scratch or storing old task exemplars, addressing privacy and storage issues. However, the absence of data from earlier tasks exacerbates catastrophic forgetting in NECIL. In this paper, we propose a novel framework called Dynamic Integration of task-specific Adapters (DIA), which comprises two key components: Task-Specific Adapter Integration (TSAI) and Patch-Level Model Alignment. TSAI boosts compositionality through a patch-level adapter integration strategy, which provides a more flexible compositional solution while maintaining low computation costs. Patch-Level Model Alignment maintains feature consistency and accurate decision boundaries via two specialized mechanisms: Patch-Level Distillation Loss (PDL) and Patch-Level Feature Reconstruction (PFR). Specifically, PDL preserves feature-level consistency between successive models by implementing a distillation loss based on the contributions of patch tokens to new class learning. PFR facilitates accurate classifier alignment by reconstructing old class features from previous tasks so that they adapt to new task knowledge. Extensive experiments validate the effectiveness of DIA, revealing significant improvements on benchmark datasets in the NECIL setting while maintaining an optimal balance between computational complexity and accuracy.

Paper Structure (14 sections, 12 equations, 4 figures, 5 tables)

Figures (4)

  • Figure 1: (a) Shared parameter space leads to task interference. (b) High computation costs characterize current PET-based methods. (c) Gaussian-based feature reconstruction may have a significant deviation from the actual feature distribution. $c_i$ indicates the actual feature distribution of old classes, $\hat{c}_i$ indicates the feature distribution generated by Gaussian sampling.
  • Figure 2: Illustration of DIA. (a) Task-Specific Adapter Integration. For incremental task $t$, we learn a task adapter $\mathcal{A}^{t,b}$ and a task signature vector $\bm{\tau}^{t,b} \in \mathcal{R}^d$ at each transformer block $b$. Each token is routed to the relevant task adapters through the signature vectors, processed independently, and combined into an integrated, task-informed output. (b) Patch-level Distillation. We promote feature drift in patch tokens that contribute to new task learning while penalizing those that do not, thus regulating the feature shift associated with old tasks. (c) Patch-level Feature Reconstruction. We identify patch tokens that are related to old class knowledge and integrate them with the old class prototype $\bm{\mu}_{k}^{t-1}$ to reconstruct old class feature $\hat{\bm{\mu}}_{k}^{t-1}$ aligned with new tasks.
  • Figure 3: Visualization of different feature reconstruction methods. Dark-colored triangles represent actual features, while light-colored circles denote pseudo features generated by feature reconstruction (FR) methods. The decision boundaries formed by these pseudo features are also visualized through the background color. $RE^{Gau}$ and $RE^{PFR}$ indicate regions with inaccurate decision boundaries. The features reconstructed by PFR closely align with the actual features of old classes, resulting in more accurate decision boundaries.
  • Figure 4: Ablation experiments on hyperparameters $\beta$ and $\lambda$ conducted on the CIFAR-100 dataset.
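
The token routing described in the Figure 2(a) caption can be illustrated with a small sketch. It is not the authors' implementation: the captions do not give the routing rule, so the code below assumes a softmax over cosine similarity between each token and the task signature vectors, a ReLU bottleneck adapter, and a residual connection. The names `integrate_adapters` and `make_adapter` are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm > 0 else 0.0


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    return [dot(row, vec) for row in matrix]


def make_adapter(down, up):
    """Bottleneck adapter (assumed form): up @ ReLU(down @ token)."""
    def adapter(token):
        hidden = [max(0.0, h) for h in matvec(down, token)]
        return matvec(up, hidden)
    return adapter


def integrate_adapters(tokens, signatures, adapters):
    """Mix task-adapter outputs per token, weighted by signature similarity.

    Assumption: weights are a softmax over cosine similarity between the
    token and each task signature; the paper's actual rule may differ.
    """
    outputs, all_weights = [], []
    for token in tokens:
        weights = softmax([cosine(token, sig) for sig in signatures])
        mixed = [0.0] * len(token)
        for w, adapter in zip(weights, adapters):
            delta = adapter(token)  # each adapter processes the token independently
            mixed = [m + w * d for m, d in zip(mixed, delta)]
        # Assumed residual connection around the integrated adapter output.
        outputs.append([t + m for t, m in zip(token, mixed)])
        all_weights.append(weights)
    return outputs, all_weights


# Toy setup: d = 2, two tasks, bottleneck width 1.
signatures = [[1.0, 0.0], [0.0, 1.0]]
adapters = [
    make_adapter(down=[[1.0, 0.0]], up=[[1.0], [0.0]]),  # task 1 adapter
    make_adapter(down=[[0.0, 1.0]], up=[[0.0], [1.0]]),  # task 2 adapter
]
tokens = [[1.0, 0.0], [0.0, 1.0]]

outputs, weights = integrate_adapters(tokens, signatures, adapters)
print(weights)
print(outputs)
```

In this toy setup each token aligns with exactly one signature, so it receives a larger weight for that task's adapter. Because the weights are computed per token rather than per image, different patches of the same input can draw on different task adapters, which is the patch-level compositionality the caption describes.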