Navigating Semantic Drift in Task-Agnostic Class-Incremental Learning

Fangwen Wu, Lechao Cheng, Shengeng Tang, Xiaofeng Zhu, Chaowei Fang, Dingwen Zhang, Meng Wang

TL;DR

This work tackles semantic drift in task-agnostic class-incremental learning by targeting two moments of the feature distribution: the class-conditional means and covariances. It introduces mean shift compensation and covariance calibration to align first- and second-order moments across evolving tasks, using a frozen pre-trained ViT backbone with task-specific LoRA modules and a patch-token self-distillation objective. The approach achieves state-of-the-art results on challenging domain-shift benchmarks (e.g., ImageNet-R, ImageNet-A) while remaining efficient through parameter-efficient fine-tuning (PEFT) and without requiring data replay. Overall, the method improves the stability-plasticity trade-off in continual learning, offering a scalable solution for robust, task-agnostic CIL.

Abstract

Class-incremental learning (CIL) seeks to enable a model to sequentially learn new classes while retaining knowledge of previously learned ones. Balancing flexibility and stability remains a significant challenge, particularly when the task ID is unknown. To address this, our study reveals that the gap in feature distribution between novel and existing tasks is primarily driven by differences in mean and covariance moments. Building on this insight, we propose a novel semantic drift calibration method that incorporates mean shift compensation and covariance calibration. Specifically, we calculate each class's mean by averaging its sample embeddings and estimate task shifts using weighted embedding changes based on their proximity to the previous mean, effectively capturing mean shifts for all learned classes with each new task. We also apply a Mahalanobis distance constraint for covariance calibration, aligning class-specific embedding covariances between the old and current networks to mitigate the covariance shift. Additionally, we integrate a feature-level self-distillation approach to enhance generalization. Comprehensive experiments on commonly used datasets demonstrate the effectiveness of our approach. The source code is available at https://github.com/fwu11/MACIL.git.
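
The mean shift compensation described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the Gaussian kernel used for the proximity weights, the bandwidth `sigma`, and the function names `class_mean` and `compensate_mean` are all assumptions made for illustration. The sketch only shows the general idea of averaging embeddings into a class mean and moving an old class mean by a proximity-weighted average of embedding changes.

```python
import numpy as np


def class_mean(embeddings):
    """Class mean as the average of that class's sample embeddings, shape (d,)."""
    return embeddings.mean(axis=0)


def compensate_mean(old_mean, emb_old, emb_new, sigma=1.0):
    """Shift a stored class mean using current-task samples (illustrative only).

    emb_old, emb_new: (n, d) embeddings of the same current-task samples
    under the previous network and the updated network.
    sigma: hypothetical kernel bandwidth; the weighting form is an assumption.
    """
    drift = emb_new - emb_old                      # per-sample embedding change
    sq_dist = np.sum((emb_old - old_mean) ** 2, axis=1)
    # Samples closer to the previous mean get larger weights. Subtracting the
    # minimum distance only rescales the weights and avoids underflow.
    weights = np.exp(-(sq_dist - sq_dist.min()) / (2.0 * sigma ** 2))
    shift = (weights[:, None] * drift).sum(axis=0) / weights.sum()
    return old_mean + shift
```

A useful sanity check on this sketch: when every sample moves by the same vector, the weights cancel in the normalization and the compensated mean moves by exactly that vector.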

Paper Structure

This paper contains 23 sections, 12 equations, 3 figures, 4 tables, 1 algorithm.

Figures (3)

  • Figure 1: As new tasks are learned, the categories from previous tasks continuously experience shifts in their means and variances in the latest updated model, referred to as (a) semantic drift. In this paper, we calibrate such semantic drift by applying (b) explicit mean shift compensation and implicit variance constraints.
  • Figure 2: Illustration of our method at task $t$. The feature extractor at task $t$ uses a frozen pre-trained ViT backbone with learnable LoRA modules. The output class tokens (yellow) are passed through a classifier to compute the classification loss $\mathcal{L}_{\text{cls}}$, and the mean and covariance of each class are stored for each session. During training, class tokens (yellow and blue) are used to align class distributions via a covariance calibration loss $\mathcal{L}_{\text{cov}}$. Patch tokens from network $t$ (yellow) distill knowledge from network $t-1$ (blue) through a distillation loss $\mathcal{L}_{\text{distill}}$. After training, the class means are updated using a mean shift compensation module, and the classifier heads are retrained with the calibrated statistics.
  • Figure 3: The performance of each learning session under different settings of ImageNet-R and CIFAR100. All methods are initialized with ViT-B/16-IN21k. These curves are plotted by calculating the average performance across three different seeds.
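
The covariance calibration loss $\mathcal{L}_{\text{cov}}$ named in the Figure 2 caption builds on the Mahalanobis distance between class-token embeddings and stored class statistics. The NumPy sketch below is a minimal illustration, not the paper's loss: the penalty form (mean absolute difference between the distances measured under the old and current networks), the regularizer `eps`, and the names `mahalanobis_sq` and `covariance_calibration_penalty` are assumptions introduced here.

```python
import numpy as np


def mahalanobis_sq(x, mean, cov, eps=1e-4):
    """Squared Mahalanobis distance of each row of x (n, d) to a class.

    eps adds a small ridge to the covariance so the linear solve stays
    well conditioned; its value is an illustrative choice.
    """
    dim = x.shape[1]
    cov_reg = cov + eps * np.eye(dim)
    diff = x - mean
    # Solve cov_reg @ z = diff^T instead of forming the explicit inverse.
    solved = np.linalg.solve(cov_reg, diff.T).T
    return np.sum(diff * solved, axis=1)


def covariance_calibration_penalty(emb_old, emb_new, mean, cov):
    """Hypothetical penalty: how much per-sample Mahalanobis distances to the
    stored class statistics change between the old and current networks."""
    d_old = mahalanobis_sq(emb_old, mean, cov)
    d_new = mahalanobis_sq(emb_new, mean, cov)
    return float(np.mean(np.abs(d_new - d_old)))
```

With an identity covariance and `eps=0.0`, `mahalanobis_sq` reduces to the squared Euclidean distance, and the penalty is zero whenever the old and current embeddings coincide.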