TaE: Task-aware Expandable Representation for Long Tail Class Incremental Learning

Linjie Li, Zhenyu Wu, Jiaming Liu, Yang Ji

TL;DR

A novel Task-aware Expandable (TaE) framework is introduced, dynamically allocating and updating task-specific trainable parameters to learn diverse representations from each incremental task while resisting forgetting through the majority of frozen model parameters.

Abstract

Class-incremental learning aims to develop deep learning models that acquire new knowledge while retaining previously learned information. Most methods assume a balanced data distribution for each task and overlook the long-tailed distributions found in real-world data. Long-Tailed Class-Incremental Learning was introduced to address this setting: it trains on data in which head classes have more samples than tail classes. Existing methods mainly preserve representative samples from previous classes to combat catastrophic forgetting. More recently, dynamic network algorithms freeze old network structures and expand new ones, achieving significant performance gains. However, under the long-tail problem, merely extending predetermined blocks can lead to miscalibrated predictions, while expanding the entire backbone causes memory size to explode. To address these issues, we introduce a novel Task-aware Expandable (TaE) framework that dynamically allocates and updates task-specific trainable parameters to learn diverse representations from each incremental task, while resisting forgetting by keeping the majority of model parameters frozen. To further encourage class-specific feature representations, we develop a Centroid-Enhanced (CEd) method to guide the update of these task-aware parameters. CEd adaptively allocates feature space to every class by adjusting the distances between intra- and inter-class features, and it can be extended to all "training from scratch" algorithms. Extensive experiments demonstrate that TaE achieves state-of-the-art performance.
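
The idea of updating only a small set of task-specific parameters while the rest stay frozen can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that "sensitivity" means the largest accumulated gradient magnitude, it models expansion as a per-parameter update mask rather than newly allocated weights, and the names `select_sensitive` and `masked_sgd_step` are hypothetical.

```python
from typing import Dict, List

Params = Dict[str, List[float]]
Mask = Dict[str, List[bool]]


def select_sensitive(grad_sums: Params, p: float) -> Mask:
    """Mark the top p percent of parameters by accumulated |gradient|.

    Assumption: sensitivity is the gradient magnitude summed over a few
    passes on the new task's data. Ties at the threshold are all kept,
    so slightly more than p percent may be selected. The input must
    contain at least one parameter.
    """
    magnitudes = sorted(
        (abs(g) for values in grad_sums.values() for g in values), reverse=True
    )
    k = max(1, int(len(magnitudes) * p / 100))
    threshold = magnitudes[k - 1]
    return {
        name: [abs(g) >= threshold for g in values]
        for name, values in grad_sums.items()
    }


def masked_sgd_step(params: Params, grads: Params, mask: Mask, lr: float) -> Params:
    """Apply one SGD step to selected parameters; frozen ones are returned unchanged."""
    return {
        name: [
            w - lr * g if keep else w
            for w, g, keep in zip(params[name], grads[name], mask[name])
        ]
        for name in params
    }
```

Because the mask is computed once per task, the frozen parameters are never touched by later updates, which is the mechanism the abstract credits for resisting forgetting. A real model would hold tensors rather than Python lists, but the selection and masking logic would have the same shape.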

Paper Structure

This paper contains 22 sections, 6 equations, 7 figures, 6 tables, and 1 algorithm.

Figures (7)

  • Figure 1: Illustration of the Conventional and Shuffled distributions.
  • Figure 2: Parameter-performance comparison of different dynamic network methods on ImageNet100-LT B0-10steps. TaE expands only a few trainable parameters yet exceeds the SOTA CIL method.
  • Figure 3: Overview of TaE. When training on the $n$-th task, the training set undergoes multiple forward passes through the model from the previous round. The $p\%$ of parameters with the most sensitive gradients are selected and expanded, and the others are frozen. Learning is guided by the Centroid-Enhanced (CEd) method: (a) each class learns a centroid that is updated throughout training; (b) centroids of different classes remain distant from each other; (c) features within the same class converge towards their centroid. This process is governed by the $\mathcal{L}_{max-min}$ loss. A re-weighting strategy is used to train the classifier.
  • Figure 4: The performance for each step. The model is trained on CIFAR100-LT $\rho$ = 0.1 with three dataset protocols.
  • Figure 5: The performance for each step. The model is trained on ImageNet100-LT B0-10steps $\rho$ = 0.1.
  • ...and 2 more figures
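
The three CEd properties listed in the Figure 3 caption (per-class centroids, distant centroids, compact classes) can be sketched as a simple loss. This is an illustrative stand-in, not the paper's $\mathcal{L}_{max-min}$: the moving-average centroid update, the hinge `margin`, the Euclidean distance, and the function names are all assumptions made for this example.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dist(a: Vector, b: Vector) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def update_centroid(centroid: Vector, feature: Vector, momentum: float = 0.9) -> List[float]:
    """(a) Move a class centroid towards a new feature (assumed moving average)."""
    return [momentum * c + (1.0 - momentum) * f for c, f in zip(centroid, feature)]


def centroid_loss(
    features: Sequence[Vector],
    labels: Sequence[int],
    centroids: Dict[int, Vector],
    margin: float = 1.0,
) -> float:
    """Sum of an intra-class pull term and an inter-class push term."""
    # (c) Pull: mean distance between each feature and its own class centroid.
    intra = sum(dist(f, centroids[y]) for f, y in zip(features, labels)) / len(features)
    # (b) Push: penalise centroid pairs that are closer than `margin`.
    classes = sorted(centroids)
    pairs = [(a, b) for i, a in enumerate(classes) for b in classes[i + 1:]]
    inter = sum(
        max(0.0, margin - dist(centroids[a], centroids[b])) for a, b in pairs
    ) / max(1, len(pairs))
    return intra + inter
```

Minimising the first term tightens each class around its centroid, and minimising the second keeps centroids apart, so tail classes with few samples still receive their own region of feature space. How the paper balances the two terms is not stated in this excerpt, so the equal weighting here is only a placeholder.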