GACL: Exemplar-Free Generalized Analytic Continual Learning

Huiping Zhuang, Yizhu Chen, Di Fang, Run He, Kai Tong, Hongxin Wei, Ziqian Zeng, Cen Chen

TL;DR

The GACL adopts analytic learning (a gradient-free training technique) and delivers an analytical solution to the GCIL scenario. In doing so it attains a weight-invariant property, a rare yet valuable property that supports an equivalence between incremental learning and joint training.

Abstract

Class incremental learning (CIL) trains a network on sequential tasks with separated categories in each task but suffers from catastrophic forgetting, where models quickly lose previously learned knowledge when acquiring new tasks. The generalized CIL (GCIL) aims to address the CIL problem in a more real-world scenario, where incoming data have mixed data categories and unknown sample size distribution. Existing attempts for the GCIL either have poor performance or invade data privacy by saving exemplars. In this paper, we propose a new exemplar-free GCIL technique named generalized analytic continual learning (GACL). The GACL adopts analytic learning (a gradient-free training technique) and delivers an analytical (i.e., closed-form) solution to the GCIL scenario. This solution is derived via decomposing the incoming data into exposed and unexposed classes, thereby attaining a weight-invariant property, a rare yet valuable property supporting an equivalence between incremental learning and its joint training. Such an equivalence is crucial in GCIL settings as data distributions among different tasks no longer pose challenges to adopting our GACL. Theoretically, this equivalence property is validated through matrix analysis tools. Empirically, we conduct extensive experiments where, compared with existing GCIL methods, our GACL exhibits a consistently leading performance across various datasets and GCIL settings. Source code is available at https://github.com/CHEN-YIZHU/GACL.
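The equivalence between incremental and joint training claimed above can be illustrated with plain regularized least squares. The sketch below is not the GACL update from the paper (it does not decompose data into exposed and unexposed classes, and it fixes the number of output columns in advance). It is a generic recursive ridge regression, assuming a linear classifier on fixed features, one-hot targets, and a regularization term `gamma`. The function names `joint_ridge` and `recursive_ridge` are hypothetical, and the matrix `R` here only plays a role analogous to an autocorrelation memory matrix.

```python
import numpy as np

def joint_ridge(X, Y, gamma):
    """Closed-form ridge solution computed on all data at once."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + gamma * np.eye(d), X.T @ Y)

def recursive_ridge(batches, d, c, gamma):
    """Process batches one at a time, keeping only W and R (no stored samples)."""
    R = np.eye(d) / gamma      # inverse of the regularized autocorrelation matrix
    W = np.zeros((d, c))
    for X, Y in batches:
        # Woodbury identity: fold X^T X into the stored inverse R.
        K = np.linalg.solve(np.eye(X.shape[0]) + X @ R @ X.T, X @ R)
        R = R - R @ X.T @ K
        # Correct the weights using the residual on the new batch only.
        W = W + R @ X.T @ (Y - X @ W)
    return W

# Synthetic data (assumption: random features stand in for backbone outputs).
rng = np.random.default_rng(0)
n, d, c = 60, 8, 4
X = rng.normal(size=(n, d))
labels = rng.integers(0, c, size=n)
Y = np.eye(c)[labels]          # one-hot targets
gamma = 1.0

W_joint = joint_ridge(X, Y, gamma)

# Four "tasks" of unequal size, each with a mix of classes.
splits = np.split(np.arange(n), [5, 25, 32])
batches = [(X[idx], Y[idx]) for idx in splits]
W_rec = recursive_ridge(batches, d, c, gamma)

print(np.allclose(W_rec, W_joint))
```

Because the recursive weights match the joint solution up to floating-point error, how the samples are partitioned across tasks does not change the final classifier in this simplified setting, which is the intuition behind a weight-invariant property.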

Paper Structure

This paper contains 32 sections, 1 theorem, 36 equations, 6 figures, 6 tables, and 1 algorithm.

Key Result

Theorem 3.1

Let $\bm{\hat{W}}_{\textup{FCN}}^{(k)}$ be the optimal estimate for the objective in \ref{eq_loss_k}, computed with all the training data from task $1$ to task $k$. Then $\bm{\hat{W}}_{\textup{FCN}}^{(k)}$ is equivalent to its recursive form where

Figures (6)

  • Figure 1: An overview of our proposed GACL. (a) Labels of the exposed classes and the unexposed classes are extracted separately in each GCIL task (see the definition in Section \ref{section:split}). (b) A frozen pre-trained ViT and a buffer layer extract features from the inputs. (c) The recursively updated formulation of the GACL has two key components. The first, $\bm{\hat{W}}_{\textup{unexposed}}^{(k)}$, takes in the contribution of unexposed class data (see \ref{eq_w_unexposed}). The second, $\bm{\hat{W}}_{\textup{ECLG}}^{(k)}$, is contributed by the ECLG module (see \ref{eq_w_eclg}) and reflects the gain of exposed class data on the seen categories. The recursion is carried through the GCIL process by the autocorrelation memory matrix $\bm{R}$.
  • Figure 2: The task-wise accuracy $\mathcal{A}_k$ of the GACL compared with EFCIL methods (top) and replay-based methods (bottom) on benchmark datasets with $K = 5$.
  • Figure 3: A configuration example of the Si-Blurry setting.
  • Figure 4: GPU memory consumption in GB with a batch size of 64; replay-based methods use a memory size of 2000.
  • Figure 5: Real-time accuracy of the GACL on all benchmark datasets with various values of the regularization term $\gamma$.
  • ...and 1 more figure
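Panel (a) of Figure 1 describes separating each task's labels into exposed and unexposed classes. A minimal sketch of that bookkeeping follows, assuming "exposed" means a class that already appeared in an earlier task and "unexposed" means a class seen for the first time. The helper name `split_exposed_unexposed` and the toy label lists are hypothetical and not taken from the paper's code.

```python
def split_exposed_unexposed(labels, seen):
    """Partition the sample indices of one task by whether the class was seen before."""
    exposed = [i for i, y in enumerate(labels) if y in seen]
    unexposed = [i for i, y in enumerate(labels) if y not in seen]
    return exposed, unexposed

# Toy stream: class labels per task, with classes recurring across tasks.
tasks = [[0, 1, 0], [1, 2, 2, 0], [3, 2]]

seen = set()       # classes encountered in earlier tasks
history = []       # (exposed indices, unexposed indices) for each task
for labels in tasks:
    exposed, unexposed = split_exposed_unexposed(labels, seen)
    history.append((exposed, unexposed))
    seen.update(labels)   # classes of this task count as exposed from now on

for k, (exposed, unexposed) in enumerate(history, start=1):
    print(f"task {k}: exposed={exposed}, unexposed={unexposed}")
```

In the first task every sample is unexposed because nothing has been seen yet; later tasks contain a mix, which is the situation the two components in panel (c) are meant to handle.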

Theorems & Definitions (3)

  • Theorem 3.1
  • proof
  • proof