Continual Learning by Three-Phase Consolidation

Davide Maltoni, Lorenzo Pellegrini

TL;DR

This work tackles catastrophic forgetting in class-incremental continual learning by introducing Three-Phase Consolidation (TPC), a lightweight, replay-friendly scheme combining online bias correction and gradient masking across three learning phases. The method bootstraps novel classes, then jointly updates all seen classes with selective masking, and finally consolidates by balancing all classes, without relying on heavy distillation or complex replay strategies. Empirical results on Core50, ImageNet1000, CIFAR100, and NICv2 show that TPC achieves competitive or superior accuracy with favorable efficiency compared to strong baselines like AR1, BiC, and DER++. The approach is implemented in the Avalanche framework, facilitating reproducibility and practical deployment in real-world continual learning tasks. Overall, TPC demonstrates that a simple, well-structured three-phase update can effectively mitigate bias and forgetting in long sequences with limited data, making it appealing for dynamic environments and privacy-constrained settings.

Abstract

This paper introduces TPC (Three-Phase Consolidation), a simple but effective approach to continually learning new classes (and/or new instances of known classes) while controlling forgetting of previous knowledge. Each experience (a.k.a. task) is learned in three phases characterized by different rules and learning dynamics. The phases aim to remove the class-bias problem caused by class imbalance and to limit gradient-based corrections so that underrepresented classes are not forgotten. Several experiments on complex datasets demonstrate its accuracy and efficiency advantages over competitive existing approaches. The algorithm and all the results presented in this paper are fully reproducible thanks to its publication on the Avalanche open framework for continual learning.
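
The idea of limiting gradient-based corrections can be illustrated with a generic gradient mask on the output logits. The sketch below is not the authors' implementation and does not reproduce the TPC phase rules; the function names, the class indices, and the choice of which classes are active are all assumptions made for illustration only.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def masked_logit_grad(logits, target, active_classes):
    # Cross-entropy gradient w.r.t. the logits (softmax minus one-hot),
    # with the components of classes outside `active_classes` set to zero.
    probs = softmax(logits)
    grad = [p - (1.0 if k == target else 0.0) for k, p in enumerate(probs)]
    return [g if k in active_classes else 0.0 for k, g in enumerate(grad)]

# Hypothetical setting: classes 0-2 were learned earlier, class 3 is new.
logits = [2.0, 1.0, 0.5, 0.0]

# Assumed "new classes only" mask: old-class outputs receive no correction.
grad_new_only = masked_logit_grad(logits, target=3, active_classes={3})

# No masking: every class output is corrected.
grad_all = masked_logit_grad(logits, target=3, active_classes={0, 1, 2, 3})
```

With the mask restricted to the new class, the outputs of previously learned classes are left untouched by this sample, whereas the unmasked gradient pushes all of them down in favor of the new class.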

Paper Structure

This paper contains 21 sections, 12 equations, 2 figures, and 6 tables.

Figures (2)

  • Figure 1: Accuracy on the four benchmarks. Each curve is the average over 3 runs (with different class orderings); the run used for hyperparameter tuning is excluded. As is common practice in existing studies, accuracy on the Core50 benchmarks is measured on a fixed test set containing all classes, including those not yet learned (see core50), while on ImageNet1000 100/10-10 and Cifar100 11/50-5 it is measured on an incremental test set including only the classes seen so far. The dashed line, which represents a sort of upper bound, denotes the accuracy of the same model jointly trained on all the data.
  • Figure 2: Accuracy on Core50 41/10-1. The TPC baseline (violet) is compared with TPC and AR1 running without replay memory. AMCA is reported in the legend.