Active Continual Learning: On Balancing Knowledge Retention and Learnability

Thuy-Trang Vu; Shahram Khadivi; Mahsa Ghorbanali; Dinh Phung; Gholamreza Haffari

TL;DR

This work formalizes active continual learning (ACL), studying how selective annotation under a per-task annotation budget interacts with continual learning (CL) in the domain-, class-, and task-incremental settings. It introduces two key metrics, the forgetting rate $FR$ and the learning-curve area $LCA$, and proposes the forgetting-learning profile to diagnose when ACL methods trade rapid learning for knowledge retention. Through experiments on image and text classification, the authors show that ACL with partial labeling can match or approach full-data CL in domain-IL, with experience replay (ER) often providing robust retention, while in class-IL the methods fall short of the ideal region of the profile (quick learning with a low forgetting rate). The findings offer practical guidelines for selecting AL and CL algorithms in ACL and highlight the need for strategies that balance old-task retention with fast acquisition of new knowledge, especially outside the domain-IL scenario.

Abstract

Acquiring new knowledge without forgetting what has been learned in a sequence of tasks is the central focus of continual learning (CL). While tasks arrive sequentially, the training data are often prepared and annotated independently, leading to the CL of incoming supervised learning tasks. This paper considers the under-explored problem of active continual learning (ACL) for a sequence of active learning (AL) tasks, where each incoming task includes a pool of unlabelled data and an annotation budget. We investigate the effectiveness of, and the interplay between, several AL and CL algorithms in the domain-, class-, and task-incremental scenarios. Our experiments reveal a trade-off between two contrasting goals: not forgetting old knowledge (the goal of CL) and quickly learning new knowledge (the goal of AL). While conditioning the AL query strategy on the annotations collected for the previous tasks leads to improved task performance in domain- and task-incremental learning, our proposed forgetting-learning profile suggests a gap in balancing the effects of AL and CL in the class-incremental scenario.
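As a rough illustration of what "a pool of unlabelled data and an annotation budget" means for a single task, the following minimal sketch ranks pool examples by predictive entropy and selects the most uncertain ones for annotation. It is not the authors' implementation: the function names `predictive_entropy` and `entropy_query` and the example probabilities are hypothetical, and entropy stands in for whichever AL query strategy is actually used.

```python
import math
from typing import List, Sequence


def predictive_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_query(pool_probs: List[Sequence[float]], budget: int) -> List[int]:
    """Return the indices of the `budget` most uncertain pool examples."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: predictive_entropy(pool_probs[i]),
        reverse=True,  # highest entropy (most uncertain) first
    )
    return ranked[:budget]


# Hypothetical model predictions for a pool of four unlabelled examples.
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.34, 0.33, 0.33],  # near-uniform prediction, highest entropy
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
]

# With an annotation budget of 2, only the two most uncertain examples
# would be sent for labelling before training on the current task.
selected = entropy_query(pool_probs, budget=2)
print(selected)
```

In an ACL setting this selection step would be repeated for each incoming task, with the model that produces `pool_probs` carrying over what it learned from earlier tasks.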

Paper Structure (45 sections, 1 equation, 16 figures, 9 tables, 1 algorithm)

Introduction
Knowledge Retention and Quick Learnability
Knowledge Retention in Continual Learning
Continual Learning
Forgetting Rate
Quick Learnability in Active Learning
Quick Learnability
Active Continual Learning
Forgetting-Learning Profile
Experiments
Dataset
Continual Learning Methods
Active Learning Methods
Ceiling Methods
Model Training and Hyperparameters
...and 30 more sections

Figures (16)

Figure 1: (a) Active continual learning (ACL) annotates training data sequentially by conditioning on the learning dynamics of the current model (red arrow). (b) The forgetting-learning profile visualizes the balance between old knowledge retention and new knowledge learning in ACL. An ideal ACL method should lie in the region of quick learning with a low forgetting rate.
Figure 2: The ceiling methods of ACL.
Figure 3: Relative performance (average accuracy over 6 runs) of various ACL methods with respect to CL on fully labelled data (full CL) in image classification benchmarks. The error bars indicate the standard deviation of the difference between the two means, full CL and ACL. iCaRL is a class-IL method and is therefore not applicable to the P-MNIST dataset.
Figure 4: Relative performance (average accuracy over 6 runs) of various ACL methods with respect to CL on fully labelled data (full CL) in text classification tasks. The error bars indicate the standard deviation of the difference between the two means, full CL and ACL.
Figure 5: Forgetting-learning profile of ACL methods with entropy in text classification tasks.
...and 11 more figures
