Continual Learning on a Diet: Learning from Sparsely Labeled Streams Under Constrained Computation

Wenxuan Zhang; Youssef Mohamed; Bernard Ghanem; Philip H. S. Torr; Adel Bibi; Mohamed Elhoseiny

TL;DR

This work introduces Continual Learning on a Diet, a budgeted semi-supervised CL setting with per-step compute constraints and sparse labels, and proposes DietCL as a simple, effective baseline that jointly learns from labeled and unlabeled data. DietCL allocates computation across unlabeled data, current labeled data, and a balanced buffer of past labeled data, using an MAE-based SSL term and a masked classification loss to prevent forgetting. Across large-scale benchmarks ImageNet10k, CLOC, and CGLM, DietCL outperforms both supervised CL and recent semi-supervised CL methods under the same budget, with gains of a few percentage points and robust behavior across varying budgets, label rates, and stream lengths. The results highlight the practical value of leveraging unlabeled data under tight compute budgets to improve generalization and stability in real-world continual learning scenarios.

Abstract

We propose and study a realistic Continual Learning (CL) setting where learning algorithms are granted a restricted computational budget per time step while training. We apply this setting to large-scale semi-supervised Continual Learning scenarios with sparse label rates. Previous proficient CL methods perform very poorly in this challenging setting. Overfitting to the sparse labeled data and an insufficient computational budget are the two main culprits for such poor performance. Our new setting encourages learning methods to effectively and efficiently utilize the unlabeled data during training. To that end, we propose a simple but highly effective baseline, DietCL, which utilizes both unlabeled and labeled data jointly. DietCL meticulously allocates computational budget for both types of data. We validate our baseline, at scale, on several datasets, e.g., CLOC, ImageNet10K, and CGLM, under constrained-budget setups. DietCL outperforms, by a large margin, all existing supervised CL algorithms as well as more recent continual semi-supervised methods. Our extensive analysis and ablations demonstrate that DietCL is stable under a full spectrum of label sparsity, computational budget, and various other ablations.

Paper Structure (23 sections, 4 equations, 7 figures, 7 tables, 1 algorithm)

Introduction
Related Work
Continual Learning on a Diet
Problem Formulation
Opportunities for Improvement
Proposed Solution: DietCL
Experiments
Experiment setup
Main results
Ablating Equation \ref{eq:objective}
Computational Budget and Label Rate
Conclusions
Semantic code of the algorithm
More details of Experiment
Benchmark Statistics
...and 8 more sections

Figures (7)

Figure 1: DietCL accounts for a limited computational budget, which arises from restrictions on effective computation time, and a very sparse label rate, which arises from annotation cost. At each time step, we propose to allocate sufficient computation to the labeled data and to use the diverse unlabeled data with the remaining computation to mitigate overfitting.
Figure 2: Average accuracy of ER, CaSSLe, and DietCL on 1% labeled ImageNet10k with varying computational steps. Left: the supervised method, ER, starts to overfit after 400 steps. Right: the semi-supervised method, CaSSLe, converges slowly. DietCL converges quickly and alleviates overfitting.
Figure 3: Accuracy at each time step of the baselines on the ImageNet10k, CLOC, and CGLM datasets. Our algorithm surpasses the supervised methods by using unlabeled data and outperforms the semi-supervised methods through effective allocation of the budget.
Figure 4: Varying the Computation per Time Step. Accuracy of DietCL, CaSSLe, and GDumb with different numbers of computational steps at each time step on 1% labeled ImageNet10k. The top right boxes show the average accuracies of DietCL in the corresponding computational step settings.
Figure 7: Validation loss and accuracy of the classes introduced at time steps 0, 1, and 2 during training at time step 2. Training is conducted on ImageNet10k with a label rate of 0.01 and a budget of 300 steps.
...and 2 more figures
