Noise-Tolerant Coreset-Based Class Incremental Continual Learning

Edison Mucllari, Aswin Raghavan, Zachary Alan Daniels

TL;DR

This work investigates the robustness of class-incremental continual learning (CIL) when training data are corrupted by label noise or uncorrelated instance noise. It derives a bound for CRUST under additive instance noise and develops two noise-tolerant replay strategies, Continual CRUST and Continual Cosine-CRUST, to build robust replay buffers in a continual setting. Through extensive experiments on five diverse datasets, the authors show that the proposed methods substantially outperform traditional memory-based baselines in terms of final accuracy and forgetting under noise, while preserving high coreset purity. The results highlight the practical potential of noise-tolerant coresets for reliable continual adaptation in vision systems across domains, including medical imaging and remote sensing.
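To make the idea of a replay buffer in a class-incremental setting concrete, here is a minimal sketch. It is not Continual CRUST or Continual Cosine-CRUST, whose selection rules are not described in this summary. The class name `PerClassReplayBuffer`, the fixed per-class budget, and the optional `select` hook are all hypothetical, and the default selection is plain random sampling.

```python
import numpy as np


class PerClassReplayBuffer:
    """Hypothetical buffer that keeps at most `per_class` samples per class."""

    def __init__(self, per_class, seed=0):
        self.per_class = per_class
        self.rng = np.random.default_rng(seed)
        self.store = {}  # class id -> (samples, labels)

    def add_experience(self, x, y, select=None):
        """Store a subset of each class that appears in the new experience.

        `select(x, y, idx, k)` is an assumed hook that returns k indices
        chosen from `idx`; a noise-aware coreset selector could be plugged
        in here. With `select=None`, samples are drawn uniformly at random.
        """
        x, y = np.asarray(x), np.asarray(y)
        for c in np.unique(y):
            idx = np.flatnonzero(y == c)
            k = min(self.per_class, idx.size)
            if select is None:
                keep = self.rng.choice(idx, size=k, replace=False)
            else:
                keep = np.asarray(select(x, y, idx, k))
            # In class-incremental learning each class arrives once,
            # so overwriting the entry for class c is safe in this sketch.
            self.store[int(c)] = (x[keep], y[keep])

    def replay_data(self):
        """Return all stored samples and labels, to be mixed into training."""
        if not self.store:
            raise ValueError("replay buffer is empty")
        xs = [samples for samples, _ in self.store.values()]
        ys = [labels for _, labels in self.store.values()]
        return np.concatenate(xs), np.concatenate(ys)
```

With random selection, noisy samples enter the buffer at roughly the same rate as they occur in the stream, which is the weakness a noise-tolerant selector is meant to address.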

Abstract

Many applications of computer vision require the ability to adapt to novel data distributions after deployment. Adaptation requires algorithms capable of continual learning (CL). Continual learners must be plastic to adapt to novel tasks while minimizing forgetting of previous tasks. However, CL opens up avenues for noise to enter the training pipeline and disrupt learning. This work focuses on label noise and instance noise in the context of class-incremental learning (CIL), where new classes are added to a classifier over time, and there is no access to external data from past classes. We aim to understand the sensitivity of CL methods that work by replaying items from a memory constructed using the idea of Coresets. We derive a new bound for the robustness of such a method to uncorrelated instance noise under a general additive noise threat model, revealing several insights. Putting the theory into practice, we create two continual learning algorithms to construct noise-tolerant replay buffers. We empirically compare the effectiveness of prior memory-based continual learners and the proposed algorithms under label and uncorrelated instance noise on five diverse datasets. We show that existing memory-based CL methods are not robust, whereas the proposed methods exhibit significant improvements in maximizing classification accuracy and minimizing forgetting in the noisy CIL setting.
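The two noise types named in the abstract can be illustrated with a short NumPy sketch. The exact corruption protocol is not given in this summary, so everything specific here is an assumption: symmetric label flipping to a uniformly chosen different class, zero-mean Gaussian noise as the additive instance noise, pixel values in [0, 1], and the function names `flip_labels` and `add_instance_noise`.

```python
import numpy as np


def flip_labels(labels, noise_rate, num_classes, rng):
    """Symmetric label flipping (assumed protocol).

    Each label is replaced, with probability `noise_rate`, by a different
    class chosen uniformly at random. Returns the noisy labels and a
    boolean mask marking which entries were flipped.
    """
    labels = np.asarray(labels)
    noisy = labels.copy()
    flip_mask = rng.random(labels.shape[0]) < noise_rate
    # An offset in [1, num_classes - 1], taken modulo num_classes,
    # guarantees the new label differs from the original one.
    offsets = rng.integers(1, num_classes, size=int(flip_mask.sum()))
    noisy[flip_mask] = (labels[flip_mask] + offsets) % num_classes
    return noisy, flip_mask


def add_instance_noise(images, noise_rate, sigma, rng):
    """Additive instance noise (assumed Gaussian, images scaled to [0, 1]).

    A random `noise_rate` fraction of images receives i.i.d. Gaussian
    noise with standard deviation `sigma`, drawn independently of the
    image content. Returns the noisy images and the corruption mask.
    """
    images = np.asarray(images, dtype=np.float32)
    noisy = images.copy()
    mask = rng.random(images.shape[0]) < noise_rate
    noise = rng.normal(0.0, sigma, size=noisy[mask].shape).astype(np.float32)
    noisy[mask] = np.clip(noisy[mask] + noise, 0.0, 1.0)
    return noisy, mask
```

Returning the corruption masks alongside the data makes it possible to measure later how many corrupted samples end up in a replay buffer.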

Paper Structure

This paper contains 31 sections, 11 theorems, 53 equations, 9 figures, 14 tables, 1 algorithm.

Key Result

Theorem 1

(Robustness to Label Flipping) [mirzasoleiman2020coresets] Assume that gradient descent is applied to train a neural network with mean-squared error (MSE) loss on a dataset with noisy labels. Suppose that the Jacobian of the network is $L$-smooth. Assume that the dataset has label margin of $\delta$,

Figures (9)

  • Figure 1: Representative samples from the datasets used for evaluation, which range in image size (28x28 up to 224x224), sample size (100s to 10,000s), color profile (grayscale, RGB), and modality (handwriting, natural images, SAR aerial imagery, pathology)
  • Figure 2: Final accuracy at different label-flipping noise levels (0.0-0.5) on five benchmark datasets for different strategies; additional results and similar plots for the forgetting metric are included in the supplemental materials.
  • Figure 3: Left: Average Coreset purity over all experiences on the PathMNIST+ dataset for different levels of label noise. Right: Purity of the selected Coresets for each learning experience on PathMNIST+ at a label-flipping noise level of 0.5.
  • Figure 4: Final accuracy (left) and forgetting metric (right) at different Coreset sizes (100, 200, and 300 samples per class) on MNIST for random replay, Continual CRUST, and Continual CosineCRUST.
  • Figure 5: Final accuracy (left) and forgetting metric (right) at different label-flipping noise levels (0.0-0.5) on MNIST for different strategies.
  • ...and 4 more figures
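
The captions above refer to three quantities: Coreset purity, final accuracy, and forgetting. Their exact definitions are not reproduced in this summary, so the sketch below uses common conventions as assumptions. Purity is taken as the fraction of selected samples that are uncorrupted. Final accuracy is the mean accuracy over all experiences after the last training step. Forgetting is the average drop from the best earlier accuracy on each past experience to its final accuracy. The function names and the accuracy-matrix layout are hypothetical.

```python
import numpy as np


def coreset_purity(selected_idx, is_noisy):
    """Fraction of selected samples that are clean (assumed definition).

    `is_noisy[i]` is True when sample i was corrupted.
    """
    selected_idx = np.asarray(selected_idx, dtype=int)
    if selected_idx.size == 0:
        return float("nan")
    is_noisy = np.asarray(is_noisy, dtype=bool)
    return float(1.0 - is_noisy[selected_idx].mean())


def final_accuracy(acc):
    """Mean accuracy over all experiences after training on the last one.

    `acc[i, j]` is the accuracy on experience j after training on
    experience i (assumed layout).
    """
    acc = np.asarray(acc, dtype=float)
    return float(acc[-1].mean())


def forgetting(acc):
    """Average forgetting over all but the last experience.

    For each earlier experience j, take the best accuracy reached on j
    before the final training step and subtract the final accuracy on j.
    """
    acc = np.asarray(acc, dtype=float)
    num_exp = acc.shape[0]
    if num_exp < 2:
        return 0.0
    drops = [
        acc[j:num_exp - 1, j].max() - acc[num_exp - 1, j]
        for j in range(num_exp - 1)
    ]
    return float(np.mean(drops))
```

Under these conventions, higher purity and final accuracy are better, and lower forgetting is better.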

Theorems & Definitions (25)

  • Theorem 1
  • Definition 2
  • Definition 3
  • Theorem 4
  • Definition 5
  • Definition 6
  • Lemma 7
  • proof
  • Definition 8
  • Lemma 9
  • ...and 15 more