May the Forgetting Be with You: Alternate Replay for Learning with Noisy Labels

Monica Millunzi; Lorenzo Bonicelli; Angelo Porrello; Jacopo Credi; Petter N. Kolm; Simone Calderara

TL;DR

The paper tackles continual learning with noisy labels in streaming data, where forgetting degrades memory quality. It introduces Alternate Experience Replay (AER) to exploit forgetting and separate clean from noisy/complex samples, and Asymmetric Balanced Sampling (ABS) to maintain current-task purity while preserving informative past samples; a buffer consolidation step using MixMatch further refines the memory. Across multiple benchmarks with synthetic and real noise, the approach yields consistent accuracy gains, notably an average improvement of 4.71 percentage points over loss-based purification baselines, and shows substantial speed advantages over competing online CLN methods. The method demonstrates strong robustness and applicability to online, multi-epoch settings, offering practical benefits for real-world learning with noisy annotations.

Abstract

Forgetting presents a significant challenge during incremental training, making it particularly demanding for contemporary AI systems to assimilate new knowledge in streaming data environments. To address this issue, most approaches in Continual Learning (CL) rely on the replay of a restricted buffer of past data. However, the presence of noise in real-world scenarios, where human annotation is constrained by time limitations or where data is automatically gathered from the web, frequently renders these strategies vulnerable. In this study, we address the problem of CL under Noisy Labels (CLN) by introducing Alternate Experience Replay (AER), which takes advantage of forgetting to maintain a clear distinction between clean, complex, and noisy samples in the memory buffer. The idea is that complex or mislabeled examples, which hardly fit the previously learned data distribution, are most likely to be forgotten. To grasp the benefits of such a separation, we equip AER with Asymmetric Balanced Sampling (ABS): a new sample selection strategy that prioritizes purity on the current task while retaining relevant samples from the past. Through extensive computational comparisons, we demonstrate the effectiveness of our approach in terms of both accuracy and purity of the obtained buffer, resulting in a remarkable average gain of 4.71 percentage points in accuracy with respect to existing loss-based purification strategies. Code is available at https://github.com/aimagelab/mammoth.
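The alternation between replay and forgetting that AER relies on can be illustrated with a minimal sketch. This is not the authors' implementation: the strict even/odd schedule, the function names `use_replay` and `train_task`, and the two callback parameters are all assumptions made for illustration. The only point it shows is that some epochs train on the stream alone, so buffer samples that fit the learned distribution poorly have a chance to be forgotten.

```python
from typing import Callable, List


def use_replay(epoch: int) -> bool:
    """Hypothetical schedule: replay on even epochs, forgetting on odd ones."""
    return epoch % 2 == 0


def train_task(
    num_epochs: int,
    stream_step: Callable[[int], None],
    replay_step: Callable[[int], None],
) -> List[str]:
    """Run one task, alternating epochs with and without buffer replay.

    stream_step: assumed callback that trains on current-task data.
    replay_step: assumed callback that trains on the memory buffer.
    Returns a log of which mode each epoch used.
    """
    log = []
    for epoch in range(num_epochs):
        # The current-task data is always used.
        stream_step(epoch)
        if use_replay(epoch):
            # Replay epoch: past samples are rehearsed from the buffer.
            replay_step(epoch)
            log.append("replay")
        else:
            # Forgetting epoch: the buffer is skipped on purpose.
            log.append("forget")
    return log


if __name__ == "__main__":
    print(train_task(4, lambda e: None, lambda e: None))
```

In a real training loop the two callbacks would wrap optimizer steps; here they are left as no-ops so the schedule itself stays visible.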

Paper Structure (19 sections, 6 equations, 5 figures, 7 tables, 1 algorithm)

Introduction
Related works
Learning with Noisy Labels
Continual Learning under Noisy Labels
Method
Alternate Experience Replay (AER)
Asymmetric Balanced Sampling (ABS)
Buffer consolidation
Experiments
Comparison with State-of-the-Art
Model analysis
Conclusions
On the effectiveness of buffer consolidation
On the influence of the hyperparameter $\alpha$
On the effectiveness of AER as a regularizer for CLN
...and 4 more sections

Figures (5)

Figure 1: Training loss of clean and noisy samples during the second task of Seq. CIFAR-10 with $40\%$ noise. The loss is computed on examples from the first task stored in the memory buffer. Standard replay makes the two indistinguishable (left), but alternating epochs of replay and forgetting maintain a significant loss separation (right).
Figure 2: Asymmetric Balanced Sampling (ABS). Past examples are chosen to retain the most complex ones, while the criterion is reversed for the current task to maximize purity.
Figure 3: FAA ($[\uparrow]$) of DER++ with our method and buffer fitting.
Figure 4: Final composition of the buffer with different choices of sample selection.
Figure A: Effect of AER on the speed at which the model learns the noisy data
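The asymmetric criterion described in the Figure 2 caption can be sketched as follows. This is a simplified, deterministic illustration and not the paper's sampling procedure: it assumes per-sample loss as a proxy for complexity, and the `Sample` type, the `abs_keep` function, and the `k_past` / `k_current` quotas are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    idx: int      # position of the example in the buffer (hypothetical)
    task: int     # task the example belongs to
    loss: float   # per-sample loss, assumed proxy for complexity


def abs_keep(
    samples: List[Sample], current_task: int, k_past: int, k_current: int
) -> List[Sample]:
    """Keep high-loss past samples and low-loss current-task samples."""
    past = [s for s in samples if s.task != current_task]
    current = [s for s in samples if s.task == current_task]
    # Past tasks: retain the most complex (highest-loss) examples.
    keep_past = sorted(past, key=lambda s: s.loss, reverse=True)[:k_past]
    # Current task: reverse the criterion and favor low loss for purity.
    keep_current = sorted(current, key=lambda s: s.loss)[:k_current]
    return keep_past + keep_current


if __name__ == "__main__":
    buffer = [
        Sample(0, 0, 0.1),
        Sample(1, 0, 0.9),
        Sample(2, 1, 0.2),
        Sample(3, 1, 1.5),
    ]
    print([s.idx for s in abs_keep(buffer, current_task=1, k_past=1, k_current=1)])
```

A hard top-k cut is used only to keep the example short; a probabilistic selection weighted by the same scores would follow the same asymmetry.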
