Prior-free Balanced Replay: Uncertainty-guided Reservoir Sampling for Long-Tailed Continual Learning
Lei Liu, Li Liu, Yawen Cui
TL;DR
This work tackles catastrophic forgetting in long-tailed continual learning by proposing Prior-free Balanced Replay (PBR), which reframes replay as an uncertainty-guided reservoir sampling problem so that minority samples are preferentially stored without relying on prior label distributions. It introduces two prior-free constraints to preserve class prototypes and task boundaries across incremental steps: a Prototype Constraint based on cosine-normalized prototypes and a Boundary Constraint based on uncertainty-aware distillation. The combined approach uses mutual information estimated with Monte Carlo dropout to identify boundary-supporting samples and maintains a balanced memory through principled sample-in/out rules, achieving state-of-the-art results on Seq-CIFAR-10-LT, Seq-CIFAR-100-LT, and Seq-TinyImageNet-LT in both ordered- and shuffled-LTCL settings. Empirical results show substantial gains for minority classes and robustness to buffer size, and ablations support the utility of the cosine prototype classifier and boundary-focused replay. The proposed framework offers a practical, prior-free solution for LTCL on real-world data streams whose task distributions are not known in advance.
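To make the uncertainty signal concrete, the following minimal sketch computes mutual information from several stochastic (dropout-enabled) forward passes, using the standard estimator: entropy of the averaged prediction minus the average entropy of the individual predictions. Whether PBR uses exactly this estimator is an assumption here; the function names and the toy probability vectors are hypothetical and chosen only for illustration.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0.0)


def mutual_information(mc_probs):
    """Mutual information between prediction and model parameters for one sample.

    mc_probs: list of T probability vectors, one per dropout forward pass.
    Returns H[mean prediction] - mean over passes of H[prediction].
    """
    num_passes = len(mc_probs)
    num_classes = len(mc_probs[0])
    # Average the class probabilities over the stochastic passes.
    mean_p = [sum(p[c] for p in mc_probs) / num_passes for c in range(num_classes)]
    predictive_entropy = entropy(mean_p)
    expected_entropy = sum(entropy(p) for p in mc_probs) / num_passes
    return predictive_entropy - expected_entropy


# Toy inputs (hypothetical): passes that agree vs. passes that disagree.
agreeing_passes = [[0.9, 0.1], [0.9, 0.1], [0.9, 0.1]]
disagreeing_passes = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]

mi_agree = mutual_information(agreeing_passes)
mi_disagree = mutual_information(disagreeing_passes)
```

When the passes agree, the two entropy terms cancel and the score is close to zero; when they disagree, the score is positive. Under this reading, samples with high scores are the candidates for boundary-supporting samples.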
Abstract
Even in the era of large models, catastrophic forgetting remains a well-known issue in continual learning (CL). It is especially challenging when the continual data stream follows a long-tailed distribution, a setting termed Long-Tailed Continual Learning (LTCL). Existing LTCL solutions generally require the label distribution of the data stream in order to re-balance training. However, obtaining such prior information is often infeasible in real scenarios, since the model must learn without knowing in advance which classes are majority and which are minority. To this end, we propose a novel Prior-free Balanced Replay (PBR) framework that learns from a long-tailed data stream with less forgetting. Concretely, motivated by our experimental finding that minority classes are more likely to be forgotten because of their higher uncertainty, we design an uncertainty-guided reservoir sampling strategy that prioritizes rehearsing minority data without using any prior information; the strategy is based on the mutual dependence between the model and the samples. Additionally, we incorporate two prior-free components to further reduce forgetting: (1) a boundary constraint, which preserves uncertain boundary-supporting samples so that task boundaries can be continually re-estimated, and (2) a prototype constraint, which keeps the learned class prototypes consistent throughout training. We evaluate our approach on three standard long-tailed benchmarks, where it outperforms existing CL methods and the previous SOTA LTCL approach in both task- and class-incremental learning settings, as well as in ordered- and shuffled-LTCL settings.
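The idea of uncertainty-guided reservoir sampling can be illustrated with a small sketch. The abstract does not spell out the sample-in/out rules, so the rule below is an assumption made for illustration only: admission follows standard reservoir sampling, and eviction removes the stored sample with the lowest uncertainty. The class name `UncertaintyReservoir` and its methods are hypothetical.

```python
import random


class UncertaintyReservoir:
    """Fixed-size replay buffer: reservoir admission, uncertainty-based eviction.

    Illustrative sketch only; the eviction rule is an assumption, not the
    rule defined in the paper.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []  # list of (sample, uncertainty) pairs
        self.seen = 0    # number of stream samples observed so far
        self.rng = random.Random(seed)

    def add(self, sample, uncertainty):
        """Offer one stream sample; return True if it was stored."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append((sample, uncertainty))
            return True
        # Standard reservoir admission: accept with probability capacity / seen.
        if self.rng.random() >= self.capacity / self.seen:
            return False
        # Assumed eviction rule: the least uncertain stored sample is the victim.
        victim = min(range(len(self.items)), key=lambda i: self.items[i][1])
        if self.items[victim][1] > uncertainty:
            return False  # newcomer is less uncertain than everything stored
        self.items[victim] = (sample, uncertainty)
        return True


# Usage: no class labels or label frequencies are consulted, only uncertainty.
buffer = UncertaintyReservoir(capacity=2, seed=0)
buffer.add("x0", 0.1)
buffer.add("x1", 0.2)
```

Because the rule never looks at labels, it needs no prior on the class distribution. If minority-class samples do carry higher uncertainty, as the abstract reports, they tend to survive eviction and the buffer drifts toward a more balanced composition.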
