Rethinking Class-Incremental Learning from a Dynamic Imbalanced Learning Perspective
Leyuan Wang, Liuyu Xiang, Yunlong Wang, Huijia Wu, Zhaofeng He
TL;DR
This work reframes catastrophic forgetting in Class-Incremental Learning as a dynamic imbalance problem and introduces Uniform Prototype Contrastive Learning (UPCL). UPCL learns uniform and compact features by using non-learnable prototypes uniformly distributed on the unit hypersphere $S^{d-1}$, assigns new classes to prototypes with a Hungarian-based alignment, and optimizes a prototype contrastive loss with a dynamic margin $m = -\log p(y)$ that adapts to class priors. The method combines a proto loss and a decaying feature loss to maintain stability of old knowledge while accommodating new classes, achieving state-of-the-art results on CIFAR100, ImageNet100, and TinyImageNet under memory constraints. The work highlights the importance of representation learning under dynamic imbalance for effective lifelong learning, while acknowledging the dimensionality limitation and proposing future work on dynamic prototype dimensionality and unknown class counts.
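To make the dynamic margin $m = -\log p(y)$ concrete, the following minimal sketch computes per-class margins from class counts. It assumes that the prior $p(y)$ is estimated as the fraction of current training samples belonging to class $y$; the function name `dynamic_margins` and the example counts are hypothetical, and the sketch does not show how the margin enters the contrastive loss.

```python
import math


def dynamic_margins(class_counts):
    """Return m = -log p(y) per class, with p(y) estimated from sample counts.

    Assumption: p(y) is the empirical class frequency in the current
    training set (memory exemplars plus new-task data).
    """
    total = sum(class_counts.values())
    return {c: -math.log(n / total) for c, n in class_counts.items()}


# Hypothetical counts: old classes keep a few exemplars, new classes have full data.
counts = {"old_0": 20, "old_1": 20, "new_0": 500, "new_1": 500}
margins = dynamic_margins(counts)

for name, m in margins.items():
    print(f"{name}: margin = {m:.3f}")
```

Under this assumption, classes with few stored exemplars receive larger margins than data-rich new classes, and the margins shift automatically as the imbalance ratio changes from task to task.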
Abstract
Deep neural networks suffer from catastrophic forgetting when continually learning new concepts. In this paper, we analyze this problem from a data imbalance point of view. We argue that the imbalance between old-task and new-task data contributes to forgetting of the old tasks. Moreover, the increasing imbalance ratio during incremental learning further aggravates the problem. To address the dynamic imbalance issue, we propose Uniform Prototype Contrastive Learning (UPCL), where uniform and compact features are learned. Specifically, we generate a set of non-learnable uniform prototypes before each task starts. Then we assign these uniform prototypes to each class and guide the feature learning through prototype contrastive learning. We also dynamically adjust the relative margin between old and new classes so that the feature distribution remains balanced and compact. Finally, we demonstrate through extensive experiments that the proposed method achieves state-of-the-art performance on several benchmark datasets, including CIFAR100, ImageNet100, and TinyImageNet.
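The prototype generation and assignment steps can be illustrated with a toy sketch. It rests on several simplifying assumptions that are not taken from the paper: prototypes live on the unit circle ($d = 2$) and are placed at evenly spaced angles, the assignment cost is the negative cosine similarity between a class mean feature and a prototype, and an exhaustive search over permutations stands in for the Hungarian algorithm (it finds the same optimum but is only feasible for a handful of classes). All function names and feature values are hypothetical.

```python
import itertools
import math


def circle_prototypes(k):
    """k fixed (non-learnable) unit vectors, evenly spaced on the unit circle."""
    return [(math.cos(2 * math.pi * i / k), math.sin(2 * math.pi * i / k))
            for i in range(k)]


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(x / norm for x in v)


def cosine(u, v):
    """Dot product; equals cosine similarity when both inputs are unit vectors."""
    return sum(a * b for a, b in zip(u, v))


def assign_prototypes(class_means, prototypes):
    """Match each class to a distinct prototype, maximizing total cosine similarity.

    Brute-force stand-in for the Hungarian algorithm: it tries every
    one-to-one matching, so use it only with very few classes.
    """
    classes = list(class_means)
    unit_means = [normalize(class_means[c]) for c in classes]
    best_perm, best_score = None, -math.inf
    for perm in itertools.permutations(range(len(prototypes)), len(classes)):
        score = sum(cosine(m, prototypes[j]) for m, j in zip(unit_means, perm))
        if score > best_score:
            best_perm, best_score = perm, score
    return dict(zip(classes, best_perm))


prototypes = circle_prototypes(4)
# Hypothetical mean features of three classes in a 2-D feature space.
class_means = {"cat": (0.9, 0.1), "dog": (-0.2, 1.1), "car": (-1.0, -0.1)}
assignment = assign_prototypes(class_means, prototypes)
print(assignment)
```

In a realistic setting the feature dimension is much larger, the prototypes would need a different construction to be spread uniformly on $S^{d-1}$, and a polynomial-time solver for the assignment problem would replace the exhaustive search.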
