Improving Continual Learning Performance and Efficiency with Auxiliary Classifiers
Filip Szatkowski, Yaoyue Zheng, Fei Yang, Bartłomiej Twardowski, Tomasz Trzciński, Joost van de Weijer
TL;DR
This work mitigates catastrophic forgetting in continual learning by attaching auxiliary classifiers (ACs) to intermediate layers, exploiting the greater stability of early representations. ACs are trained alongside standard CL models and enable dynamic early-exit inference, improving accuracy while reducing computation: about a 10% relative accuracy improvement on CIFAR100 and ImageNet100, and a 10-60% reduction in inference cost. The approach applies across architectures (ResNet, VGG, ViT) and remains beneficial for various CL methods, including exemplar-based and regularization-based strategies. It offers a practical, scalable way to improve continual learning performance and efficiency in resource-constrained settings, backed by extensive experiments and reproducible code.
Abstract
Continual learning is crucial for applying machine learning in challenging, dynamic, and often resource-constrained environments. However, catastrophic forgetting, the overwriting of previously learned knowledge when new information is acquired, remains a major challenge. In this work, we examine the intermediate representations in neural network layers during continual learning and find that such representations are less prone to forgetting, highlighting their potential to accelerate computation. Motivated by these findings, we propose to use auxiliary classifiers (ACs) to enhance performance, and we demonstrate that integrating ACs into various continual learning methods consistently improves accuracy across diverse evaluation settings, yielding an average 10% relative gain. We also leverage the ACs to reduce the average cost of inference by 10-60% without compromising accuracy, by enabling the model to return predictions before computing all the layers. Our approach provides a scalable and efficient solution for continual learning.
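
The early-exit idea described above, returning a prediction before all layers are computed, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function name `early_exit_predict`, the use of the maximum softmax probability as the confidence score, the `threshold` parameter, and the fallback to the most confident classifier when no exit clears the threshold are all assumptions made for illustration. The logits are supplied lazily, shallowest classifier first, so deeper classifiers are never evaluated once an earlier one is confident enough.

```python
import math
from typing import Iterable, List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a single logit vector."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(
    exit_logits: Iterable[Sequence[float]], threshold: float
) -> Tuple[int, int]:
    """Return (predicted_class, exit_index).

    `exit_logits` yields one logit vector per classifier, ordered from the
    shallowest auxiliary classifier to the final one. It may be a generator,
    in which case later classifiers are only computed if needed.

    Assumed rule (hypothetical, for illustration): exit at the first
    classifier whose max softmax probability is >= threshold; if none
    qualifies, return the most confident prediction seen.
    """
    best_conf, best_pred, best_exit = -1.0, -1, -1
    for i, logits in enumerate(exit_logits):
        probs = softmax(logits)
        conf = max(probs)
        pred = probs.index(conf)
        if conf >= threshold:
            return pred, i  # confident enough: skip the remaining layers
        if conf > best_conf:
            best_conf, best_pred, best_exit = conf, pred, i
    return best_pred, best_exit


def lazy_exits(all_logits: List[List[float]], visited: List[int]):
    """Toy stand-in for a network: yields logits one classifier at a time
    and records which classifiers were actually evaluated."""
    for i, logits in enumerate(all_logits):
        visited.append(i)
        yield logits


if __name__ == "__main__":
    toy = [[0.1, 0.2, 0.0], [3.0, 0.0, 0.0], [0.0, 5.0, 0.0]]
    visited: List[int] = []
    print(early_exit_predict(lazy_exits(toy, visited), threshold=0.9))
    print("classifiers evaluated:", visited)
```

In this sketch, lowering `threshold` makes the model exit earlier and cheaper, while raising it pushes more samples to deeper classifiers, which is the kind of accuracy/compute trade-off the abstract refers to.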
