Towards Experience Replay for Class-Incremental Learning in Fully-Binary Networks

Yanis Basso-Bert, Anca Molnos, Romain Lemaire, William Guicquero, Antoine Dupret

TL;DR

This work tackles the challenge of Class Incremental Learning (CIL) in Fully-Binarized Neural Networks (FBNNs) for ultra-low-power edge devices. It introduces BN-free normalization via scaling factors, a Learnable Global Average Pooling (LGAP) bottleneck, and an efficient TYCC thermometer-based input encoding to preserve performance in a fully binary architecture. The study systematically analyzes Experience Replay (ER) in two forms—Native and Latent replay—and augments them with loss balancing and semi-supervised pre-training of the feature extractor to improve transfer and reduce forgetting. Across CIFAR100 and CORE50 benchmarks, a 3 Mb FBNN achieves competitive or superior results to larger real-valued networks, highlighting the practical viability and memory efficiency of on-device CIL for binary architectures, with Latent replay offering the best retention in low-memory regimes and Native replay excelling as memory grows. The findings demonstrate a viable path for continual learning on constrained hardware, combining architectural design, training strategies, and replay choices to balance adaptability and retention.
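To make the idea of a thermometer-based binary input encoding concrete, here is a minimal sketch of a generic thermometer code for 8-bit pixel intensities. It is not the TYCC encoding itself, whose details are not given in this summary. The function name `thermometer_encode`, the number of levels, and the evenly spaced thresholds are assumptions chosen for illustration.

```python
def thermometer_encode(value, levels=8, max_value=255):
    """Encode one intensity as `levels` bits (generic thermometer code).

    Hypothetical helper: thresholds are evenly spaced, which is an
    assumption and not necessarily what the TYCC encoding uses.
    """
    if not 0 <= value <= max_value:
        raise ValueError("value out of range")
    # Evenly spaced thresholds strictly inside the input range.
    step = (max_value + 1) / (levels + 1)
    # Bit k is 1 when the value meets or exceeds threshold k, so the
    # code is a run of ones followed by a run of zeros.
    return [1 if value >= (k + 1) * step else 0 for k in range(levels)]


if __name__ == "__main__":
    for pixel in (0, 128, 255):
        print(pixel, thermometer_encode(pixel, levels=3))
```

Each real-valued input channel becomes several binary planes, so the first layer can operate on binary inputs instead of requiring a real-valued input layer.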

Abstract

Binary Neural Networks (BNNs) are a promising approach to enable Artificial Neural Network (ANN) implementation on ultra-low power edge devices. Such devices may process data in highly dynamic environments, in which the classes targeted for inference can evolve or novel classes may even arise, requiring continual learning. Class Incremental Learning (CIL) is a common type of continual learning for classification problems that has scarcely been addressed in the context of BNNs. Furthermore, most existing BNN models are not fully binary, as they require several real-valued network layers at the input, at the output, and for batch normalization. This paper goes a step further, enabling class incremental learning in Fully-Binarized NNs (FBNNs) through four main contributions. First, we revisit the FBNN design and its training procedure to make them suitable for CIL. Second, we explore loss balancing, a method to trade off performance on past and current classes. Third, we propose a semi-supervised method to pre-train the feature extractor of the FBNN for transferable representations. Fourth, two conventional CIL methods, i.e., Latent and Native replay, are thoroughly compared. These contributions are exemplified first on the CIFAR100 dataset, before being scaled up to address the CORE50 continual learning benchmark. The final results of our 3Mb FBNN on CORE50 are on par with or better than those of larger conventional real-valued NN models.
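As a rough illustration of experience replay combined with loss balancing, the sketch below keeps a fixed-capacity buffer of past samples and mixes the loss on new-class data with the loss on replayed data. It is a sketch under stated assumptions, not the paper's method: the `ReplayBuffer` class, the reservoir-sampling update, the `balanced_loss` helper, and the weight `alpha` are all hypothetical. The stored `sample` can be a raw input (Native replay) or an intermediate feature map (Latent replay).

```python
import random


class ReplayBuffer:
    """Fixed-capacity buffer filled by reservoir sampling (an assumption)."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []   # stored (sample, label) pairs
        self.seen = 0     # total number of samples offered so far
        self.rng = random.Random(seed)

    def add(self, sample, label):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append((sample, label))
        else:
            # Keep each offered sample with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = (sample, label)

    def sample(self, k):
        # Draw up to k stored pairs without replacement.
        return self.rng.sample(self.items, min(k, len(self.items)))


def balanced_loss(loss_new, loss_replay, alpha=0.5):
    """Weighted sum of current-task and replay losses (hypothetical form)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * loss_new + (1.0 - alpha) * loss_replay


if __name__ == "__main__":
    buf = ReplayBuffer(capacity=4)
    for i in range(100):
        buf.add(sample=i, label=i % 10)
    print(len(buf.items), balanced_loss(2.0, 4.0, alpha=0.25))
```

A larger `alpha` favors learning the current classes, while a smaller one favors retaining past classes. This is the trade-off that loss balancing is meant to control.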

Paper Structure

This paper contains 54 sections, 13 equations, 8 figures, 8 tables.

Figures (8)

  • Figure 1: Baseline model structure. 3Mb-BNN version reported here.
  • Figure 2: Offline accuracy for different configurations of our architecture. Test accuracy is reported with circles and train accuracy with squares.
  • Figure 3: Training curves of the 3Mb-BNN on CIL CIFAR50+5X10 for the 4 baseline strategies. Accuracy on the train set is reported in black, pre-training test accuracy in grey, and the test accuracy of each task in a different color.
  • Figure 4: Influence of loss and buffer size in RPT and FPT scenarios. Latent replay and Native replay strategies are reported with various losses (colors).
  • Figure 5: Illustration of the multi-objective pre-training framework.
  • ...and 3 more figures