Enabling On-device Continual Learning with Binary Neural Networks

Lorenzo Vorabbi; Davide Maltoni; Guido Borghi; Stefano Santi

TL;DR

This work tackles on-device continual learning under severe memory and compute constraints by combining Binary Neural Networks (BNNs) with Latent Replay and a quantized backpropagation framework. It introduces dual-bitwidth training, with one bitwidth for the forward pass ($q_f$) and another for the backward pass ($q_b$), which enables efficient updates to both the convolutional layers and the classifier head while storing 1-bit latent activations in the replay memory. Key contributions include a smaller replay memory (1-bit activations), improved accuracy over the prior BNN-CWR* baseline, quantized backpropagation for non-binary layers, and optimized binary weight quantization. The approach achieves memory reductions of up to $32\times$ and speedups of up to $2.2\times$ on embedded hardware, demonstrating the feasibility of on-device continual learning for TinyML applications. Future work targets ARM NEON optimization.
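The memory figure for 1-bit latent activations can be checked with a small sketch. It assumes that the baseline stores each activation as a 32-bit float and that binary activations take values in {-1, +1}. The helper names `pack_latent` and `unpack_latent` are hypothetical and are not taken from the paper.

```python
import numpy as np

def pack_latent(acts):
    """Map {-1, +1} activations to bits and pack 8 of them per byte."""
    bits = (acts >= 0).astype(np.uint8)  # +1 -> 1, -1 -> 0
    return np.packbits(bits, axis=-1)

def unpack_latent(packed, n):
    """Recover the first n {-1, +1} activations of each row."""
    bits = np.unpackbits(packed, axis=-1, count=n)
    return bits.astype(np.float32) * 2.0 - 1.0

# Illustrative replay batch: 4 patterns with 256 binary activations each.
rng = np.random.default_rng(0)
acts = np.where(rng.standard_normal((4, 256)) >= 0, 1.0, -1.0).astype(np.float32)

packed = pack_latent(acts)
restored = unpack_latent(packed, acts.shape[-1])

# Bytes needed as float32 divided by bytes needed as packed bits.
ratio = acts.nbytes / packed.nbytes
print(packed.shape, ratio)
```

Under these assumptions the packed buffer is lossless for binary activations, and the ratio is the 32 bits of a float divided by the single bit that is actually stored.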

Abstract

On-device learning remains a formidable challenge, especially when dealing with resource-constrained devices that have limited computational capabilities. This challenge is primarily rooted in two key issues. First, the memory available on embedded devices is typically insufficient to accommodate the memory-intensive back-propagation algorithm, which often relies on floating-point precision. Second, the development of learning algorithms on models with extreme quantization levels, such as Binary Neural Networks (BNNs), is critical due to the drastic reduction in bit representation. In this study, we propose a solution that combines recent advancements in the field of Continual Learning (CL) and Binary Neural Networks to enable on-device training while maintaining competitive performance. Specifically, our approach leverages binary latent replay (LR) activations and a novel quantization scheme that significantly reduces the number of bits required for gradient computation. The experimental validation demonstrates a significant accuracy improvement together with a noticeable reduction in memory requirements, confirming the suitability of our approach for expanding the practical applications of deep learning in real-world scenarios.
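To make the idea of separate forward and backward bitwidths concrete, the following is a minimal sketch of a symmetric uniform quantizer applied with two different bit counts. It is a generic textbook quantizer, not the scheme proposed in the paper. The values `Q_F = 8` and `Q_B = 16` and the function names are assumptions chosen for illustration only.

```python
import numpy as np

def quantize(x, bits):
    """Symmetric uniform quantization to signed integers plus a scale factor."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int32)
    return q, scale

def dequantize(q, scale):
    """Map integer codes back to approximate real values."""
    return q.astype(np.float32) * scale

# Hypothetical bitwidths: q_f for forward tensors, q_b for backward tensors.
Q_F, Q_B = 8, 16

rng = np.random.default_rng(0)
activations = rng.standard_normal(1000).astype(np.float32)
gradients = (1e-3 * rng.standard_normal(1000)).astype(np.float32)

a_q, a_scale = quantize(activations, Q_F)
g_q, g_scale = quantize(gradients, Q_B)

# Worst-case reconstruction error is about half a quantization step.
a_err = float(np.max(np.abs(dequantize(a_q, a_scale) - activations)))
g_err = float(np.max(np.abs(dequantize(g_q, g_scale) - gradients)))
print(a_err, a_scale, g_err, g_scale)
```

The sketch only illustrates why the two passes can use different bitwidths: gradients are often much smaller in magnitude than activations, so the backward tensors may need a finer step than the forward ones.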

Paper Structure (12 sections, 5 equations, 10 figures, 2 tables, 1 algorithm)

Introduction
Related Work
Method
Continual Learning with Latent Replays
Quantization of activations and weights
Quantized Backpropagation
Experiments
Accuracy comparison
Reducing Storage in Latent Replay
Splitting $q_{b}$ into $q_{b}^{\text{bin}}$ and $q_{b}^{\text{non-bin}}$
Efficiency Evaluation
Conclusion

Figures (10)

Figure 1: Continual Learning with latent replay memory. When using a BNN the activations stored in the replay memory can be quantized to 1-bit.
Figure 2: Quantization scheme that uses different bitwidths for the forward ($q_{f}$) and backward ($q_{b}$) passes. The trainable non-binary layers are usually Batch Normalization (Ioffe and Szegedy, 2015), Addition, and Concatenation layers.
Figure 3: Accuracy comparison of our solution (BNN+LR+CWR*) with the previous work BNN+CWR* (Vorabbi et al., 2023) on CORe50 using the QuickNet model.
Figure 4: Accuracy comparison of our solution (BNN+LR+CWR*) with the previous work BNN+CWR* (Vorabbi et al., 2023) on CORe50 using the QuickNetLarge model.
Figure 5: Accuracy comparison of our solution (BNN+LR+CWR*) with the previous work BNN+CWR* (Vorabbi et al., 2023) on CIFAR10 using the ReActNet model.
...and 5 more figures
