An experimental comparative study of backpropagation and alternatives for training binary neural networks for image classification
Ben Crulis, Barthelemy Serres, Cyril de Runz, Gilles Venturini
TL;DR
The paper addresses the challenge of training binary neural networks for image classification on edge devices. It evaluates several learning algorithms beyond standard backpropagation (BP), including DFA, DRTP, FA, HSIC Bottleneck, and SigpropTL, across three architectures on ImageNette and related datasets. It extends prior work by testing more architectures and larger datasets and by adding two new BP alternatives, and it provides an open-source PyTorch framework for model-agnostic training of binary and non-binary networks. The key findings are that, on modern architectures with skip connections, BP generally yields the best accuracy, but several alternatives offer substantial reductions in memory and computation and can exceed BP on certain architectures or in certain ablations (e.g., without skip connections). The work gives practical guidance on when to use BP versus its alternatives and highlights the trade-offs between accuracy and resource efficiency, informing edge-deployed vision systems and future research.
Abstract
Current artificial neural networks are trained with parameters encoded as floating-point numbers, which occupy a large amount of memory at inference time. As deep learning models grow in size, training and running artificial neural networks on edge devices is becoming increasingly difficult. Binary neural networks promise to reduce the size of deep neural network models, increase inference speed, and decrease energy consumption. They may therefore allow more powerful models to be deployed on edge devices. However, binary neural networks remain difficult to train with backpropagation-based gradient descent. This paper extends the work of \cite{crulis2023alternatives}, which proposed adapting two promising alternatives to backpropagation, originally designed for continuous neural networks, to binary neural networks, and evaluated them on simple image classification datasets. This paper presents new experiments on the ImageNette dataset, compares three model architectures for image classification, and adds two further alternatives to backpropagation.
