Towards Accurate Binary Spiking Neural Networks: Learning with Adaptive Gradient Modulation Mechanism
Yu Liang, Wenjie Wei, Ammar Belatreche, Honglin Cao, Zijian Zhou, Shuai Wang, Malu Zhang, Yang Yang
TL;DR
This work tackles the training difficulty of Binary Spiking Neural Networks (BSNNs), which stems from frequent weight sign flipping under binary synaptic weights and non-differentiable spike functions. It introduces an Adaptive Gradient Modulation Mechanism (AGMM), which adaptively scales gradient magnitudes through forward gating with a time-dependent factor $\alpha[t]$ and a sigmoid-based modulation $\sigma(E(X[t]))$, thereby reducing the gradient mean $\mu$ and variance $\sigma^2$ that drive flips. The approach yields faster convergence and improved accuracy, achieving state-of-the-art performance among BSNNs on static datasets (CIFAR-10/100, ImageNet) and neuromorphic benchmarks, while maintaining BSNN efficiency with lower firing rates and energy use. Theoretical analysis connects gradient statistics to flip frequency, and extensive experiments demonstrate both the efficacy and efficiency of AGMM, setting a new reference for binarized SNN training and suggesting avenues for scaling to larger SNN architectures.
Abstract
Binary Spiking Neural Networks (BSNNs) inherit the event-driven paradigm of SNNs while also adopting the reduced storage burden of binarization techniques. These distinct advantages make BSNNs lightweight and energy-efficient, rendering them ideal for deployment on resource-constrained edge devices. However, due to the binary synaptic weights and non-differentiable spike function, effectively training BSNNs remains an open question. In this paper, we conduct an in-depth analysis of the main challenge in BSNN learning, namely the frequent weight sign flipping problem. To mitigate this issue, we propose an Adaptive Gradient Modulation Mechanism (AGMM), which is designed to reduce the frequency of weight sign flipping by adaptively adjusting the gradients during the learning process. The proposed AGMM enables BSNNs to achieve faster convergence and higher accuracy, effectively narrowing the gap between BSNNs and their full-precision equivalents. We validate AGMM on both static and neuromorphic datasets, and the results show that it achieves state-of-the-art performance among BSNNs. This work substantially reduces storage demands and enhances SNNs' inherent energy efficiency, making them highly feasible for resource-constrained environments.
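
As a rough illustration of why shrinking gradient magnitudes can reduce weight sign flipping, the sketch below assumes the common setup in which each binary weight is the sign of a real-valued latent weight updated by plain SGD. The function names (`binarize`, `count_sign_flips`), the scalar `scale`, and the toy values are hypothetical and are not taken from the paper. In particular, `scale` only stands in for some modulation factor in (0, 1]; it does not reproduce the AGMM formulation.

```python
import numpy as np


def binarize(w):
    """Map latent real-valued weights to binary weights in {-1, +1}."""
    return np.where(w >= 0, 1, -1)


def count_sign_flips(w, grad, lr, scale=1.0):
    """Count binary weights whose sign changes after one SGD step.

    `scale` is a hypothetical stand-in for a gradient modulation factor;
    scale=1.0 corresponds to the unmodulated update.
    """
    w_new = w - lr * scale * grad
    return int(np.sum(binarize(w) != binarize(w_new)))


# Toy latent weights and gradients (illustrative values only).
w_latent = np.array([0.05, -0.02, 0.30, -0.01])
grad = np.array([1.0, -3.0, 1.0, 0.5])

flips_plain = count_sign_flips(w_latent, grad, lr=0.1)               # 2 flips
flips_scaled = count_sign_flips(w_latent, grad, lr=0.1, scale=0.25)  # 1 flip
print(flips_plain, flips_scaled)
```

For a single step with the same weights and gradients, a smaller update cannot flip a weight that the larger update leaves unchanged, so any `scale` in (0, 1] can only remove flips in this toy setting. Over many steps the effect is statistical rather than guaranteed.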
