BSO: Binary Spiking Online Optimization Algorithm
Yu Liang, Yu Yang, Wenjie Wei, Ammar Belatreche, Shuai Wang, Malu Zhang, Yang Yang
TL;DR
Binary Spiking Online (BSO) is an online training algorithm for Binary Spiking Neural Networks (BSNNs) that reduces training memory by updating weights directly via flip signals, which removes the need for latent weights. A temporal-aware extension, T-BSO, uses first- and second-order gradient moments to adapt the flipping threshold across time steps, capturing BSNN dynamics with thresholds $\gamma_t = \gamma \sqrt{v^l[t] + \epsilon}$. The authors prove regret bounds for both methods under standard online learning assumptions: $R(\mathcal{T}) = O(\sqrt{\mathcal{T}})$ for BSO and $R(\mathcal{T}) = O(\mathcal{T}^{3/4})$ for T-BSO, both sublinear in $\mathcal{T}$. Experiments on CIFAR-10/100, ImageNet, and DVS-CIFAR10 show competitive accuracy with significantly lower training memory than BPTT-based BSNNs. Code is released at https://github.com/hamings1/BSO.
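To see why sublinear regret matters, it can help to divide each bound by the horizon $\mathcal{T}$. This is a standard reading of regret bounds rather than a result stated in the summary itself:

$$
\frac{R(\mathcal{T})}{\mathcal{T}} = O\!\left(\mathcal{T}^{-1/2}\right) \ \text{(BSO)}, \qquad
\frac{R(\mathcal{T})}{\mathcal{T}} = O\!\left(\mathcal{T}^{-1/4}\right) \ \text{(T-BSO)}.
$$

Both averages tend to zero as $\mathcal{T} \to \infty$, so in either case the per-step gap to the best fixed comparator vanishes. The stated T-BSO bound is the looser of the two.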
Abstract
Binary Spiking Neural Networks (BSNNs) offer promising efficiency advantages for resource-constrained computing. However, their training algorithms often require substantial memory overhead due to latent weight storage and temporal processing requirements. To address this issue, we propose the Binary Spiking Online (BSO) optimization algorithm, a novel online training algorithm that significantly reduces training memory. BSO directly updates weights through flip signals under the online training framework. These signals are triggered when the product of gradient momentum and weights exceeds a threshold, eliminating the need for latent weights during training. To enhance performance, we propose T-BSO, a temporal-aware variant that leverages the inherent temporal dynamics of BSNNs by capturing gradient information across time steps for adaptive threshold adjustment. Theoretical analysis establishes convergence guarantees for both BSO and T-BSO, with formal regret bounds characterizing their convergence rates. Extensive experiments demonstrate that both BSO and T-BSO achieve superior optimization performance compared to existing training methods for BSNNs. The code is available at https://github.com/hamings1/BSO.
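As a rough illustration of the flip rule described above, the following NumPy sketch flips a binary weight when the product of its gradient momentum and its current value exceeds a threshold. It is not the released implementation. The function name `flip_step`, the exponential-moving-average form of both moments, and the hyperparameters `beta`, `beta2`, `gamma`, and `eps` are assumptions made for illustration. The layer and time-step indexing of $v^l[t]$ is collapsed into a single array.

```python
import numpy as np

def flip_step(w, grad, m, v=None, gamma=0.05, beta=0.9, beta2=0.999, eps=1e-8):
    """One hypothetical flip-based update on binary weights w in {-1, +1}.

    w    : current binary weights (no latent real-valued copy is kept)
    grad : gradient of the loss w.r.t. w at the current time step
    m    : running first moment (gradient momentum)
    v    : running second moment; pass None for a fixed threshold (BSO-like),
           or an array for an adaptive threshold (T-BSO-like)
    """
    # Assumed EMA form for the gradient momentum.
    m = beta * m + (1.0 - beta) * grad

    if v is None:
        # Fixed threshold shared by all weights.
        threshold = gamma
    else:
        # Assumed EMA form for the second moment; the threshold then
        # follows gamma * sqrt(v + eps) as quoted in the TL;DR.
        v = beta2 * v + (1.0 - beta2) * grad ** 2
        threshold = gamma * np.sqrt(v + eps)

    # Flip signal: momentum agrees in sign with the weight and their
    # product is larger than the threshold.
    flip = (m * w) > threshold
    w = np.where(flip, -w, w)
    return w, m, v

# Example call with a fixed threshold.
w0 = np.array([1.0, -1.0, 1.0, -1.0])
g0 = np.array([1.0, 1.0, -1.0, 1e-4])
w1, m1, _ = flip_step(w0, g0, m=np.zeros(4))
```

Only the binary weights and the moment buffers are carried between steps, which is where the memory saving over latent-weight training would come from in this simplified picture.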
