Loss shaping enhances exact gradient learning with Eventprop in spiking neural networks
Thomas Nowotny, James P. Turner, James C. Knight
TL;DR
This work tackles the challenge of training spiking neural networks (SNNs) with exact gradients by extending the Eventprop algorithm to a broader class of loss functions, which addresses the spike-deletion issue that limited learning on SHD. Through loss shaping (L_F, including L_sum, L_sum_exp, and L_time) and targeted augmentations, the authors achieve strong SHD results (up to 93.5±0.7% test accuracy with LOSO cross-validation) and competitive SSC performance (74.1±0.9% test accuracy). The approach uses a GeNN-based implementation whose forward and backward passes scale with the number of spikes rather than the number of timesteps, yielding a speedup of ≈3× and a memory reduction of ≈4× compared with surrogate-gradient training based on BPTT. The study demonstrates the practical viability of exact-gradient SNNs for keyword recognition on benchmarks relevant to neuromorphic hardware, and outlines future directions toward deeper networks, learning delays, and more biologically plausible neuron models. The findings highlight loss-function design as a critical ingredient for successful exact-gradient learning in SNNs and point toward energy-efficient neuromorphic AI with scalable training pipelines.
Abstract
Event-based machine learning promises more energy-efficient AI on future neuromorphic hardware. Here, we investigate how the recently discovered Eventprop algorithm for gradient descent on exact gradients in spiking neural networks can be scaled up to challenging keyword recognition benchmarks. We implemented Eventprop in the GPU-enhanced Neural Networks (GeNN) framework and used it to train recurrent spiking neural networks on the Spiking Heidelberg Digits and Spiking Speech Commands datasets. We found that learning depended strongly on the loss function, and we extended Eventprop to a wider class of loss functions to enable effective training. We then tested a large number of data augmentations and regularisations, and explored different network structures as well as heterogeneous and trainable timescales. We found that, when combined with two specific augmentations, the right regularisation and a delay line input, Eventprop networks with one recurrent layer achieved state-of-the-art performance on Spiking Heidelberg Digits and good accuracy on Spiking Speech Commands. Compared with a leading surrogate-gradient-based SNN training method, our GeNN Eventprop implementation is 3× faster and uses 4× less memory. This work is a significant step towards a low-power neuromorphic alternative to current machine learning paradigms.
