Shrinking Your TimeStep: Towards Low-Latency Neuromorphic Object Recognition with Spiking Neural Network
Yongqi Ding, Lin Zuo, Mengmeng Jing, Pei He, Yongjun Xiao
TL;DR
This work tackles the latency–accuracy tension in neuromorphic object recognition by introducing Shrinking SNN (SSNN), which splits the network into stages with progressively shrinking timesteps and uses a temporal transformer to preserve information as the temporal scale changes. During training only, it adds multiple early classifiers that provide immediate gradient feedback, mitigating the surrogate-gradient mismatch and gradient vanishing/exploding without increasing inference cost. Results on CIFAR10-DVS, N-Caltech101, and DVS-Gesture show substantial gains at low average timesteps (e.g., 5), including 73.63% on CIFAR10-DVS without augmentation and up to 90.74% on DVS-Gesture, outperforming several state-of-the-art approaches at similar latencies. The findings demonstrate that heterogeneous temporal scales are effective for building high-performance, low-latency SNNs and offer practical guidance for designing efficient neuromorphic recognition systems.
Abstract
Neuromorphic object recognition with spiking neural networks (SNNs) is the cornerstone of low-power neuromorphic computing. However, existing SNNs suffer from significant latency, requiring 10 to 40 or more timesteps to recognize neuromorphic objects. At low latencies, the performance of existing SNNs degrades drastically. In this work, we propose the Shrinking SNN (SSNN) to achieve low-latency neuromorphic object recognition without reducing performance. Concretely, we alleviate the temporal redundancy in SNNs by dividing them into multiple stages with progressively shrinking timesteps, which significantly reduces the inference latency. During timestep shrinkage, the temporal transformer smoothly transforms the temporal scale and preserves as much information as possible. Moreover, we add multiple early classifiers to the SNN during training to mitigate both the mismatch between the surrogate gradient and the true gradient and gradient vanishing/exploding, thus eliminating the performance degradation at low latency. Extensive experiments on the neuromorphic datasets CIFAR10-DVS, N-Caltech101, and DVS-Gesture show that SSNN improves the baseline accuracy by 6.55% to 21.41%. With only 5 average timesteps and without any data augmentation, SSNN achieves an accuracy of 73.63% on CIFAR10-DVS. This work presents a heterogeneous temporal scale SNN and provides valuable insights into the development of high-performance, low-latency SNNs.
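
The stage-wise shrinking of timesteps described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the LIF dynamics, the stage schedule of 8, 5, and 2 timesteps, the layer sizes, and the function names are all assumptions chosen for illustration, and `shrink_timesteps` replaces the paper's temporal transformer with plain window averaging as a hypothetical stand-in. Reading "5 average timesteps" as the unweighted mean of the schedule is likewise an assumption.

```python
import numpy as np


def lif_forward(currents, tau=2.0, v_th=1.0):
    """Leaky integrate-and-fire layer over the leading time axis.

    currents: array of shape (T, N), input current per timestep.
    Returns a binary spike array of the same shape.
    """
    v = np.zeros(currents.shape[1])
    spikes = np.zeros_like(currents)
    for t in range(currents.shape[0]):
        v = v + (currents[t] - v) / tau        # leaky integration
        spikes[t] = (v >= v_th).astype(currents.dtype)
        v = np.where(spikes[t] > 0, 0.0, v)    # hard reset after a spike
    return spikes


def shrink_timesteps(x, t_out):
    """Hypothetical stand-in for the temporal transformer.

    Reduces the time axis from T_in to t_out by averaging contiguous
    windows of timesteps. The paper's module is not reproduced here.
    """
    if t_out > x.shape[0]:
        raise ValueError("t_out must not exceed the current number of timesteps")
    windows = np.array_split(np.arange(x.shape[0]), t_out)
    return np.stack([x[idx].mean(axis=0) for idx in windows])


def staged_forward(x, weights, schedule):
    """Run one linear + LIF layer per stage, shrinking time between stages."""
    stage_spikes = []
    for w, t_stage in zip(weights, schedule):
        if x.shape[0] != t_stage:
            x = shrink_timesteps(x, t_stage)
        x = lif_forward(x @ w)
        stage_spikes.append(x)
    return stage_spikes


# Assumed schedule and layer sizes, for illustration only.
rng = np.random.default_rng(0)
schedule = [8, 5, 2]
weights = [
    rng.standard_normal((16, 32)),
    rng.standard_normal((32, 32)),
    rng.standard_normal((32, 10)),
]
inputs = rng.random((schedule[0], 16))      # (T, features)

stage_spikes = staged_forward(inputs, weights, schedule)
average_timestep = float(np.mean(schedule))
rate_output = stage_spikes[-1].mean(axis=0)  # firing rate per output neuron
```

Each later stage processes fewer timesteps than the one before it, which is where the latency saving comes from. In the training setup the abstract describes, each intermediate entry of `stage_spikes` would additionally feed an early classifier; that part is omitted from this sketch.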
