BKDSNN: Enhancing the Performance of Learning-based Spiking Neural Networks Training with Blurred Knowledge Distillation
Zekai Xu, Kang You, Qinghai Guo, Xiang Wang, Zhezhi He
TL;DR
BKDSNN addresses the accuracy gap in learning-based SNNs by introducing blurred knowledge distillation (BKD), in which a restoration block maps randomly blurred SNN features back toward the corresponding ANN features. BKD is applied to the intermediate feature map right before the last layer and can be combined with logits-based distillation in a mixed-distillation regime, yielding state-of-the-art results among learning-based methods on CIFAR10/100 and ImageNet for both CNN- and Transformer-based SNNs, as well as on neuromorphic data such as CIFAR10-DVS. The approach improves feature alignment and gradient estimation, enabling strong performance at ultra-low time-step counts and offering favorable energy-accuracy trade-offs. The results suggest BKDSNN is a practical and scalable path to narrowing the gap between SNNs and ANNs in real-world vision tasks.
Abstract
Spiking neural networks (SNNs), which mimic biological neural systems by conveying information via discrete spikes, are well known as brain-inspired models with excellent computing efficiency. By using surrogate gradient estimation for discrete spikes, learning-based SNN training methods that achieve ultra-low inference latency (number of time-steps) have recently emerged. Nevertheless, because it is difficult to derive precise gradient estimates for discrete spikes with learning-based methods, a distinct accuracy gap persists between SNNs and their artificial neural network (ANN) counterparts. To address this issue, we propose a blurred knowledge distillation (BKD) technique, which leverages randomly blurred SNN features to restore and imitate the ANN features. Note that our BKD is applied to the feature map right before the last layer of the SNN, and can also be mixed with prior logits-based knowledge distillation for a maximized accuracy boost. To the best of our knowledge, in the category of learning-based methods, our work achieves state-of-the-art performance for training SNNs on both static and neuromorphic datasets. On the ImageNet dataset, BKDSNN outperforms the prior best results by 4.51% and 0.93% with CNN and Transformer network topologies, respectively.
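
As an illustration only, the NumPy sketch below shows one way the distillation terms described in the abstract could be computed: the SNN feature is randomly blurred, passed through a restoration step, and compared with the ANN feature, and the result is mixed with a logits-based term. Everything specific here is an assumption made for the sketch rather than a detail taken from the paper: the function names, the single linear map standing in for the restoration block, the mean-squared-error objective, the KL-divergence logits term, the temperature, and the weights `alpha` and `beta`.

```python
import numpy as np


def blur_mask(shape, blur_ratio, rng):
    """Random binary mask: 0 blurs (drops) a position, 1 keeps it."""
    return (rng.random(shape) >= blur_ratio).astype(np.float64)


def restore(blurred_feat, weight):
    """Hypothetical restoration block: one linear map over channels."""
    return blurred_feat @ weight


def bkd_loss(snn_feat, ann_feat, weight, blur_ratio, rng):
    """MSE between the restored blurred SNN feature and the ANN feature.

    snn_feat, ann_feat: arrays of shape (positions, channels), assumed to be
    taken right before the last layer of each network.
    """
    mask = blur_mask(snn_feat.shape, blur_ratio, rng)
    restored = restore(snn_feat * mask, weight)
    return float(np.mean((restored - ann_feat) ** 2))


def softmax(logits, temperature):
    """Numerically stable softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def logits_kd_loss(snn_logits, ann_logits, temperature=1.0):
    """KL(teacher || student) averaged over the batch (assumed form)."""
    p = softmax(ann_logits, temperature)
    q = softmax(snn_logits, temperature)
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1)))


def mixed_loss(task_loss, feat_loss, logit_loss, alpha=1.0, beta=1.0):
    """Weighted sum of task, feature (BKD), and logits terms.

    alpha and beta are hypothetical weights chosen for this sketch.
    """
    return task_loss + alpha * feat_loss + beta * logit_loss


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    snn_feat = rng.normal(size=(4, 8))       # stand-in SNN feature
    ann_feat = rng.normal(size=(4, 8))       # stand-in ANN feature
    weight = np.eye(8)                       # untrained restoration map
    snn_logits = rng.normal(size=(4, 10))
    ann_logits = rng.normal(size=(4, 10))

    f = bkd_loss(snn_feat, ann_feat, weight, blur_ratio=0.15, rng=rng)
    k = logits_kd_loss(snn_logits, ann_logits, temperature=2.0)
    print(mixed_loss(task_loss=0.0, feat_loss=f, logit_loss=k))
```

In a real training loop the restoration weights would be learned jointly with the SNN and the gradients would come from an autodiff framework. The sketch only makes the data flow concrete: blur the SNN feature, restore it, compare against the ANN feature, and mix with the logits term.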
