Enhanced Self-Distillation Framework for Efficient Spiking Neural Network Training
Xiaochen Zhao, Chengting Yu, Kairong Yu, Lei Liu, Aili Wang
TL;DR
The paper tackles the challenge of training high-performance Spiking Neural Networks (SNNs) under limited compute by introducing a rate-based training framework augmented with lightweight auxiliary ANN branches. It further proposes reliability-separated self-distillation, which uses only the trustworthy teacher signals from the multiple branches and thereby mitigates negative transfer from unreliable predictions. Experiments on CIFAR-10, CIFAR-100, ImageNet, and CIFAR10-DVS show substantial reductions in training memory and time at competitive accuracy, narrowing the gap between efficient rate-based methods and BPTT-based direct training. The approach targets energy-efficient SNN deployment, and the code is released as open source for reproducibility.
Abstract
Spiking Neural Networks (SNNs) exhibit exceptional energy efficiency on neuromorphic hardware due to their sparse activation patterns. However, conventional training methods based on surrogate gradients and Backpropagation Through Time (BPTT) not only lag behind Artificial Neural Networks (ANNs) in performance, but also incur significant computational and memory overheads that grow linearly with the temporal dimension. To enable high-performance SNN training under limited computational resources, we propose an enhanced self-distillation framework, jointly optimized with rate-based backpropagation. Specifically, the firing rates of intermediate SNN layers are projected onto lightweight ANN branches, and high-quality knowledge generated by the model itself is used to optimize substructures through the ANN pathways. Unlike traditional self-distillation paradigms, we observe that low-quality self-generated knowledge may hinder convergence. To address this, we decouple the teacher signal into reliable and unreliable components, ensuring that only reliable knowledge is used to guide the optimization of the model. Extensive experiments on CIFAR-10, CIFAR-100, CIFAR10-DVS, and ImageNet demonstrate that our method reduces training complexity while achieving high-performance SNN training. Our code is available at https://github.com/Intelli-Chip-Lab/enhanced-self-distillation-framework-for-snn.
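The two ideas in the abstract, projecting firing rates onto an ANN branch and distilling only from reliable teacher signals, can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The abstract does not say how reliability is decided, so the criterion used here (the teacher's top-1 prediction matches the ground-truth label) is an assumption, and all function names and the temperature value are hypothetical.

```python
import math


def firing_rate(spike_train):
    """Average a [T][N] binary spike train over T timesteps, giving N rates."""
    timesteps = len(spike_train)
    neurons = len(spike_train[0])
    return [sum(step[i] for step in spike_train) / timesteps for i in range(neurons)]


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a distillation temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def is_reliable(teacher_logits, label):
    # ASSUMPTION: a teacher signal counts as reliable when its top-1 class
    # equals the ground-truth label. The paper may use a different rule.
    top1 = max(range(len(teacher_logits)), key=teacher_logits.__getitem__)
    return top1 == label


def reliable_distill_loss(student_logits, teacher_logits, labels, temperature=2.0):
    """Mean KL(teacher || student) over samples whose teacher is reliable.

    Returns (loss, kept), where kept is the number of samples that passed
    the reliability check. Unreliable samples contribute nothing.
    """
    total, kept = 0.0, 0
    for s, t, y in zip(student_logits, teacher_logits, labels):
        if not is_reliable(t, y):
            continue  # drop the unreliable teacher signal for this sample
        total += kl_divergence(softmax(t, temperature), softmax(s, temperature))
        kept += 1
    return (total / kept if kept else 0.0), kept


if __name__ == "__main__":
    # Four timesteps, two neurons: rates that would feed an ANN branch.
    print(firing_rate([[1, 0], [1, 1], [0, 0], [1, 0]]))
    # The second teacher predicts class 1 but the label is 0, so it is skipped.
    teachers = [[2.0, 0.0], [0.0, 3.0]]
    students = [[1.0, 0.5], [5.0, -5.0]]
    print(reliable_distill_loss(students, teachers, labels=[0, 0]))
```

In a real training loop the rates would come from intermediate SNN layers and the losses would be differentiable tensors; the sketch only shows the masking logic that keeps unreliable teacher predictions out of the distillation term.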
