High-Performance Temporal Reversible Spiking Neural Networks with $O(L)$ Training Memory and $O(1)$ Inference Cost
JiaKui Hu, Man Yao, Xuerui Qiu, Yuhong Chou, Yuxuan Cai, Ning Qiao, Yonghong Tian, Bo XU, Guoqi Li
TL;DR
The paper tackles the memory and energy bottlenecks of multi-timestep spiking neural networks by introducing Temporal Reversible SNNs (T-RevSNN). By turning off temporal dynamics for most neurons and enabling reversible temporal transfer only at key spike layers, T-RevSNN achieves $O(L)$ training memory and $O(1)$ inference cost, while maintaining strong accuracy on ImageNet-1k and neuromorphic datasets. The approach combines multi-level temporal-reversible forward information transfer, input encoding grouping, and ConvNeXt-style SNN blocks with a ReZero-enhanced residual design. Compared with state-of-the-art CNN-based SNNs and Transformer-based baselines, T-RevSNN offers significant improvements in training memory (up to $8.6 \times$), training time (up to $2.0 \times$), and inference energy (up to $1.6 \times$), making large-scale, energy-efficient SNNs more practical.
Abstract
Multi-timestep simulation of brain-inspired Spiking Neural Networks (SNNs) boosts memory requirements during training and increases inference energy cost. Current training methods cannot simultaneously solve both the training and inference dilemmas. This work proposes a novel Temporal Reversible architecture for SNNs (T-RevSNN) to jointly address the training and inference challenges by altering the forward propagation of SNNs. We turn off the temporal dynamics of most spiking neurons and design multi-level temporal reversible interactions at temporal turn-on spiking neurons, resulting in an $O(L)$ training memory. Combined with the temporal reversible nature, we redesign the input encoding and network organization of SNNs to achieve $O(1)$ inference energy cost. Then, we finely adjust the internal units and residual connections of the basic SNN block to ensure the effectiveness of sparse temporal information interaction. T-RevSNN achieves excellent accuracy on ImageNet, while memory efficiency, training speed, and inference energy efficiency are improved by $8.6 \times$, $2.0 \times$, and $1.6 \times$, respectively. This work is expected to break the technical bottleneck of significantly increasing memory cost and training time for large-scale SNNs while maintaining high performance and low inference energy cost. Source code and models are available at: https://github.com/BICLab/T-RevSNN.
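To make the $O(L)$ training-memory claim concrete, the following minimal sketch counts how many per-layer activation tensors must be kept for the backward pass under two simplified schemes. It assumes the usual baseline in which backpropagation through time stores the state of each of the $L$ layers at each of the $T$ timesteps. The function name `stored_activations`, the scheme labels, and the counting model itself are illustrative assumptions, not taken from the paper or its code, and the sketch does not reproduce the measured $8.6 \times$ figure.

```python
def stored_activations(num_layers: int, num_timesteps: int, scheme: str) -> int:
    """Count activation tensors kept alive for the backward pass.

    This is a toy counting model (an assumption for illustration),
    not a measurement of any real network.
    """
    if num_layers <= 0 or num_timesteps <= 0:
        raise ValueError("num_layers and num_timesteps must be positive")
    if scheme == "bptt":
        # Assumed baseline: every layer's state is stored at every timestep,
        # so memory grows as O(L * T).
        return num_layers * num_timesteps
    if scheme == "per_timestep":
        # Assumed idealization of an O(L) scheme: only one timestep's
        # activations are held at a time, independent of T.
        return num_layers
    raise ValueError(f"unknown scheme: {scheme!r}")


if __name__ == "__main__":
    layers, timesteps = 18, 4  # hypothetical sizes, chosen only for the demo
    for scheme in ("bptt", "per_timestep"):
        print(scheme, stored_activations(layers, timesteps, scheme))
```

Under this toy model the gap between the two schemes grows linearly with the number of timesteps, which is why removing the dependence on $T$ matters most for long simulations.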
