RTFormer: Re-parameter TSBN Spiking Transformer
Hongzhi Wang, Xiubo Liang, Mengjian Li, Tao Zhang
TL;DR
RTFormer addresses the challenge of achieving high accuracy in Spiking Neural Networks (SNNs) without sacrificing energy efficiency on neuromorphic hardware. It introduces a Spatial-Temporal Core that combines structurally reparameterized depthwise convolutions with Temporal Sliding Batch Normalization (TSBN), which is integrated into the neuron thresholds, yielding an energy-efficient Spiking Transformer. The approach delivers performance comparable to the state of the art on ImageNet and CIFAR-10/100 while reducing energy consumption, and it demonstrates strong results on the neuromorphic datasets CIFAR10-DVS and DVS128 Gesture, attributed to enhanced temporal processing. The work also provides a detailed energy analysis and ablation studies, underscoring the practical impact of TSBN and the reparameterized spatial blocks for real-world neuromorphic deployment.
Abstract
Spiking Neural Networks (SNNs), renowned for their bio-inspired operational mechanism and energy efficiency, mirror the human brain's neural activity. Yet SNNs face challenges in balancing energy efficiency with the computational demands of advanced tasks. Our research introduces RTFormer, a novel architecture that embeds Re-parameterized Temporal Sliding Batch Normalization (TSBN) within the Spiking Transformer framework. This design optimizes energy usage during inference while maintaining robust computational performance. The core of RTFormer is its integration of reparameterized convolutions and TSBN, which strikes a balance between computational capability and energy conservation.
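To make the idea of moving batch normalization into a neuron threshold concrete, the following is a minimal sketch and not the authors' formulation. It assumes a stateless neuron that fires when its normalized input reaches a threshold, a positive scale `gamma`, and fixed inference-time statistics. The temporal sliding window of TSBN is not modelled, and all function and parameter names are hypothetical.

```python
import math


def bn_then_fire(x, mean, var, gamma, beta, v_th, eps=1e-5):
    """Reference path: batch-normalize the input, then compare to the threshold."""
    y = gamma * (x - mean) / math.sqrt(var + eps) + beta
    return 1 if y >= v_th else 0


def fold_bn_into_threshold(mean, var, gamma, beta, v_th, eps=1e-5):
    """Return a threshold on the raw input that gives the same spike decision.

    Solving gamma * (x - mean) / sqrt(var + eps) + beta >= v_th for x gives
    x >= mean + (v_th - beta) * sqrt(var + eps) / gamma, valid only for gamma > 0
    (an assumption of this sketch).
    """
    if gamma <= 0:
        raise ValueError("this sketch assumes gamma > 0")
    return mean + (v_th - beta) * math.sqrt(var + eps) / gamma


def fire_folded(x, folded_th):
    """Folded path: no normalization at inference, only a comparison."""
    return 1 if x >= folded_th else 0


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    params = dict(mean=0.5, var=4.0, gamma=2.0, beta=0.1, v_th=1.0)
    folded = fold_bn_into_threshold(**params)
    for x in (-1.0, 1.3, 1.5, 3.0):
        assert bn_then_fire(x, **params) == fire_folded(x, folded)
```

In this simplified setting the folded path removes the multiply and divide of the normalization from inference, which is the kind of saving that motivates merging normalization into thresholds on neuromorphic hardware.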
