HM-DF SNN: Transcending Conventional Online Learning with Advanced Training and Deployment
Zecheng Hao, Yifan Huang, Zijie Xu, Wenxuan Liu, Yuanhong Tang, Zhaofei Yu, Tiejun Huang
TL;DR
This paper introduces Hybrid Mechanism-Driven Firing (HM-DF), a family of online-training-friendly spiking neural networks that separate temporal gradients across firing-threshold regions to resolve forward–backward discrepancies and surrogate-gradient mismatch. By incorporating Precise-Positioning Reset, membrane-potential batch normalization, and a lightweight channel-attention module (SECA), HM-DF achieves gradient alignment and enables full-stage optimization of both training and inference with minimal power overhead. The framework combines training-time acceleration (Random Backprop) and inference-time parallelism with ultra-low-bit synapses (1-bit/1.5-bit) to compress memory without sacrificing accuracy, and reports state-of-the-art performance on CIFAR-10/100, ImageNet-200/1k, and DVS-CIFAR10. SECA further improves accuracy at a small parameter cost, with gains of up to ~1.80 percentage points on neuromorphic data. Overall, HM-DF offers a practical, efficient path to high-accuracy online learning and deployment of SNNs in real-world settings.
Abstract
Spiking Neural Networks (SNNs) are considered to have enormous potential for the future development of Artificial Intelligence because of their brain-inspired and energy-efficient properties. Compared with vanilla Spatial-Temporal Back-propagation (STBP) training, online training effectively avoids the risk of GPU memory explosion. However, current online learning frameworks cannot resolve the inseparability of temporally dependent gradients and aim only to reduce training memory, so they offer no performance advantage over STBP-trained models in the inference phase. To address these challenges, we propose the Hybrid Mechanism-Driven Firing (HM-DF) model, a family of advanced models that adopt different spiking calculation schemes in the regions above and below the firing threshold. We show that the HM-DF model effectively separates temporal gradients and resolves the mismatch problem of surrogate gradients, while achieving full-stage optimization of computation speed and memory footprint. Experimental results demonstrate that the HM-DF model can be flexibly combined with various techniques to achieve state-of-the-art performance in online learning without incurring additional power consumption.
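The contrast between STBP and online training described in the abstract can be illustrated with a minimal sketch. It is not the HM-DF model: it assumes a single generic leaky integrate-and-fire neuron with a hard reset, a triangular surrogate gradient, and a squared-error loss on the spikes, and every name in it (surrogate, forward_step, online_grad, stbp_grad) is hypothetical. The STBP-style routine stores the state of every time step and back-propagates through the leak and reset paths, while the online-style routine keeps only the current state and drops those temporal paths.

```python
def surrogate(v, theta):
    # Triangular surrogate for d(spike)/d(v); an illustrative choice only.
    return max(0.0, 1.0 - abs(v - theta))


def forward_step(v_prev, s_prev, x, w, lam, theta):
    # Generic LIF with hard reset (an assumption, not the HM-DF neuron):
    # v[t] = lam * v[t-1] * (1 - s[t-1]) + w * x[t],  s[t] = H(v[t] - theta)
    v = lam * v_prev * (1.0 - s_prev) + w * x
    s = 1.0 if v >= theta else 0.0
    return v, s


def online_grad(xs, ys, w, lam=0.5, theta=1.0):
    # Online-style gradient: only the current state is kept (constant memory
    # in the number of time steps) and the temporal paths are ignored.
    v, s, grad = 0.0, 0.0, 0.0
    for x, y in zip(xs, ys):
        v, s = forward_step(v, s, x, w, lam, theta)
        grad += (s - y) * surrogate(v, theta) * x
    return grad


def stbp_grad(xs, ys, w, lam=0.5, theta=1.0):
    # STBP-style gradient: every v[t] and s[t] is stored (memory grows with T),
    # then the loss is back-propagated through time.
    vs, ss = [], []
    v, s = 0.0, 0.0
    for x in xs:
        v, s = forward_step(v, s, x, w, lam, theta)
        vs.append(v)
        ss.append(s)
    grad, dv_next = 0.0, 0.0
    last = len(xs) - 1
    for t in range(last, -1, -1):
        ds = ss[t] - ys[t]  # direct loss term dL/ds[t]
        if t < last:
            ds += dv_next * (-lam * vs[t])  # path through the reset into v[t+1]
        dv = ds * surrogate(vs[t], theta)
        if t < last:
            dv += dv_next * lam * (1.0 - ss[t])  # path through the leak into v[t+1]
        grad += dv * xs[t]
        dv_next = dv
    return grad


if __name__ == "__main__":
    xs, ys = [1.0, 1.0, 1.0], [1.0, 0.0, 1.0]
    print("online:", online_grad(xs, ys, w=0.8))
    print("stbp:  ", stbp_grad(xs, ys, w=0.8))
```

With lam = 0 the neuron has no temporal dependence and the two routines return the same gradient; with lam > 0 they generally differ, which is one simple way to see why the temporal dependence of gradients matters for online training.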
