HM-DF SNN: Transcending Conventional Online Learning with Advanced Training and Deployment

Zecheng Hao, Yifan Huang, Zijie Xu, Wenxuan Liu, Yuanhong Tang, Zhaofei Yu, Tiejun Huang

TL;DR

This paper introduces Hybrid Mechanism-Driven Firing (HM-DF), a family of online-training-friendly spiking neural networks that separate temporal gradients across firing-threshold regions to resolve forward–backward discrepancies and surrogate-gradient mismatch. By incorporating Precise-Positioning Reset, membrane-potential batch normalization, and a lightweight channel-attention module (SECA), HM-DF achieves gradient alignment and enables full-stage optimization of training and inference with minimal power overhead. The framework combines training-time acceleration (Random Backprop) and inference-time parallelism with ultra-low-bit synapses (1-bit/1.5-bit) to compress memory without sacrificing accuracy, and it reports state-of-the-art performance on CIFAR-10/100, ImageNet-200/1k, and DVS-CIFAR10. SECA further improves accuracy at a tiny parameter overhead, yielding gains of up to ~1.80 percentage points on neuromorphic data. Overall, HM-DF offers a practical, efficient path to high-accuracy online learning and deployment of SNNs in real-world settings.
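
The summary names 1-bit and 1.5-bit synapses without defining them. The following is a minimal sketch, assuming that 1-bit means sign-only weights and that 1.5-bit means ternary weights (three levels, roughly log2(3) ≈ 1.58 bits). The function names `binarize` and `ternarize`, the mean-magnitude scale, and the 0.7 threshold ratio are common heuristics from the weight-quantization literature, chosen for illustration; they are not taken from the paper.

```python
def binarize(weights):
    """1-bit sketch: keep only the sign, scaled by the mean magnitude.

    Assumption: this scaling rule is illustrative, not the paper's scheme.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    return [scale if w >= 0 else -scale for w in weights], scale


def ternarize(weights, ratio=0.7):
    """Ternary sketch: map each weight to one of {-scale, 0, +scale}.

    Weights with magnitude at or below ratio * mean(|w|) become zero;
    the rest share one scale (the mean magnitude of the kept weights).
    Assumption: ratio=0.7 is a commonly used heuristic, not from the paper.
    """
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    delta = ratio * mean_abs
    kept = [abs(w) for w in weights if abs(w) > delta]
    scale = sum(kept) / len(kept) if kept else 0.0
    quantized = [0.0 if abs(w) <= delta else (scale if w > 0 else -scale)
                 for w in weights]
    return quantized, scale


weights = [0.5, -0.125, 0.25, -1.0]
print(binarize(weights)[0])   # [0.46875, -0.46875, 0.46875, -0.46875]
print(ternarize(weights)[0])  # [0.75, 0.0, 0.0, -0.75]
```

In both cases each weight can be stored as a one- or two-bit code plus a single shared scale per tensor, which is where the memory compression would come from.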

Abstract

Spiking Neural Networks (SNNs) are considered to have enormous potential in the future development of Artificial Intelligence due to their brain-inspired and energy-efficient properties. Compared to vanilla Spatial-Temporal Back-propagation (STBP) training methods, online training can effectively overcome the risk of GPU memory explosion. However, current online learning frameworks cannot tackle the inseparability problem of temporally dependent gradients and merely aim to optimize training memory, so they offer no performance advantage over STBP-trained models in the inference phase. To address these challenges, we propose the Hybrid Mechanism-Driven Firing (HM-DF) model, a family of advanced models that adopt different spiking calculation schemes in the upper and lower regions of the firing threshold. We point out that the HM-DF model can effectively separate temporal gradients and tackle the mismatch problem of surrogate gradients, while also achieving full-stage optimization of computation speed and memory footprint. Experimental results demonstrate that the HM-DF model can be flexibly combined with various techniques to achieve state-of-the-art performance in the field of online learning without incurring additional power consumption.
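
To make the memory argument concrete, here is a minimal sketch of vanilla online training for a single leaky integrate-and-fire (LIF) neuron. It is not the HM-DF model. The charging rule `v = lam * v + w * x`, the hard reset to zero, the rectangular surrogate gradient, and the per-step squared-error loss are all assumptions made for illustration, and every name is hypothetical.

```python
def surrogate_grad(v, theta, width=1.0):
    """Rectangular surrogate for d(spike)/d(v): 1/width near the threshold."""
    return 1.0 / width if abs(v - theta) < width / 2 else 0.0


def online_lif_grad(inputs, targets, w, lam=0.5, theta=1.0):
    """Run one LIF neuron over time and accumulate dL/dw step by step.

    Only the current membrane potential is stored, so memory does not grow
    with the number of time steps. The temporal gradient path
    (v[t] -> v[t+1]) is dropped, which is the simplification this sketch
    uses to stand in for vanilla online training.
    """
    v = 0.0
    grad_w = 0.0
    spikes = []
    for x, y in zip(inputs, targets):
        v = lam * v + w * x                # charge (assumed LIF dynamics)
        s = 1.0 if v >= theta else 0.0     # fire
        # Per-step loss L_t = 0.5 * (s - y)^2, so dL_t/ds = s - y.
        grad_w += (s - y) * surrogate_grad(v, theta) * x
        spikes.append(s)
        v = v * (1.0 - s)                  # hard reset to 0 after a spike
    return spikes, grad_w


spikes, grad = online_lif_grad([1.0, 1.0, 1.0], [1, 1, 1], w=0.8)
print(spikes, grad)  # [0.0, 1.0, 0.0] -2.0
```

STBP would instead keep every `v[t]` and back-propagate through the `lam * v` term across time steps. The sketch discards exactly those cross-time terms, which is the kind of forward–backward inconsistency that the abstract says HM-DF is designed to remove.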

Paper Structure

This paper contains 17 sections, 2 theorems, 11 equations, 3 figures, 5 tables.

Key Result

Theorem 4.2

In the following two cases, the back-propagation of HM-DF model satisfies the condition of Separable Backward Gradient and $\forall i>t, \bm{\epsilon}^l[i,t] = \bm{0}$: (i) $\mathbf{I}_1^l\geq \theta_1^l-\lambda_1^l\mathbf{v}_0^l$; $\forall t\geq 2, \mathbf{I}_{t}^l\geq \theta_t^l-\lambda_t^l\theta^

Figures (3)

  • Figure 1: Various training frameworks for SNNs in synaptic and neuron layers. (a): STBP training, (b): vanilla online training based on LIF model, (c)-(e): online training based on HM-DF model.
  • Figure 2: Various versions and blocks of HM-DF model. (a): parallel computation, (b): learnable membrane parameters, (c): membrane potential batch-normalization, (d): the model after re-parameterization in the inference stage, (e): vanilla residual block, (f): parallel acceleration block, (g)-(h): blocks based on SECA.
  • Figure 3: Comparison of accuracy and power consumption for different online learning frameworks in the inference phase. Here Case 1 denotes vanilla online learning, and Cases 2-3 represent our schemes with 1-bit and 1.5-bit synaptic weights, respectively.

Theorems & Definitions (4)

  • Definition 4.1
  • Theorem 4.2
  • Corollary 4.3
  • proof