Stabilizing Direct Training of Spiking Neural Networks: Membrane Potential Initialization and Threshold-robust Surrogate Gradient

Hyunho Kook, Byeongho Yu, Jeong Min Oh, Eunhyeok Park

TL;DR

This work tackles two fundamental bottlenecks in direct training of Spiking Neural Networks: temporal covariate shift in membrane dynamics and unstable gradient flow when the neuron threshold is learnable. It introduces MP-Init, which aligns each layer's initial membrane potential with the stationary distribution via a running-mean initialization, and TrSG, a threshold-robust surrogate gradient that preserves stable gradient flow across varying $V_{\text{thr}}$ by combining a relative-scale argument with forward-threshold scaling. A Doeblin-based convergence result and supporting lemmas underpin MP-Init, while extensive experiments on CIFAR-10/100, ImageNet, and DVS-CIFAR10 demonstrate state-of-the-art accuracy with minimal overhead, including effective extension to Transformer-based SNNs. The methods yield stable, energy-efficient training and inference across static and event-based datasets, enabling practical deployment of high-performance SNNs on neuromorphic hardware.

Abstract

Recent advancements in the direct training of Spiking Neural Networks (SNNs) have demonstrated high-quality outputs even at early timesteps, paving the way for novel energy-efficient AI paradigms. However, the inherent non-linearity and temporal dependencies in SNNs introduce persistent challenges, such as temporal covariate shift (TCS) and unstable gradient flow with learnable neuron thresholds. In this paper, we present two key innovations: MP-Init (Membrane Potential Initialization) and TrSG (Threshold-robust Surrogate Gradient). MP-Init addresses TCS by aligning the initial membrane potential with its stationary distribution, while TrSG stabilizes gradient flow with respect to threshold voltage during training. Extensive experiments validate our approach, achieving state-of-the-art accuracy on both static and dynamic image datasets. The code is available at: https://github.com/kookhh0827/SNN-MP-Init-TRSG
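
To make the MP-Init idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy leaky integrate-and-fire neuron with hard reset, a uniform random input, and a single scalar running mean of the final-timestep membrane potential that is reused as the initial potential. The names `simulate` and `RunningMeanInit`, the momentum value, and the input range are all hypothetical choices for illustration.

```python
import random


def simulate(u0, steps, n_neurons, rng, v_thr=1.0, tau=0.5):
    """Run independent toy LIF neurons (hard reset) starting from potential u0.

    Returns (mean potential per timestep, final potentials).
    """
    u = [u0] * n_neurons
    means = []
    for _ in range(steps):
        nxt = []
        for v in u:
            v = tau * v + rng.uniform(0.0, 0.6)   # leak, then integrate a toy input
            nxt.append(0.0 if v >= v_thr else v)  # fire and hard-reset to zero
        u = nxt
        means.append(sum(u) / n_neurons)
    return means, u


class RunningMeanInit:
    """Assumed form of a running-mean initializer: an exponential moving
    average of the membrane potential observed at the last timestep."""

    def __init__(self, momentum=0.9):
        self.momentum = momentum
        self.mean = 0.0

    def update(self, potentials):
        batch_mean = sum(potentials) / len(potentials)
        self.mean = self.momentum * self.mean + (1.0 - self.momentum) * batch_mean


rng = random.Random(0)
mp_init = RunningMeanInit()

# Stand-in for training iterations: start from the current estimate,
# then refresh the estimate from the final-timestep potentials.
for _ in range(50):
    _, u_final = simulate(mp_init.mean, steps=8, n_neurons=500, rng=rng)
    mp_init.update(u_final)

# Compare how far the first timestep sits from the late-timestep mean.
zero_means, _ = simulate(0.0, steps=8, n_neurons=2000, rng=rng)
mp_means, _ = simulate(mp_init.mean, steps=8, n_neurons=2000, rng=rng)
gap_zero = abs(zero_means[0] - zero_means[-1])
gap_mp = abs(mp_means[0] - mp_means[-1])
print(f"first-vs-last mean gap, zero init: {gap_zero:.3f}")
print(f"first-vs-last mean gap, running-mean init: {gap_mp:.3f}")
```

In this toy setting the zero-initialized run starts well below the potential level it later settles at, whereas the run started from the running mean begins close to that level. This is the kind of early-timestep mismatch that the abstract calls temporal covariate shift.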

Paper Structure

This paper contains 77 sections, 35 equations, 19 figures, 13 tables, 1 algorithm.

Figures (19)

  • Figure 1: Membrane potential distribution of the third layer's first spiking layer of ResNet-19 on CIFAR-100
  • Figure 2: Activation distribution of the convolutional layer after the third layer's first spiking layer of ResNet-19 on CIFAR-100
  • Figure 3: AS-SG vs. RS-SG vs. TrSG with a rectangular surrogate gradient ($\gamma=1$). The curve depicts the membrane potential distribution, while the colored boxes represent the surrogate gradient region: the horizontal span indicates the active gradient window, and the vertical extent indicates its magnitude. Panels (a) and (b) compare the behaviors when the threshold is small ($V_{\text{thr}}^{l}\!\ll\!1$) and large ($V_{\text{thr}}^{l}\!\gg\!1$), respectively. AS-SG uses a fixed window of width $\gamma$, which leads to gradient flood or starvation. RS-SG normalizes the window by $V_{\text{thr}}^{l}$ but scales the magnitude as $1/V_{\text{thr}}^{l}$, causing explosion or vanishing. TrSG multiplies the threshold forward during training, canceling the $1/V_{\text{thr}}^{l}$ factor and keeping both window and magnitude balanced across thresholds.
  • Figure 4: PCA visualization of final logits w/o MP-Init.
  • Figure 5: PCA visualization of final logits with MP-Init.
  • ...and 14 more figures
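
The three surrogate-gradient variants described in the Figure 3 caption can be sketched as follows. This is an illustrative reading of the caption, not the paper's code: it assumes a rectangular surrogate of height $1/\gamma$ on a window of width $\gamma$, and the function names `rect`, `as_sg`, `rs_sg`, and `tr_sg` are hypothetical. Each function returns the surrogate derivative of the neuron output with respect to the membrane potential `u`.

```python
def rect(x, gamma=1.0):
    """Rectangular surrogate for the Heaviside derivative:
    height 1/gamma inside |x| < gamma/2, zero outside (assumed form)."""
    return 1.0 / gamma if abs(x) < gamma / 2.0 else 0.0


def as_sg(u, v_thr, gamma=1.0):
    # Absolute scale: spike = H(u - v_thr); the window has fixed width gamma in u.
    return rect(u - v_thr, gamma)


def rs_sg(u, v_thr, gamma=1.0):
    # Relative scale: spike = H(u / v_thr - 1); the chain rule adds a 1/v_thr factor.
    return rect(u / v_thr - 1.0, gamma) / v_thr


def tr_sg(u, v_thr, gamma=1.0):
    # Threshold multiplied forward: output = v_thr * H(u / v_thr - 1);
    # the forward scale cancels the 1/v_thr factor from the chain rule.
    return v_thr * rect(u / v_thr - 1.0, gamma) / v_thr


# Evaluate each variant at the threshold and at 1.3x the threshold,
# for one small and one large threshold.
for v_thr in (0.1, 10.0):
    for u in (v_thr, 1.3 * v_thr):
        print(v_thr, u, as_sg(u, v_thr), rs_sg(u, v_thr), tr_sg(u, v_thr))
```

Under these assumptions, at a small threshold the absolute-scale window still covers potentials several multiples of the threshold away, and at a large threshold it misses potentials only 30% above it. The relative-scale variant fixes the window but its magnitude grows or shrinks as $1/V_{\text{thr}}^{l}$, while the forward-scaled variant keeps the magnitude at $1/\gamma$ for both thresholds.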