To Spike or Not to Spike, that is the Question

Sanaz Mahmoodi Takaghaj, Jack Sampson

TL;DR

This work tackles the challenge of training spiking neural networks (SNNs) where spike generation is non-differentiable and highly sensitive to the neuron threshold. It introduces Rouser, an approach that treats neuron thresholds $Th^{l}_{i}$ as trainable parameters, optimizing them jointly with synaptic weights through spatiotemporal backpropagation with surrogate gradients. Empirical results on NMNIST, DVS128, and SHD show up to $30\%$ faster convergence in epochs and up to $2\%$ improvements in accuracy, along with a reduction in dead neurons and more robust learning dynamics. The method promises more reliable and efficient SNN training across neuromorphic platforms, enabling better real-time, event-driven processing.

Abstract

Neuromorphic computing has recently gained momentum with the emergence of various neuromorphic processors. As the field advances, there is an increasing focus on developing training methods that can effectively leverage the unique properties of spiking neural networks (SNNs). SNNs emulate the temporal dynamics of biological neurons, making them particularly well-suited for real-time, event-driven processing. To fully harness the potential of SNNs across different neuromorphic platforms, effective training methodologies are essential. In SNNs, learning rules are based on neurons' spiking behavior, that is, if and when spikes are generated due to a neuron's membrane potential exceeding that neuron's spiking threshold, and this spike timing encodes vital information. However, the threshold is generally treated as a hyperparameter, and incorrect selection can lead to neurons that do not spike for large portions of the training process, hindering the effective rate of learning. This work focuses on the significance of learning neuron thresholds alongside weights in SNNs. Our results suggest that promoting threshold from a hyperparameter to a trainable parameter effectively addresses the issue of dead neurons during training. This leads to a more robust training algorithm, resulting in improved convergence, increased test accuracy, and a substantial reduction in the number of training epochs required to achieve viable accuracy on spatiotemporal datasets such as NMNIST, DVS128, and Spiking Heidelberg Digits (SHD), with up to 30% training speed-up and up to 2% higher accuracy on these datasets.
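The idea of promoting the threshold to a trainable parameter can be illustrated with a minimal sketch. It is not the authors' implementation: the single leaky integrate-and-fire neuron, the fast-sigmoid surrogate, the hard reset, the spike-count loss, and the names `surrogate_grad`, `run_neuron`, and `sgd_step` are all assumptions made for illustration, and the gradient path through the reset is ignored. The sketch shows that a neuron which never spikes still receives a nonzero surrogate gradient with respect to its threshold, so a gradient step can lower the threshold and move the neuron toward firing.

```python
def surrogate_grad(v, slope=5.0):
    """Fast-sigmoid surrogate for d(spike)/d(v), with v = u - threshold.

    The choice of surrogate is an assumption; the true derivative of the
    spike function is zero almost everywhere.
    """
    return 1.0 / (1.0 + slope * abs(v)) ** 2


def run_neuron(inputs, weight, threshold, decay=0.9):
    """Simulate one LIF neuron over a sequence of input values.

    Returns (spike_count, d_count/d_weight, d_count/d_threshold), where the
    derivatives use the surrogate and treat the reset as a constant.
    """
    u = 0.0        # membrane potential
    du_dw = 0.0    # running derivative of u with respect to the weight
    spikes = 0
    g_w = 0.0
    g_th = 0.0
    for x in inputs:
        u = decay * u + weight * x
        du_dw = decay * du_dw + x
        sg = surrogate_grad(u - threshold)
        g_w += sg * du_dw
        g_th += -sg  # raising the threshold makes a spike less likely
        if u >= threshold:
            spikes += 1
            u = 0.0      # hard reset after a spike (assumed reset rule)
            du_dw = 0.0
    return spikes, g_w, g_th


def sgd_step(inputs, weight, threshold, target, lr=0.05, min_threshold=0.05):
    """One gradient step on 0.5 * (spike_count - target)**2.

    Both the weight and the threshold are updated; the threshold is clamped
    to stay positive (the clamp value is an arbitrary illustrative choice).
    """
    spikes, g_w, g_th = run_neuron(inputs, weight, threshold)
    err = spikes - target
    new_weight = weight - lr * err * g_w
    new_threshold = max(min_threshold, threshold - lr * err * g_th)
    return new_weight, new_threshold


# A neuron whose threshold is too high for its input never fires ("dead").
inputs = [1.0] * 20
spikes, g_w, g_th = run_neuron(inputs, weight=0.1, threshold=2.0)
print("spikes before update:", spikes)
print("surrogate gradient w.r.t. threshold:", g_th)

# One joint update raises the weight and lowers the threshold.
new_w, new_th = sgd_step(inputs, weight=0.1, threshold=2.0, target=4)
print("updated weight and threshold:", new_w, new_th)
```

With a fixed threshold, only the weight term of this update is available; learning the threshold as well gives the optimizer a second, direct way to bring a silent neuron back into the firing regime.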

Paper Structure (8 sections, 5 equations, 6 figures, 2 tables)

Figures (6)

  • Figure 1: Illustration of our (two-layer) SNN architecture with spatiotemporal error backpropagation. Arrows pointing to the right and top indicate the forward path, while arrows pointing to the left and bottom represent the backward path.
  • Figure 2: Threshold grid search on NMNIST and DVS128.
  • Figure 3: Weight updates with respect to the initial weights during training.
  • Figure 4: The percentage of "dead neurons" for each layer during training on NMNIST.
  • Figure 5: Average spike rates.
  • ...and 1 more figure