All in one timestep: Enhancing Sparsity and Energy efficiency in Multi-level Spiking Neural Networks

Andrea Castagnetti, Alain Pegatoquet, Benoît Miramond

TL;DR

This work tackles information loss and energy efficiency in Spiking Neural Networks by introducing multi-level spiking neurons and a Sparse-ResNet architecture. The multi-level neuron increases per-timestep information throughput by allowing spike values $z(t)\in[0,N]$, which yields $N\times T+1$ quantization levels over $T$ timesteps, while barrier neurons trained with a Straight-Through Estimator (STE) mitigate the spike avalanche effect and improve gradient flow in residual paths. An energy model that accounts for memory accesses shows 2–3× energy savings on CIFAR-10/100 at 1 timestep and substantial latency compression on the neuromorphic CIFAR-10-DVS dataset, with Sparse-ResNet achieving comparable accuracy while reducing network activity by over 20%. Together, these advances enable high-accuracy, low-latency, and energy-efficient SNNs suitable for on-device neuromorphic deployment and real-time event-based processing.
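To make the $N\times T+1$ level count concrete, here is a minimal Python sketch of a multi-level integrate-and-fire neuron with soft reset, driven by a constant input. The details are assumptions for illustration and are not taken from the paper's code: each spike level is assumed to be worth $V_{th}/N$, the spike value is clipped to $[0, N]$, and the output is decoded as the time average of the emitted levels. The function name `multilevel_if` is hypothetical.

```python
import math

def multilevel_if(x, T=2, N=4, v_th=1.0):
    """Simulate one multi-level IF neuron driven by a constant input x.

    Assumed model (illustrative, not the authors' implementation):
    one spike level is worth v_th / N, and the reset is a soft reset.
    """
    step = v_th / N      # assumed value of a single spike level
    v = 0.0              # membrane potential
    total = 0            # sum of spike levels emitted over T timesteps
    for _ in range(T):
        v += x                                    # integrate the input
        z = min(N, max(0, math.floor(v / step)))  # multi-level spike in [0, N]
        v -= z * step                             # soft reset: subtract what was emitted
        total += z
    return total * step / T                       # decoded output (time average)

# Sweep the input from 0.0 to 1.5 and collect the distinct output values.
inputs = [k / 64 for k in range(97)]
levels = sorted({multilevel_if(x) for x in inputs})
print(len(levels), levels[-1])  # prints "9 1.0"
```

With $T = 2$ and $N = 4$ the sweep produces $4 \times 2 + 1 = 9$ distinct outputs, and the output stays at $V_{th} = 1.0$ for any input at or above the threshold, which matches the saturation described in the Figure 2 caption.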

Abstract

Spiking Neural Networks (SNNs) are one of the most promising bio-inspired neural network models and have drawn increasing attention in recent years. The event-driven communication mechanism of SNNs allows for sparse and theoretically low-power operations on dedicated neuromorphic hardware. However, the binary nature of instantaneous spikes also leads to considerable information loss in SNNs, resulting in accuracy degradation. To address this issue, we propose a multi-level spiking neuron model able to provide both low quantization error and minimal inference latency while approaching the performance of full-precision Artificial Neural Networks (ANNs). Experimental results with popular network architectures and datasets show that multi-level spiking neurons provide better information compression, thereby allowing a reduction in latency without performance loss. Compared to binary SNNs on image classification scenarios, multi-level SNNs reduce the energy consumption by a factor of 2 to 3, depending on the number of quantization intervals. On neuromorphic data, our approach allows us to drastically reduce the inference latency to 1 timestep, which corresponds to a compression factor of 10 compared to previously published results. At the architectural level, we propose a new residual architecture that we call Sparse-ResNet. Through a careful analysis of spike propagation in residual connections, we highlight a spike avalanche effect that affects most spiking residual architectures. Using our Sparse-ResNet architecture, we can provide state-of-the-art accuracy results in image classification while reducing the network activity by more than 20% compared to previous spiking ResNets.

Paper Structure

This paper contains 26 sections, 9 equations, 11 figures, 4 tables.

Figures (11)

  • Figure 1: Multi-level IF neuron model and the associated decoding scheme.
  • Figure 2: Quantization function of a multi-level IF with soft-reset ($T = 2$, $N = 4$, $V_{th} = 1.0$). We can observe that the output of the neuron has exactly $N \times T + 1$ quantization levels and the output saturates when the input equals $V_{th}$.
  • Figure 3: Residual blocks in (a) Spiking ResNets hu_spiking_2023, (b) SEW-ResNets fang_deep_2021, (c) MS-ResNets hu_advancing_2023 and the proposed Sparse-ResNets. SN stands for a binary spiking neuron, while ml-SN/STE and ml-SN/SG denote a multi-level spiking neuron with a Straight-Through Estimator and with a surrogate backward function, respectively.
  • Figure 4: Spike propagation in residual connections. Here $\gamma_b$ and $\gamma_{mv}$ represent the number of binary and multi-level spikes at the input of the first residual connection of the network. Only one convolutional block is represented in the direct path. In SEW-ResNets fang_deep_2021, the spike flows are added at the summation points, creating an exponential increase in the number of spikes that deeper layers have to process. Using the barrier neuron (dotted in the figure), Sparse-ResNets can limit the propagation of spikes without hindering the representation capacity of the network.
  • Figure 5: The derivative, $\sigma^\prime$, of the sigmoid surrogate used in our study: $\sigma^\prime(x, \alpha) = \alpha\,\sigma(\alpha \cdot x)\,(1 - \sigma(\alpha\cdot x))$, where $\sigma(x)$ is the sigmoid function and $\alpha$ is the scaling factor. Here $\alpha=5$ and $V_{th}=1$. The threshold voltage $V_{th}$ is also plotted as a vertical dotted line.
  • ...and 6 more figures
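
The sigmoid surrogate derivative given in the Figure 5 caption can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and evaluating the surrogate at $v - V_{th}$ (so that it peaks at the threshold) is an assumption based on the caption's mention of $V_{th}$, not something the caption states.

```python
import math

def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))

def surrogate_grad(x, alpha=5.0):
    """sigma'(x, alpha) = alpha * sigma(alpha * x) * (1 - sigma(alpha * x))."""
    s = sigmoid(alpha * x)
    return alpha * s * (1.0 - s)

v_th = 1.0  # threshold voltage, as in the caption
for v in (0.0, 0.5, 1.0, 1.5, 2.0):
    # Assumption: the surrogate is centred on the threshold by shifting the input.
    print(v, round(surrogate_grad(v - v_th), 4))
```

The expression is largest at $x = 0$, where it equals $\alpha/4$ (1.25 for $\alpha = 5$), and it is symmetric around that point. A larger $\alpha$ gives a narrower, taller peak.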