Reconsidering the energy efficiency of spiking neural networks

Zhanglu Yan; Zhenyu Bai; Weng-Fai Wong

TL;DR

This paper establishes a fair baseline by mapping rate-encoded SNNs with $T$ timesteps to functionally equivalent QNNs with similar hardware requirements, enabling meaningful energy comparisons, and introduces a detailed analytical energy model encompassing core computation and data movement.

Abstract

Spiking Neural Networks (SNNs) promise higher energy efficiency than conventional Quantized Artificial Neural Networks (QNNs) due to their event-driven, spike-based computation. However, prevailing energy evaluations often oversimplify, focusing on computational aspects while neglecting critical overheads such as data movement and memory access. Such simplifications can lead to misleading conclusions regarding the true energy benefits of SNNs. This paper presents a rigorous re-evaluation. We establish a fair baseline by mapping rate-encoded SNNs with $T$ timesteps to functionally equivalent QNNs with $\lceil \log_2(T+1) \rceil$ bits. This ensures both models have comparable representational capacities, as well as similar hardware requirements, enabling meaningful energy comparisons. We introduce a detailed analytical energy model encompassing core computation and data movement. Using this model, we systematically explore a wide parameter space, including intrinsic network characteristics ($T$, spike rate $s_r$, QNN sparsity $\gamma$, model size $N$, weight bit-width) and hardware characteristics (memory system and network-on-chip). Our analysis identifies specific operational regimes where SNNs genuinely offer superior energy efficiency. For example, under typical neuromorphic hardware conditions, SNNs with moderate time windows ($T \in [5,10]$) require an average spike rate ($s_r$) below 6.4% to outperform equivalent QNNs. Furthermore, to illustrate the real-world implications of our findings, we analyze the operational lifetime of a typical smartwatch, showing that an optimized SNN can nearly double its battery life compared to a QNN. These insights guide the design of truly energy-efficient neural network solutions.
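
As a quick arithmetic check of the bit-width mapping stated in the abstract (a worked illustration, not a result taken from the paper): a rate-encoded neuron observed over $T$ timesteps can emit between $0$ and $T$ spikes, i.e. $T+1$ distinct values, so the equivalent QNN activation needs $\lceil \log_2(T+1) \rceil$ bits. At the ends and in the middle of the quoted range $T \in [5,10]$:

$$\lceil \log_2(5+1) \rceil = 3, \qquad \lceil \log_2(7+1) \rceil = 3, \qquad \lceil \log_2(10+1) \rceil = 4.$$

Under this mapping, the moderate time windows discussed above correspond to QNNs with 3-bit or 4-bit activations.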

Paper Structure (22 sections, 5 theorems, 19 equations, 7 figures, 2 tables)

This paper contains 22 sections, 5 theorems, 19 equations, 7 figures, 2 tables.

Introduction
The QNN-SNN Twins
Hardware Energy Consumption
Key contributions
Information Representation Equivalence between SNN and ANN
Hardware
Core computing ($E_{\text{Compute}}$)
Data Movement Energy ($E_{data}$)
Results
Energy Efficiency Landscape Across Diverse Configurations
Sensitivity Analysis of Neural Network Parameters
Analyses on the effect of mapping
Modeling routing distance of data movements
Modeling weight reuse
Analysis results on the number of hops $k_{\text{hop}}$
...and 7 more sections

Key Result

Theorem 1

For a rate-encoded IF SNN operating with a time window of size $T$, there exists an equivalent QNN whose activations are quantized to at most $\lceil \log_2(T+1) \rceil$ bits, such that both networks possess a comparable information representation capability at the single neuron output level.
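
The counting argument behind Theorem 1 can be illustrated with a minimal sketch. It assumes a single integrate-and-fire neuron driven by a constant input, a firing threshold of 1, and reset by subtraction; these modelling choices and the function names `if_spike_count` and `activation_bits` are illustrative assumptions, not the paper's implementation. The sketch shows that the spike count over a window of $T$ steps takes one of $T+1$ values, which fit in $\lceil \log_2(T+1) \rceil$ bits.

```python
from fractions import Fraction


def activation_bits(T: int) -> int:
    """Bits needed to index the T + 1 possible spike counts 0..T.

    For T >= 0, ceil(log2(T + 1)) equals T.bit_length(); using the
    integer form avoids floating-point rounding.
    """
    return T.bit_length()


def if_spike_count(x: Fraction, T: int, threshold: Fraction = Fraction(1)) -> int:
    """Spikes emitted by one IF neuron over T steps (assumed model).

    Assumptions: constant input x per step, membrane potential starts
    at 0, and the potential is reduced by the threshold after a spike
    (reset by subtraction).
    """
    v = Fraction(0)
    spikes = 0
    for _ in range(T):
        v += x                 # accumulate the input
        if v >= threshold:     # compare against the threshold
            v -= threshold     # subtract the threshold on firing
            spikes += 1
    return spikes


if __name__ == "__main__":
    T = 7
    # Inputs k/T for k = 0..T produce every representable spike count.
    counts = [if_spike_count(Fraction(k, T), T) for k in range(T + 1)]
    print("spike counts:", counts)
    print("bits needed:", activation_bits(T))
```

With $T = 7$, the inputs $k/T$ produce the spike counts 0 through 7, and eight levels fit exactly in 3 bits. Exact rational arithmetic is used so that threshold comparisons are not affected by floating-point rounding.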

Figures (7)

Figure 1: Integrate-and-fire SNN model
Figure 2: A classical neuromorphic processing element (PE) array with a Network-on-Chip (NoC) for inter-core communication [lines2018loihi, northpole, pehle2022brainscales].
Figure 3: Energy ratio $E_{\text{SNN}}/E_{\text{ANN}}$ across SNN Model Configurations (rows, defined by $T, s_r$) and Hardware Settings (columns). Within each cell, three bars correspond to comparing the SNN against QNNs with three activation densities. All calculations assume $N_{\text{src}}=4096$ and 8-bit weights. Fundamental operational costs are: $E_{\text{ACC}}=0.05448 \text{ pJ}$, $E_{\text{CMP}}=0.05448 \text{ pJ}$, $E_{\text{SUB}}=0.05448 \text{ pJ}$. The QNN $E_{\text{MAC}}$ cost varies with $T$ as detailed in Figure 4. These energy values are based on a 22nm technology process.
Figure 4: MAC vs. ACC
Figure 5: Sensitivity analysis of SNN and QNN energy consumption (pJ) versus SNN spike rate ($s_r$) under the Typical Neuromorphic hardware setting ($\overline{E}^{\text{move}}=0.25 \text{ pJ/bit/hop}$, $\widetilde{E}^{\text{move}}=3 \text{ pJ/bit/hop}$, number of hops equal to 1). Each subplot corresponds to a specific SNN time window $T$ (rows, $T \in [1,8]$) and a configuration of weight precision (4-bit or 8-bit) and network size ($N_{\text{src}} \in \{64, 4096\}$) (columns). QNN energy (horizontal line) is calculated for a fixed activation sparsity of $\gamma=0.8$.
...and 2 more figures

Theorems & Definitions (9)

Theorem 1
proof: Proof Outline
Lemma 1
proof
Lemma 2
proof
Corollary 1
Theorem 2
proof
