Q-SNNs: Quantized Spiking Neural Networks

Wenjie Wei, Yu Liang, Ammar Belatreche, Yichen Xiao, Honglin Cao, Zhenbang Ren, Guoqing Wang, Malu Zhang, Yang Yang

TL;DR

This work tackles the energy and memory demands of large-scale spiking neural networks by introducing Quantized SNNs (Q-SNNs), which quantize synaptic weights to 1 bit and membrane potentials to low bit-widths. To compensate for the information lost to quantization, it proposes Weight-Spike Dual Regulation (WS-DR), an entropy-based training approach that increases the information content of weights and spikes. Experiments on static and neuromorphic datasets show memory reductions of up to ~96% with competitive or state-of-the-art accuracy, which makes the combination of dual quantization and WS-DR promising for energy-efficient SNNs on resource-constrained edge devices.
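As a rough sanity check on the ~96% figure, the sketch below computes the storage saved when weights move from a full-precision format to 1 bit. The 32-bit baseline and the layer size are assumptions made for illustration; this summary does not state the baseline, and the paper's exact number also depends on membrane-potential storage and on any layers kept at higher precision.

```python
def weight_memory_bytes(num_weights: int, bits_per_weight: int) -> float:
    """Storage needed for `num_weights` weights at the given bit-width."""
    return num_weights * bits_per_weight / 8


def memory_reduction(num_weights: int, full_bits: int = 32, quant_bits: int = 1) -> float:
    """Fraction of weight memory saved by storing `quant_bits` instead of `full_bits`."""
    full = weight_memory_bytes(num_weights, full_bits)
    quant = weight_memory_bytes(num_weights, quant_bits)
    return 1.0 - quant / full


# Hypothetical layer with 1,000,000 weights (not a figure from the paper).
saving = memory_reduction(1_000_000)
print(f"weight memory saved: {saving:.2%}")
```

Under these assumptions the saving is 1 - 1/32, just under 97%, which is consistent with the "up to ~96%" reported for whole models.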

Abstract

Brain-inspired Spiking Neural Networks (SNNs) leverage sparse spikes to represent information and process them in an asynchronous event-driven manner, offering an energy-efficient paradigm for the next generation of machine intelligence. However, the current focus within the SNN community prioritizes accuracy optimization through the development of large-scale models, limiting their viability in resource-constrained and low-power edge devices. To address this challenge, we introduce a lightweight and hardware-friendly Quantized SNN (Q-SNN) that applies quantization to both synaptic weights and membrane potentials. By significantly compressing these two key elements, the proposed Q-SNNs substantially reduce both memory usage and computational complexity. Moreover, to prevent the performance degradation caused by this compression, we present a new Weight-Spike Dual Regulation (WS-DR) method inspired by information entropy theory. Experimental evaluations on various datasets, including static and neuromorphic, demonstrate that our Q-SNNs outperform existing methods in terms of both model size and accuracy. These state-of-the-art results in efficiency and efficacy suggest that the proposed method can significantly improve edge intelligent computing.
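To make the idea of quantizing both weights and membrane potentials concrete, here is a minimal NumPy sketch of a single leaky integrate-and-fire (LIF) timestep. It is not the authors' implementation: the sign-based binarization with a mean-magnitude scale, the uniform quantizer for the membrane potential, the hard reset, and all names and default values (`binarize_weights`, `quantize_potential`, `lif_step`, `decay`, `threshold`, `u_max`) are assumptions chosen for illustration.

```python
import numpy as np


def binarize_weights(w: np.ndarray) -> np.ndarray:
    """1-bit weights: sign(w) times a shared scale, mean |w| (assumed scheme)."""
    alpha = np.mean(np.abs(w))
    return alpha * np.where(w >= 0, 1.0, -1.0)


def quantize_potential(u: np.ndarray, bits: int = 2, u_max: float = 1.0) -> np.ndarray:
    """Uniformly quantize the membrane potential to 2**bits levels in [-u_max, u_max]."""
    levels = 2 ** bits - 1
    step = 2 * u_max / levels
    u_clipped = np.clip(u, -u_max, u_max)
    return np.round((u_clipped + u_max) / step) * step - u_max


def lif_step(u, spikes_in, w, decay=0.5, threshold=0.5, bits=2):
    """One LIF timestep with binary weights and a low-bit membrane potential."""
    current = spikes_in @ binarize_weights(w)      # input current from binary weights
    u = quantize_potential(decay * u + current, bits=bits)
    spikes_out = (u >= threshold).astype(np.float64)
    u = u * (1.0 - spikes_out)                     # hard reset for neurons that fired
    return u, spikes_out


# Toy example: 2 input neurons, 2 output neurons (values are illustrative only).
w = np.array([[0.5, -1.5], [1.0, -1.0]])
u, s = lif_step(np.zeros(2), np.array([1.0, 0.0]), w)
print(u, s)
```

In a real training setup the rounding and sign operations would need a surrogate gradient or straight-through estimator; that part is omitted here.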

Paper Structure (17 sections, 16 equations, 5 figures, 2 tables, 1 algorithm)

Figures (5)

  • Figure 1: The overall workflow of the proposed Q-SNN.
  • Figure 2: The distribution of spikes $s$ in Q-SNNs, shown for the first eight layers of ResNet-19 on CIFAR-10.
  • Figure 3: Comparison of the model size and accuracy between the proposed Q-SNN and existing quantized SNN approaches on the CIFAR-10 dataset.
  • Figure 4: (a) The distribution of synaptic weights in Q-SNN after applying the WS-DR method. (b) The distribution of spike activities in Q-SNN after applying the WS-DR method. Both subfigures are plotted from results for the first eight layers of ResNet-19 on the CIFAR-10 dataset.
  • Figure 5: Ablation study for the WS-DR method, where 'FP SNN' denotes the full-precision SNN.
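Since WS-DR is described as inspired by information entropy, one simple way to read the spike distributions in Figures 2 and 4(b) is through the entropy of a binary spike variable. The sketch below is a generic illustration of that quantity, not the paper's regularizer; `binary_entropy` and `firing_rate` are hypothetical helper names, and the spike train is made-up data.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy in bits of a Bernoulli variable that equals 1 with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a neuron that always or never fires carries no information
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def firing_rate(spikes: list[int]) -> float:
    """Fraction of timesteps on which a spike was emitted."""
    return sum(spikes) / len(spikes)


# Made-up spike train, for illustration only.
spikes = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
rate = firing_rate(spikes)
print(rate, binary_entropy(rate), binary_entropy(0.5))
```

The entropy peaks at 1 bit when the firing probability is 0.5 and falls toward 0 as spikes become very rare or very frequent, which is why a heavily skewed spike distribution carries less information per spike.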