S$^2$NN: Sub-bit Spiking Neural Networks

Wenjie Wei, Malu Zhang, Jieyuan Zhang, Ammar Belatreche, Shuai Wang, Yimeng Shan, Hanwen Liu, Honglin Cao, Guoqing Wang, Yang Yang, Haizhou Li

TL;DR

S$^2$NN is a sub-bit Spiking Neural Network that compresses weights to below 1 bit each by representing them as indices into layer-specific compact codebooks. The approach pairs outlier-aware sub-bit weight quantization (OS-Quant), which counters outlier-induced codeword selection bias, with membrane potential-based feature distillation (MPFD), which maintains performance through guidance from a teacher model. Results on classification, detection, and segmentation show substantial reductions in model size and computation at competitive accuracy, which makes the approach promising for edge and neuromorphic hardware deployment. Together, OS-Quant and MPFD outperform existing quantized SNNs in both compression and efficiency across architectures and tasks.

Abstract

Spiking Neural Networks (SNNs) offer an energy-efficient paradigm for machine intelligence, but their continued scaling poses challenges for resource-limited deployment. Despite recent advances in binary SNNs, the storage and computational demands remain substantial for large-scale networks. To further explore the compression and acceleration potential of SNNs, we propose Sub-bit Spiking Neural Networks (S$^2$NNs) that represent weights with less than one bit. Specifically, we first establish an S$^2$NN baseline by leveraging the clustering patterns of kernels in well-trained binary SNNs. This baseline is highly efficient but suffers from outlier-induced codeword selection bias during training. To mitigate this issue, we propose an outlier-aware sub-bit weight quantization (OS-Quant) method, which optimizes codeword selection by identifying and adaptively scaling outliers. Furthermore, we propose a membrane potential-based feature distillation (MPFD) method, improving the performance of highly compressed S$^2$NN via more precise guidance from a teacher model. Extensive results on vision tasks reveal that S$^2$NN outperforms existing quantized SNNs in both performance and efficiency, making it promising for edge computing applications.
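
As a rough illustration of how index-based storage can fall below one bit per weight, the following sketch assumes 3x3 kernels with weights in {-1, +1}, so that a full codebook would hold 2^9 = 512 codewords, and a hypothetical 4-entry compact codebook. The kernel size, the codebook contents, and the function names `bits_per_weight` and `nearest_codeword` are illustrative assumptions, not details taken from the paper.

```python
import math

def bits_per_weight(num_codewords, kernel_size=9):
    """Index bits per weight when each kernel stores only a codebook index.

    Assumption: storage of the (small, shared) codebook itself is ignored.
    """
    return math.log2(num_codewords) / kernel_size

def nearest_codeword(kernel, codebook):
    """Return the index of the codeword closest to `kernel` (squared L2 distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(kernel, codebook[i]))

# Hypothetical compact codebook with 4 binary codewords of length 9 (flattened 3x3).
codebook = [
    [1] * 9,
    [-1] * 9,
    [1, -1, 1, -1, 1, -1, 1, -1, 1],
    [-1, 1, -1, 1, -1, 1, -1, 1, -1],
]

# Hypothetical real-valued latent kernel; its sign pattern matches codebook[2].
kernel = [0.2, -0.4, 0.1, -0.3, 0.5, -0.2, 0.3, -0.1, 0.4]

index = nearest_codeword(kernel, codebook)
print("selected codeword index:", index)
print("bits per weight, full codebook (512 entries):", bits_per_weight(512))
print("bits per weight, compact codebook (4 entries):", bits_per_weight(len(codebook)))
```

Under these assumptions, a full codebook costs 9 index bits per 9 weights (1 bit per weight, the binary case), while the 4-entry codebook costs 2 index bits per 9 weights, which is below one bit per weight.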

Paper Structure

This paper contains 44 sections, 12 equations, 6 figures, 12 tables, 1 algorithm.

Figures (6)

  • Figure 1: (a) Convolutional kernels in well-trained BSNNs exhibit clustering patterns. This motivates us to achieve higher compression ratios than BSNNs by using a compact codebook $\mathbb{P}$ instead of the full codebook $\mathbb{K}$. (b) The constructed S$^2$NN baseline.
  • Figure 2: (a) The outliers dominate the distance calculations, diminishing the contributions of other elements and leading the baseline to select an undesirable codeword for inference. (b) During the training process of the S$^2$NN baseline, we randomly sample several kernels and analyze their weight distributions using a box plot. Discrete points in the figure indicate outliers.
  • Figure 3: Schematic diagram of the proposed OS-Quant and MPFD method.
  • Figure 4: Detection and segmentation visualization of S$^2$NN on COCO 2017 and ADE20K.
  • Figure 5: Comparison of compression and acceleration between standard BSNN and our S$^2$NN during the convolution process.
  • ...and 1 more figure
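
The effect described in the Figure 2 caption, where a single outlier dominates the distance calculation, can be reproduced with a toy example. The sketch below assumes a flattened 3x3 kernel, a two-entry codebook, and squared L2 distance for codeword selection. The fixed clipping threshold `limit` is a hypothetical stand-in used only to show the bias disappearing; it is not the paper's OS-Quant, which identifies and adaptively scales outliers.

```python
def nearest_codeword(kernel, codebook):
    """Return the index of the codeword closest to `kernel` (squared L2 distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(kernel, codebook[i]))

def clip(kernel, limit):
    """Clamp every weight to [-limit, limit] (illustrative stand-in for outlier handling)."""
    return [max(-limit, min(limit, w)) for w in kernel]

# Hypothetical two-entry codebook: all +1 and all -1.
codebook = [[1] * 9, [-1] * 9]

# Eight small positive weights and one large negative outlier.
kernel = [0.1] * 8 + [-3.0]

# The outlier alone pulls the selection towards the all -1 codeword,
# even though 8 of the 9 weights are positive.
biased = nearest_codeword(kernel, codebook)

# After limiting the outlier's magnitude, the majority sign pattern wins.
corrected = nearest_codeword(clip(kernel, limit=0.3), codebook)

print("selection with outlier:", biased)
print("selection after clipping:", corrected)
```

In this toy case the unclipped kernel selects the all -1 codeword, which disagrees in sign with eight of its nine weights, while the clipped kernel selects the all +1 codeword.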