Sparse by Rule: Probability-Based N:M Pruning for Spiking Neural Networks

Shuhan Ye; Yi Yu; Qixin Zhang; Chenqi Kong; Qiangqiang Wu; Xudong Jiang; Dacheng Tao

TL;DR

SpikeNM tackles the challenge of pruning deep Spiking Neural Networks (SNNs) by introducing a probabilistic $N{:}M$ semi-structured pruning framework learned from scratch. It uses an $M$-way basis-logit parameterization with a differentiable top-$k$ sampler to linearize per-block search to $\mathcal{O}(M)$ and couples mask learning to spiking dynamics via Eligibility-Inspired Distillation (EID). The method achieves state-of-the-art or competitive accuracy at $N{:}M$ sparsities such as $2{:}4$ and $2{:}8$ on CIFAR10/100 and neuromorphic datasets, while producing hardware-friendly sparsity patterns and preserving energy efficiency. This approach enables scalable, edge-friendly deployment of sparse SNNs and bridges the gap between unstructured and structured pruning by combining flexibility with accelerator-friendly structure.

Abstract

Brain-inspired spiking neural networks (SNNs) promise energy-efficient intelligence via event-driven, sparse computation, but deeper architectures inflate parameter counts and computational cost, hindering edge deployment. Recent progress in SNN pruning helps alleviate this burden, yet existing efforts fall into only two families: \emph{unstructured} pruning, which attains high sparsity but is difficult to accelerate on general hardware, and \emph{structured} pruning, which eases deployment but lacks flexibility and often degrades accuracy at matched sparsity. In this work, we introduce \textbf{SpikeNM}, the first SNN-oriented \emph{semi-structured} $N{:}M$ pruning framework that learns sparse SNNs \emph{from scratch}, enforcing \emph{at most $N$} non-zeros per $M$-weight block. To avoid a combinatorial search space of size $\sum_{k=1}^{N}\binom{M}{k}$, which grows exponentially with $M$, SpikeNM adopts an $M$-way basis-logit parameterization with a differentiable top-$k$ sampler, \emph{linearizing} per-block complexity to $\mathcal O(M)$ and enabling more aggressive sparsification. Further inspired by neuroscience, we propose \emph{eligibility-inspired distillation} (EID), which converts temporally accumulated credits into block-wise soft targets to align mask probabilities with spiking dynamics, reducing sampling variance and stabilizing search under high sparsity. Experiments show that at $2{:}4$ sparsity, SpikeNM maintains or even improves accuracy across mainstream datasets, while yielding hardware-amenable patterns that complement intrinsic spike sparsity.
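To make the per-block counting argument concrete, the sketch below compares the number of candidate patterns $\sum_{k=1}^{N}\binom{M}{k}$ with a parameterization that stores one logit per position in the block, and then draws an $N{:}M$ mask from those $M$ logits. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, and the Gumbel top-$k$ trick is used here as a stand-in because the abstract does not specify which differentiable top-$k$ sampler SpikeNM uses. The sketch covers only the forward sampling step and omits any gradient relaxation and the EID loss.

```python
import math
import random


def count_block_patterns(n, m):
    """Number of masks with 1..n non-zeros in a block of m weights."""
    return sum(math.comb(m, k) for k in range(1, n + 1))


def sample_nm_mask(logits, n, rng):
    """Draw a 0/1 mask that keeps exactly n positions of one block.

    Assumption: Gumbel top-k is a stand-in sampler. Adding Gumbel noise to
    each logit and keeping the n largest values samples n positions without
    replacement, with higher logits more likely to be kept.
    """
    perturbed = []
    for logit in logits:
        u = max(rng.random(), 1e-12)  # avoid log(0)
        gumbel = -math.log(-math.log(u))
        perturbed.append(logit + gumbel)
    order = sorted(range(len(logits)), key=lambda i: perturbed[i], reverse=True)
    mask = [0] * len(logits)
    for i in order[:n]:
        mask[i] = 1
    return mask


def apply_nm_mask(weights, block_logits, n, rng):
    """Mask a flat weight list block by block; one list of M logits per block."""
    m = len(block_logits[0])
    assert len(weights) == m * len(block_logits)
    pruned = []
    for b, logits in enumerate(block_logits):
        mask = sample_nm_mask(logits, n, rng)
        block = weights[b * m:(b + 1) * m]
        pruned.extend(w * keep for w, keep in zip(block, mask))
    return pruned


if __name__ == "__main__":
    rng = random.Random(0)
    print(count_block_patterns(2, 4))  # 10 candidate patterns vs. 4 logits
    print(count_block_patterns(2, 8))  # 36 candidate patterns vs. 8 logits
    weights = [0.5, -1.2, 0.3, 0.9, -0.7, 0.1, 1.4, -0.2]
    block_logits = [[0.0, 2.0, -1.0, 1.0], [1.5, -0.5, 2.5, 0.0]]
    print(apply_nm_mask(weights, block_logits, n=2, rng=rng))
```

With one logit per position, the learnable state per block grows linearly in $M$, whereas a parameterization with one logit per candidate pattern would need as many entries as `count_block_patterns` returns.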
