Faster and Stronger: When ANN-SNN Conversion Meets Parallel Spiking Calculation

Zecheng Hao, Qichao Ma, Kang Chen, Yi Zhang, Zhaofei Yu, Tiejun Huang

TL;DR

This work introduces a universal parallel conversion framework that fuses ANN-SNN conversion with parallel spiking calculation to achieve lossless mappings between ANN activations and SNN firing, enabling ultra-low-latency inference. It builds a mathematically grounded mapping via a final parallel conversion matrix and shift terms, and extends it with Distribution-Aware QCFS (DA-QCFS) to handle arbitrary data distributions. An optimization based on binary search reduces computation from quadratic to near-linear in time, significantly boosting throughput while preserving accuracy, as demonstrated on ImageNet-1k and CIFAR benchmarks with substantial speedups. The approach yields competitive or superior performance at low latency, including training-free variants with minimal accuracy loss, offering a practical, scalable pathway for SNN deployment on neuromorphic hardware. Code is available at the authors' GitHub repository.

Abstract

Spiking Neural Networks (SNNs), as brain-inspired and energy-efficient networks, currently face the pivotal challenge of finding a suitable and efficient learning framework. The predominant training methodologies, namely Spatial-Temporal Back-propagation (STBP) and ANN-SNN Conversion, are encumbered by substantial training overhead or pronounced inference latency, which impedes the scaling of SNNs to larger networks and more intricate application domains. In this work, we propose a novel parallel conversion learning framework, which establishes a mathematical mapping between each time-step of the parallel spiking neurons and the cumulative spike firing rate. We theoretically validate the lossless and sorting properties of the conversion process and identify the optimal shifting distance for each step. Furthermore, by integrating this framework with a distribution-aware error calibration technique, we achieve efficient conversion for more general activation functions and for training-free settings. Extensive experiments confirm the significant performance advantages of our method for various conversion cases under ultra-low time latency. To the best of our knowledge, this is the first work that jointly utilizes parallel spiking calculation and ANN-SNN Conversion, providing a highly promising approach for SNN supervised training. Code is available at https://github.com/hzc1208/Parallel_Conversion.
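
The idea of mapping each parallel time-step to the cumulative firing rate can be illustrated with a small sketch. It is not the authors' implementation: the function names, the half-threshold shift, and the choice to compare the summed input against successive multiples of the threshold are assumptions made for illustration, and the standard QCFS form $\theta \cdot \mathrm{clip}(\lfloor xL/\theta + 1/2 \rfloor / L, 0, 1)$ is assumed for the ANN activation. Under those assumptions, all $T$ spikes are decided at once, without a step-by-step membrane update, and the resulting firing rate matches the quantized activation with $L = T$.

```python
import math

def qcfs(x, theta=1.0, levels=4):
    """Assumed QCFS form: theta * clip(floor(x * L / theta + 1/2) / L, 0, 1)."""
    q = math.floor(x * levels / theta + 0.5)
    return theta * min(max(q, 0), levels) / levels

def parallel_spikes(currents, theta=1.0):
    """Hypothetical parallel firing rule (illustration only).

    Step k (1-based) fires iff the summed input plus a theta/2 shift reaches
    k * theta. Every step is decided independently of the others, so no
    sequential membrane-potential update is needed.
    """
    steps = len(currents)
    total = sum(currents)
    return [1 if total + theta / 2 >= k * theta else 0
            for k in range(1, steps + 1)]

def firing_rate_output(spikes, theta=1.0):
    """Average output of the spike train, scaled by the threshold."""
    return theta * sum(spikes) / len(spikes)

if __name__ == "__main__":
    currents = [0.9, 0.1, 0.6, 0.2]           # input currents over T = 4 steps
    spikes = parallel_spikes(currents)
    mean_input = sum(currents) / len(currents)
    print(spikes, firing_rate_output(spikes), qcfs(mean_input, levels=4))
```

In this sketch the spike pattern is monotone in the step index: once one comparison fails, all later ones fail too. This is a simplified analogue of the sorting property mentioned in the abstract. Which steps fire first is an arbitrary choice here and may differ from the paper's construction.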

Paper Structure

This paper contains 16 sections, 1 theorem, 12 equations, 3 figures, 5 tables.

Key Result

Theorem 4.1

For $T$-step parallel inference in the $l$-th layer, let $\mathbf{b}^l \in \mathbb{R}^T$ denote the corresponding shift term. When the pretrained ANN adopts the QCFS function in Eq.(eq04), the optimal value of the shift term can be derived for the following cases: …

Figures (3)

  • Figure 1: The overall framework of parallel conversion. Here (a) depicts the activation functions in ANNs, (b) shows the sorting property of parallel spiking neurons in the firing phase, and (c) describes the specific process of parallel inference.
  • Figure 2: Comparison of parallel and serial inference speeds on ImageNet-1k dataset.
  • Figure S1: Comparison of parallel/serial inference speeds and performance on QCFS ANN models.

Theorems & Definitions (2)

  • Theorem 4.1
  • proof