Inference-Scale Complexity in ANN-SNN Conversion for High-Performance and Low-Power Applications

Tong Bu, Maohua Li, Zhaofei Yu

TL;DR

The paper tackles the challenge of efficiently converting pre-trained ANNs to high-performance SNNs with minimal training cost by introducing inference-scale conversion techniques. It combines a theoretical error bound with practical threshold optimization (local and channel-wise) and a delayed evaluation strategy to mitigate spike-delay effects, enabling fast, low-power inference. The approach demonstrates strong performance on image classification and extends to semantic segmentation, object detection, and video tasks, achieving notable energy-efficiency advantages (e.g., ~622 FPS/W) while requiring far less training data and compute than retraining-based methods. This work offers a practical path for deploying SNNs on neuromorphic hardware, enabling fast, low-power AI with negligible performance loss relative to ANN baselines.

Abstract

Spiking Neural Networks (SNNs) have emerged as a promising substitute for Artificial Neural Networks (ANNs) due to their advantages of fast inference and low power consumption. However, the lack of efficient training algorithms has hindered their widespread adoption. Even efficient ANN-SNN conversion methods necessitate quantized training of ANNs to enhance the effectiveness of the conversion, incurring additional training costs. To address these challenges, we propose an efficient ANN-SNN conversion framework with only inference scale complexity. The conversion framework includes a local threshold balancing algorithm, which enables efficient calculation of the optimal thresholds and fine-grained adjustment of the threshold value by channel-wise scaling. We also introduce an effective delayed evaluation strategy to mitigate the influence of the spike propagation delays. We demonstrate the scalability of our framework in typical computer vision tasks: image classification, semantic segmentation, object detection, and video classification. Our algorithm outperforms existing methods, highlighting its practical applicability and efficiency. Moreover, we have evaluated the energy consumption of the converted SNNs, demonstrating their superior low-power advantage compared to conventional ANNs. This approach simplifies the deployment of SNNs by leveraging open-source pre-trained ANN models, enabling fast, low-power inference with negligible performance reduction. Code is available at https://github.com/putshua/Inference-scale-ANN-SNN.
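
The channel-wise threshold selection described in the abstract can be illustrated with a small sketch. This is not the authors' local threshold balancing algorithm: it swaps in a plain grid search, and it assumes the commonly used quantize-and-clip approximation of a T-step integrate-and-fire neuron. The names `quantized_clip`, `mse`, `best_threshold`, and the sample `channels` data are all hypothetical.

```python
import math

def quantized_clip(x, theta, T):
    """Assumed approximation of a T-step IF neuron's output for input x."""
    level = math.floor(x * T / theta + 0.5)
    return theta / T * min(max(level, 0), T)

def mse(pre_acts, theta, T):
    """Mean squared gap between ReLU(x) and the approximated SNN output."""
    return sum(
        (quantized_clip(x, theta, T) - max(x, 0.0)) ** 2 for x in pre_acts
    ) / len(pre_acts)

def best_threshold(pre_acts, T=8, num_candidates=20):
    """Grid-search one channel's threshold over fractions of its peak activation."""
    peak = max(max(x, 0.0) for x in pre_acts)
    if peak <= 0.0:
        raise ValueError("channel has no positive activations")
    # Candidate thresholds: peak/num_candidates, ..., peak (peak itself is included).
    candidates = [peak * (k + 1) / num_candidates for k in range(num_candidates)]
    return min(candidates, key=lambda th: mse(pre_acts, th, T))

# Hypothetical calibration pre-activations: one inner list per channel.
channels = [
    [-0.3, 0.1, 0.2, 0.4, 2.0],   # one outlier dominates the peak
    [-1.0, 0.5, 1.0, 1.5, 2.0],   # activations spread evenly
]
thetas = [best_threshold(ch, T=8) for ch in channels]
print(thetas)
```

Only forward passes over a small calibration set are needed to collect such pre-activations, which is the sense in which a conversion of this kind stays at inference-scale cost. Searching per channel lets a channel with a single large outlier pick a threshold below its peak when that lowers the overall error.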

Paper Structure

This paper contains 32 sections, 4 theorems, 31 equations, 6 figures, 5 tables, 2 algorithms.

Key Result

Theorem 1

The layer-wise conversion error can be divided into intra-layer and inter-layer errors. Given that both the ANN and SNN models receive the same input in the first layer, so that $e^0 = 0$, the upper bound for the conversion error between the ANN and SNN models in an $L$-layer fully-connected network is given by

Figures (6)

  • Figure 1: Compared to existing frameworks that require retraining a quantized ANN, the proposed framework is able to directly convert a pre-trained ANN to an SNN at inference-scale complexity, significantly reducing computational requirements, minimizing dependence on GPUs, and requiring only a small subset of the original dataset.
  • Figure 2: Illustration of the proposed conversion framework. It requires only a small subset of data for conversion using local threshold balancing. During inference, the delayed evaluation technique is employed to enhance the accuracy of output estimation.
  • Figure 3: (a) Results on evaluating the impact of different techniques within the conversion framework. (b) Effect of iteration steps on SNN performance after conversion.
  • Figure 4: Effect of delayed steps
  • Figure S1: Illustration for detection examples of SNNs on different inference steps
  • ...and 1 more figure
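
The delayed evaluation idea mentioned in the Figure 2 and Figure 4 captions can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it models a single integrate-and-fire neuron with reset-by-subtraction, and the function name `if_neuron_output`, the `delay` parameter, and the input trace are hypothetical.

```python
def if_neuron_output(currents, threshold, delay=0):
    """Simulate one integrate-and-fire neuron with reset-by-subtraction.

    currents: input current at each time step.
    Returns the mean output (spike times threshold) over steps delay..T-1,
    so the first `delay` steps are simulated but left out of the estimate.
    """
    if not 0 <= delay < len(currents):
        raise ValueError("delay must satisfy 0 <= delay < number of steps")
    v = 0.0
    outputs = []
    for current in currents:
        v += current
        if v >= threshold:
            outputs.append(threshold)  # emit a spike scaled by the threshold
            v -= threshold             # soft reset keeps the residual charge
        else:
            outputs.append(0.0)
    window = outputs[delay:]
    return sum(window) / len(window)

# Hypothetical trace: no input reaches the neuron during the first two steps,
# mimicking spikes that still have to propagate through earlier layers.
currents = [0.0, 0.0] + [0.5] * 6
full = if_neuron_output(currents, threshold=1.0)
delayed = if_neuron_output(currents, threshold=1.0, delay=2)
print(full, delayed)
```

In this toy trace the neuron spikes at steps 3, 5, and 7, so averaging over all 8 steps gives 0.375, while skipping the first 2 steps gives 0.5, which matches the steady input of 0.5. The silent start-up steps are what drag the undelayed estimate down.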

Theorems & Definitions (7)

  • Theorem 1
  • Theorem 2
  • Theorem 1
  • proof 1: error bound
  • proof 2
  • Theorem 2
  • proof 3