Gaining the Sparse Rewards by Exploring Lottery Tickets in Spiking Neural Network

Hao Cheng, Jiahang Cao, Erjia Xiao, Mengshu Sun, Renjing Xu

TL;DR

A sparse algorithm tailored to the spiking transformer structure, which incorporates convolution operations into the Patch Embedding Projection (ConvPEP) module, is proposed to achieve Multi-level Sparsity (MultiSp).

Abstract

Deploying energy-efficient deep learning algorithms on computationally limited devices, such as robots, is still a pressing issue for real-world applications. Spiking Neural Networks (SNNs), a class of brain-inspired models, offer a promising solution because of their low latency and low energy consumption compared with traditional Artificial Neural Networks (ANNs). Despite these advantages, the dense structure of deep SNNs can still result in extra energy consumption. The Lottery Ticket Hypothesis (LTH) posits that dense neural networks contain winning Lottery Tickets (LTs), namely sub-networks, that can be obtained without compromising performance. Inspired by this, this paper examines spiking-based LTs (SLTs), their unique properties, and their potential for extreme efficiency. Two significant sparse Rewards are then gained through comprehensive explorations and experiments on SLTs across various dense structures. Moreover, a sparse algorithm tailored to the spiking transformer structure, which incorporates convolution operations into the Patch Embedding Projection (ConvPEP) module, is proposed to achieve Multi-level Sparsity (MultiSp). MultiSp refers to (1) patch number sparsity; (2) ConvPEP weight sparsity and binarization; and (3) ConvPEP activation layer binarization. Extensive experiments demonstrate that the method achieves extreme sparsity with only a slight performance decrease, paving the way for deploying energy-efficient neural networks in robotics and beyond.
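
As a rough illustration of the lottery-ticket idea referred to in the abstract, the following minimal sketch builds a magnitude-based pruning mask from trained weights and applies it to the initial weights, which is the generic recipe commonly associated with the LTH. It is not the paper's SLT or MultiSp algorithm: the function names, the NumPy-only setup, and the sign-based binarization step are all assumptions made for illustration.

```python
import numpy as np


def magnitude_mask(weights, prune_ratio):
    """Return a 0/1 mask that zeroes the smallest-magnitude weights.

    `prune_ratio` is the fraction of weights to remove (hypothetical helper).
    """
    flat = np.abs(weights).ravel()
    k = int(round(prune_ratio * flat.size))  # number of weights to drop
    mask = np.ones(flat.size, dtype=weights.dtype)
    if k > 0:
        # Indices of the k smallest magnitudes; stable sort keeps ties deterministic.
        drop = np.argsort(flat, kind="stable")[:k]
        mask[drop] = 0
    return mask.reshape(weights.shape)


def lottery_ticket(init_weights, trained_weights, prune_ratio):
    """Rank weights by trained magnitude, then rewind survivors to their initial values."""
    mask = magnitude_mask(trained_weights, prune_ratio)
    return init_weights * mask, mask


def binarize(weights, mask):
    """Assumed sign binarization: surviving weights become +1 or -1, pruned ones stay 0."""
    return np.where(weights >= 0, 1.0, -1.0) * mask


if __name__ == "__main__":
    init = np.array([[0.5, -0.1], [0.2, -0.9]])
    trained = np.array([[0.05, -1.2], [0.7, -0.01]])
    ticket, mask = lottery_ticket(init, trained, prune_ratio=0.5)
    print(mask)                    # which connections survive
    print(ticket)                  # sparse sub-network at its initial values
    print(binarize(ticket, mask))  # sparse and binary weights
```

In this sketch the mask is computed once; iterative variants repeat the train, prune, and rewind cycle with a growing pruning ratio.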

Paper Structure

This paper contains 12 sections, 3 equations, 9 figures, 2 tables, 1 algorithm.

Figures (9)

  • Figure 1: The pipeline of our Spiking Lottery Tickets (SLT). The procedure begins with data preparation, using either RGB or event datasets. It then proceeds to the selection of a network architecture, with options including CNN-based and transformer-based spiking models, where the rainbow-colored module can be sparsified by our SLT approach. The core of the process applies the patch pruning and parameter pruning methods, which yields rewards and returns a refined SNN that is both energy-efficient and sparsely connected, making it well suited to resource-limited devices, e.g., robots.
  • Figure 2: The performance changes under different LTs and SLTs due to varying parameter-level and patch-level pruning ratios on CIFAR10.
  • Figure 3: The performance changes under different LTs and SLTs due to varying parameter-level and patch-level pruning ratios on DVS128Gesture.
  • Figure 4: The effect of different timesteps and decay rates on the performance of VGG-9 and TINY on CIFAR10 (C10) and DVS128Gesture (DVS). P.4 and P.5 indicate parameter pruning ratios of 0.4 and 0.5.
  • Figure 5: The left two figures: the impact of different parameter pruning ratios ($0.2, 0.3, 0.4, 0.5, 0.6$) on the overall performance of patch-level sparsity. The right two figures: the effect of fine-tuning (FT) on MPTSLTs, PEPSp-SLTs, and MultiSp-SLTs.
  • ...and 4 more figures