Gated Attention Coding for Training High-performance and Efficient Spiking Neural Networks

Xuerui Qiu, Rui-Jie Zhu, Yuhong Chou, Zhaorui Wang, Liang-jian Deng, Guoqi Li

TL;DR

This work addresses the limited temporal dynamics and hardware-inefficient attention in deep spiking neural networks (SNNs) by introducing Gated Attention Coding (GAC), a plug-and-play encoder that produces powerful spatio-temporal representations while preserving the spike-driven nature of SNNs. GAC uses a multi-dimensional gated attention unit (GAU) to fuse temporal and spatial-channel cues, forming an encoder that feeds a spike-based backbone (MS-ResNet), enabling efficient neuromorphic deployment. Theoretical analysis via an observer model and energy accounting demonstrates extended encoding dynamics and reduced energy consumption, while experiments on CIFAR10/100 and ImageNet show state-of-the-art accuracy with substantially fewer time steps and lower energy than prior methods. The results indicate that decoupling the encoder and applying attention in the preprocessing stage can unlock both performance and efficiency gains for large-scale SNNs, with practical implications for energy-efficient neuromorphic hardware.

Abstract

Spiking neural networks (SNNs) are emerging as an energy-efficient alternative to traditional artificial neural networks (ANNs) due to their spike-based, event-driven nature. Coding is crucial in SNNs because it converts external input stimuli into spatio-temporal feature sequences. However, most existing deep SNNs rely on direct coding, which generates weak spike representations and lacks the temporal dynamics inherent in human vision. Hence, we introduce Gated Attention Coding (GAC), a plug-and-play module that leverages a multi-dimensional gated attention unit to efficiently encode inputs into powerful representations before feeding them into the SNN architecture. GAC functions as a preprocessing layer that does not disrupt the spike-driven nature of the SNN, making it amenable to efficient neuromorphic hardware implementation with minimal modifications. Through a theoretical analysis based on an observer model, we demonstrate that GAC's attention mechanism improves temporal dynamics and coding efficiency. Experiments on the CIFAR10/100 and ImageNet datasets demonstrate that GAC achieves state-of-the-art accuracy with remarkable efficiency. Notably, we improve top-1 accuracy by 3.10% on CIFAR100 with only 6 time steps and by 1.07% on ImageNet, while reducing energy usage to 66.9% of that of previous works. To the best of our knowledge, this is the first work to explore an attention-based dynamic coding scheme in deep SNNs, with exceptional effectiveness and efficiency on large-scale datasets. The code is available at https://github.com/bollossom/GAC.

Paper Structure

This paper contains 30 sections, 3 theorems, 17 equations, 7 figures, and 7 tables.

Key Result

Proposition 1

Given the same Conv-BN parameters, denote the dynamic duration of GAC by $\boldsymbol T_{g}$ and that of direct coding by $\boldsymbol T_{d}$. Then $\boldsymbol T_{g} \geq \boldsymbol T_{d}$.
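
The intuition behind Proposition 1 can be illustrated with a toy simulation. This is a minimal sketch, not the paper's observer-model analysis: the formal definition of dynamic duration is not reproduced in this summary, so the sketch uses the smallest repetition period of a spike train as a rough stand-in. The neuron model (`lif_spikes`), the helper `smallest_period`, the Conv-BN output value, and the gate scores are all hypothetical, and modelling GAC as an element-wise product of a per-step score with the Conv-BN output is an assumption.

```python
def lif_spikes(currents, tau=2.0, v_th=1.0):
    """Simulate one leaky integrate-and-fire neuron with a hard reset to 0."""
    v, spikes = 0.0, []
    for x in currents:
        v = v + (x - v) / tau          # leaky integration toward the input
        fired = v >= v_th
        spikes.append(int(fired))
        if fired:
            v = 0.0                    # hard reset after a spike
    return spikes


def smallest_period(seq):
    """Smallest p <= len(seq) // 2 at which seq repeats; len(seq) if none."""
    n = len(seq)
    for p in range(1, n // 2 + 1):
        if all(seq[t] == seq[t + p] for t in range(n - p)):
            return p
    return n


T = 8
conv_bn_out = 3.0                      # hypothetical Conv-BN output for one pixel

# Direct coding: the same value is presented at every time step.
direct_input = [conv_bn_out] * T

# Gated input (assumption): a per-step score in (0, 1] scales the same value.
gates = [1.0, 0.5, 0.25, 1.0, 0.5, 0.5, 0.25, 1.0]   # hypothetical scores
gated_input = [conv_bn_out * g for g in gates]

direct_spikes = lif_spikes(direct_input)   # [1, 1, 1, 1, 1, 1, 1, 1]
gated_spikes = lif_spikes(gated_input)     # [1, 0, 0, 1, 0, 1, 0, 1]

print("direct:", direct_spikes, "period", smallest_period(direct_spikes))
print("gated: ", gated_spikes, "period", smallest_period(gated_spikes))
```

With a constant input the neuron settles into a repeating pattern immediately, whereas the time-varying gated input yields a spike train that does not repeat within the window. This matches the direction of the inequality $\boldsymbol T_{g} \geq \boldsymbol T_{d}$ but does not prove it.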

Figures (7)

  • Figure 1: How our Gated Attention Coding (GAC) differs from existing SNN coding [wu2019direct] and attention methods [yao2021temporal, yao2023attention]. In (a), the solid-colored cube represents the float values, the gray cube denotes the binary spike values, and the cube with the dotted line represents the sparse values. In comparison with direct coding, GAC generates spatio-temporal dynamics output with powerful representation. In (b), compared to other attention methods, GAC only adds the attention module to the encoder without requiring $N$ multiply-accumulate (MAC) blocks for dynamically calculating attention scores in subsequent layers.
  • Figure 2: The GAC-SNN framework consists of two main components: an encoder and an architecture. (a) introduces the encoder, i.e., the GAC module. (b) focuses on the GAU, the fundamental building block of the GAC layer, which comprises Temporal Attention, Spatial Channel Attention, and Gating sub-modules. (c) shows common SNN ResNet architectures. The Conv layer in SEW-ResNet uses multiply-accumulate operations rather than spike-based computation. Spiking-ResNet is spike-driven under direct coding, but using GAC as its encoder disrupts this property; see the discussion for details. MS-ResNet avoids floating-point multiplications, preserving its spike-driven nature. Hence, we use MS-ResNet to benefit from neuromorphic implementations.
  • Figure 3: Visualization results. (a) Original image. (b)-(e) Encoding results of direct coding (top) and GAC (bottom) at different time steps. Compared to direct coding, GAC enhances dynamics by introducing variations at each time step.
  • Figure 4: Ablation study on CIFAR100.
  • Figure 5: Firing rate advantage on the ImageNet dataset.
  • ...and 2 more figures
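
The captions of Figures 1(b) and 5 bear on the energy argument: MAC operations are costlier than the accumulate (AC) operations triggered by spikes, and lower firing rates mean fewer ACs. The following is a minimal sketch of the kind of energy accounting commonly used for SNNs, not the paper's exact formula. The per-operation energies (4.6 pJ per MAC, 0.9 pJ per AC, widely quoted 45 nm estimates) are assumptions not stated in this summary, the function name `snn_energy_pj` is hypothetical, the encoder is assumed to be evaluated once with MACs, and all FLOP counts and firing rates are made-up illustration values.

```python
E_MAC_PJ = 4.6   # assumed energy per 32-bit multiply-accumulate, in pJ
E_AC_PJ = 0.9    # assumed energy per 32-bit accumulate, in pJ


def snn_energy_pj(encoder_flops, layer_flops, firing_rates, time_steps):
    """Rough inference-energy estimate (pJ) for an SNN with a float encoder.

    The encoder sees real-valued inputs, so it is charged MAC cost (once).
    Each later layer is charged AC cost, scaled by its firing rate and by T.
    """
    if len(layer_flops) != len(firing_rates):
        raise ValueError("need one firing rate per spiking layer")
    encoder = E_MAC_PJ * encoder_flops
    spiking = sum(
        E_AC_PJ * time_steps * rate * flops
        for flops, rate in zip(layer_flops, firing_rates)
    )
    return encoder + spiking


# Hypothetical network: one encoder layer and two spiking layers.
encoder_flops = 1_000_000
layer_flops = [10_000_000, 10_000_000]

total = snn_energy_pj(encoder_flops, layer_flops, [0.2, 0.1], time_steps=4)
sparser = snn_energy_pj(encoder_flops, layer_flops, [0.1, 0.05], time_steps=4)

# The same layers evaluated densely with MACs, as in an ANN.
dense = E_MAC_PJ * (encoder_flops + sum(layer_flops))

print(f"spiking: {total / 1e6:.1f} uJ, sparser: {sparser / 1e6:.1f} uJ, "
      f"dense MAC: {dense / 1e6:.1f} uJ")
```

Under these assumptions, keeping MACs confined to the encoder and lowering firing rates or the number of time steps both reduce the estimate, which is the lever the captions point to.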

Theorems & Definitions (9)

  • Definition 1
  • Definition 2
  • Definition 3
  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • Proposition 3
  • proof