Enhancing Adaptive History Reserving by Spiking Convolutional Block Attention Module in Recurrent Neural Networks

Qi Xu, Yuyuan Gao, Jiangrong Shen, Yaxin Li, Xuming Ran, Huajin Tang, Gang Pan

TL;DR

This work tackles the challenge of exploiting temporal context in event-based spatio-temporal data with recurrent spiking neural networks. It introduces SRNN-SCBAM, a framework that fuses a spiking ConvLSTM with a spiking CBAM attention module to adaptively recall history in both spatial and temporal channels, and that is trained with surrogate gradients. The key contributions are the SRNN-SCBAM architecture, the design of channel and spatial attention on the forget gate, and ablations and visualizations showing improved memory efficiency and sparse, informative feature extraction; the method achieves competitive accuracy on CIFAR10-DVS and DVS128-Gesture. The approach advances practical, real-time processing of neuromorphic data by enabling targeted memory invocation and reducing redundancy in spiking sequences.
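To make the phrase "surrogate-gradient training" concrete, here is a minimal sketch of the general idea: the forward pass uses a non-differentiable threshold to emit spikes, while the backward pass substitutes a smooth stand-in derivative. The summary does not say which neuron model or surrogate function the authors use, so the leaky integrate-and-fire update, the sigmoid-based surrogate, and the names `lif_step`, `sigmoid_surrogate_grad`, `tau`, and `alpha` are all illustrative assumptions.

```python
import math

def heaviside_spike(v, threshold=1.0):
    """Forward pass: emit a spike (1.0) once the membrane potential reaches threshold."""
    return 1.0 if v >= threshold else 0.0

def sigmoid_surrogate_grad(v, threshold=1.0, alpha=4.0):
    """Backward-pass stand-in: derivative of sigmoid(alpha * (v - threshold)).

    Assumed surrogate; the true derivative of the step function is zero
    almost everywhere, which would block gradient flow.
    """
    s = 1.0 / (1.0 + math.exp(-alpha * (v - threshold)))
    return alpha * s * (1.0 - s)

def lif_step(v, x, tau=2.0, threshold=1.0):
    """One leaky integrate-and-fire update with a hard reset to 0 after a spike."""
    v = v + (x - v) / tau            # leaky integration toward the input
    spike = heaviside_spike(v, threshold)
    v_next = v * (1.0 - spike)       # reset the potential if the neuron fired
    return v_next, spike

# Drive one neuron with a constant input for four time steps.
v, spikes = 0.0, []
for _ in range(4):
    v, s = lif_step(v, x=1.5)
    spikes.append(s)
print(spikes)  # [0.0, 1.0, 0.0, 1.0]
```

In an autograd framework, the same pairing would typically be packaged as a custom function whose forward method applies the threshold and whose backward method multiplies the incoming gradient by the surrogate.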

Abstract

Spiking neural networks (SNNs) are an efficient class of models for processing spatio-temporal patterns in time series, such as the Address-Event Representation (AER) data collected from a Dynamic Vision Sensor (DVS). Although convolutional SNNs have achieved remarkable performance on these AER datasets, benefiting from the strong spatial feature extraction ability of the convolutional structure, they ignore temporal features related to sequential time points. In this paper, we develop a recurrent spiking neural network (RSNN) model embedded with an advanced spiking convolutional block attention module (SCBAM) component to combine both spatial and temporal features of spatio-temporal patterns. It invokes the history information in spatial and temporal channels adaptively through SCBAM, which brings the advantages of efficient memory calling and history redundancy elimination. The performance of our model was evaluated on the DVS128-Gesture dataset and other time-series datasets. The experimental results show that the proposed SRNN-SCBAM model makes better use of the history information in spatial and temporal dimensions with less memory space, and achieves higher accuracy compared to other models.
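The adaptive history invocation described above can be illustrated with a small NumPy sketch of CBAM-style attention rescaling a ConvLSTM forget gate. This is not the authors' implementation: it uses real-valued rather than spiking activations, replaces the 7×7 convolution of the original CBAM spatial branch with a simple per-pixel mix of the average and max maps, and all names (`channel_attention`, `spatial_attention`, `attended_forget_gate`, `w1`, `w2`, `w_avg`, `w_max`) are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """feat: (C, H, W). Shared two-layer MLP over avg- and max-pooled channel descriptors."""
    avg = feat.mean(axis=(1, 2))                      # (C,)
    mx = feat.max(axis=(1, 2))                        # (C,)
    mlp = lambda d: w2 @ np.maximum(w1 @ d, 0.0)      # ReLU bottleneck MLP
    return sigmoid(mlp(avg) + mlp(mx))[:, None, None]  # (C, 1, 1)

def spatial_attention(feat, w_avg, w_max, bias=0.0):
    """Per-pixel mix of channel-wise avg and max maps (simplified from CBAM's 7x7 conv)."""
    avg = feat.mean(axis=0)                           # (H, W)
    mx = feat.max(axis=0)                             # (H, W)
    return sigmoid(w_avg * avg + w_max * mx + bias)[None, :, :]  # (1, H, W)

def attended_forget_gate(f_gate, w1, w2, w_avg, w_max):
    """Rescale a forget gate with channel attention, then spatial attention."""
    f = f_gate * channel_attention(f_gate, w1, w2)
    return f * spatial_attention(f, w_avg, w_max)

rng = np.random.default_rng(0)
C, H, W, r = 4, 8, 8, 2                               # r: assumed channel-reduction ratio
f_gate = sigmoid(rng.standard_normal((C, H, W)))      # forget-gate activations in (0, 1)
c_prev = rng.standard_normal((C, H, W))               # previous cell state (the "history")
w1 = rng.standard_normal((C // r, C))
w2 = rng.standard_normal((C, C // r))

f_att = attended_forget_gate(f_gate, w1, w2, w_avg=1.0, w_max=1.0)
c_kept = f_att * c_prev                               # history retained after attention
print(f_att.shape)  # (4, 8, 8)
```

Because both attention maps lie in (0, 1), the attended gate never exceeds the original gate in this sketch, so attention can only suppress history further. Channels and locations with low attention weights contribute less of the previous cell state, which is one way to read "history redundancy elimination".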

Paper Structure (14 sections, 7 equations, 9 figures, 1 table)

Figures (9)

  • Figure 1: The framework of the proposed RSNN-SCBAM. (a) The network structure of the RSNNs with spiking ConvLSTMs and fully-connected layers. The spiking CBAM adaptively selects the key features in both the spatial and temporal domains, and its effect on the forget gate of the spiking ConvLSTMs invokes the history memories efficiently and eliminates history redundancy, thus improving the processing of spatiotemporal patterns. (b) The spiking ConvLSTMs exploit spatiotemporal features through the proposed spiking convolutional component and spiking LSTM module. (c) The spiking CBAM component captures sparse and complementary key features in the spatial and channel domains.
  • Figure 2: Visualization of the extracted features of RSNNs with and without SCBAM. We draw the feature maps in the spiking ConvLSTM layer at time steps $0^{\text{th}}$, $5^{\text{th}}$, $10^{\text{th}}$, $15^{\text{th}}$, and $19^{\text{th}}$. The RSNNs with spiking CBAM capture sparser and more complementary features than the RSNNs without spiking CBAM.
  • Figure 3: Visualization of input neuromorphic data with a size of $2 \times 128 \times 128$. We plot the feature maps of the input image at time steps $0^{\text{th}}$, $5^{\text{th}}$, $10^{\text{th}}$, $15^{\text{th}}$, and $19^{\text{th}}$.
  • Figure 4: Experimental parameter settings.
  • Figure 5: Accuracy of different solutions for the DVS128-Gesture dataset (11 classes).
  • ...and 4 more figures