DTA: Dual Temporal-channel-wise Attention for Spiking Neural Networks
Minje Kim, Minjun Kim, Xu Yang
TL;DR
This work addresses the challenge of effectively leveraging temporal information in Spiking Neural Networks (SNNs) by introducing Dual Temporal-channel-wise Attention (DTA), which combines Temporal-channel-wise identical Cross Attention (T-XA) and Temporal-channel-wise Non-identical Attention (T-NA) in a single DTA block. By embedding a DTA block into an MS-ResNet backbone, the authors achieve state-of-the-art performance on static and dynamic datasets (CIFAR10/100, ImageNet-1k, CIFAR10-DVS) with fewer time steps, demonstrating improved spike representations and temporal-channel modeling. The contributions include the first integration of both identical and non-identical attention mechanisms for temporal-channel processing in SNNs, detailed designs for T-XA and T-NA (including LTCA/GTCA components), and comprehensive ablations and visual analyses supporting the effectiveness and efficiency of a single-block attention approach. The work offers a practical pathway to more energy-efficient SNNs with enhanced temporal dynamics, aided by publicly released code.
Abstract
Spiking Neural Networks (SNNs) present a more energy-efficient alternative to Artificial Neural Networks (ANNs) by harnessing spatio-temporal dynamics and event-driven spikes. Effective utilization of temporal information is crucial for SNNs, which has led to the exploration of attention mechanisms to enhance this capability. Conventional attention operations either apply an identical operation or employ non-identical operations across target dimensions. We identify that these approaches provide distinct perspectives on temporal information. To leverage the strengths of both, we propose a novel Dual Temporal-channel-wise Attention (DTA) mechanism that integrates identical and non-identical attention strategies. To the best of our knowledge, this is the first attempt to address both the correlation and the dependency of temporal-channel information using both identical and non-identical attention operations. Experimental results demonstrate that the DTA mechanism achieves state-of-the-art performance on both static datasets (CIFAR10, CIFAR100, ImageNet-1k) and a dynamic dataset (CIFAR10-DVS), improving spike representation and capturing complex temporal-channel relationships. We open-source our code: https://github.com/MnJnKIM/DTA-SNN.
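To make the phrase "temporal-channel-wise attention" concrete, the following is a minimal, generic sketch of gating a feature tensor per (time step, channel) pair. It is not the T-XA or T-NA design, whose details are not given in this abstract. The function name `temporal_channel_gate`, the `[T][C][H][W]` layout, and the parameter-free sigmoid gate are all assumptions chosen for illustration; a learned attention module would replace the fixed gate with trainable layers.

```python
import math


def temporal_channel_gate(x):
    """Reweight a feature tensor per (time step, channel) pair.

    x is a nested list with the assumed layout [T][C][H][W].
    Returns (gated, weights), where weights has shape [T][C].
    Illustrative only: a parameter-free gate, not the DTA block.
    """
    gated, weights = [], []
    for frame in x:  # iterate over time steps
        w_row, g_row = [], []
        for fmap in frame:  # iterate over channels
            n = sum(len(row) for row in fmap)
            # Squeeze the spatial dimensions into one scalar per (t, c).
            mean = sum(sum(row) for row in fmap) / n
            # Map the scalar to an attention weight in (0, 1).
            w = 1.0 / (1.0 + math.exp(-mean))
            w_row.append(w)
            # Rescale every spatial location of this (t, c) feature map.
            g_row.append([[v * w for v in row] for row in fmap])
        weights.append(w_row)
        gated.append(g_row)
    return gated, weights


# Tiny input with T=2 time steps, C=2 channels, and 1x1 feature maps.
x = [[[[0.0]], [[1.0]]], [[[2.0]], [[-1.0]]]]
gated, weights = temporal_channel_gate(x)
```

Each entry of `weights` depends only on its own time step and channel, so the gate can emphasize or suppress individual temporal-channel slices while leaving the spatial layout unchanged.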
