Spatial-Temporal Search for Spiking Neural Networks

Kaiwei Che, Zhaokun Zhou, Li Yuan, Jianguo Zhang, Yonghong Tian, Luziwei Leng

TL;DR

Inspired by the heterogeneity of biological neural networks, this work proposes a differentiable approach to optimizing SNNs along both the spatial and temporal dimensions, together with a differentiable surrogate gradient (SG) search method that evolves local SG functions independently during training.

Abstract

Spiking Neural Networks (SNNs) are considered a potential candidate for the next generation of artificial intelligence, with appealing characteristics such as sparse computation and inherent temporal dynamics. By adopting architectures of Artificial Neural Networks (ANNs), SNNs achieve competitive performance on benchmark tasks like image classification. However, successful ANN architectures are not optimal for SNNs. In this work, we apply Neural Architecture Search (NAS) to find suitable architectures for SNNs. Previous NAS methods for SNNs focus primarily on the spatial dimension and largely neglect the temporal dynamics that are critically important for SNNs. Drawing inspiration from the heterogeneity of biological neural networks, we propose a differentiable approach to optimize SNNs on both the spatial and temporal dimensions. At the spatial level, we develop a spike-based differentiable hierarchical search (SpikeDHS) framework, in which spike-based operations are optimized at both the cell and the layer level under computational constraints. We further propose a differentiable surrogate gradient search (DGS) method to evolve local surrogate gradient (SG) functions independently during training. At the temporal level, we explore an optimal configuration of diverse temporal dynamics across different types of spiking neurons by evolving their time constants, and on this basis we further develop hybrid networks that combine SNNs and ANNs, balancing accuracy and efficiency. Our methods achieve comparable classification performance on CIFAR10/100 and ImageNet, with accuracies of 96.43%, 78.96%, and 70.21%, respectively. On event-based deep stereo, our methods find the optimal layer variation and surpass the accuracy of specially designed ANNs at 26$\times$ lower computational cost ($6.7\mathrm{mJ}$), demonstrating the potential of SNNs in processing highly sparse and dynamic signals.

Paper Structure

This paper contains 39 sections, 25 equations, 8 figures, 11 tables, 3 algorithms.

Figures (8)

  • Figure 1: Spatio-temporal search of SNNs, including SpikeDHS, DGS, and TPS. SpikeDHS performs a hierarchical search for SNNs at both the cell and layer level, with the latter optimizing the resolution of the feature map. A cell contains several nodes (e.g. three, shown here as green circles) whose connections are either within the cell or from previous cells, forming a directed acyclic graph. After receiving all operations (e.g., two here), a node is activated by a spiking neuron. Then, DGS explores the optimal surrogate gradient function for spiking neurons during the training process. Based on the searched architecture, TPS explores the optimal temporal dynamics at the neuron level. In practice, the three methods can be combined or used independently.
  • Figure 2: Spiking activation position. (a) Mixed operation at the spike activation; (b) Mixed operation at the membrane potential. Two operations are shown here as an example.
  • Figure 3: Difference between the direct gradient method (a) PLIF and our methods (b) TPS and (c) HNS. PLIF parameterizes temporal parameters through direct gradient optimization. In contrast, TPS combines neurons in a mixed operation, where each neuron's contribution is weighted by a factor ($a$). Over multiple steps, these factors are updated based on gradients, and the neuron with the highest $a$ value is selected as the optimal choice. HNS follows a similar process to TPS but integrates both artificial and spiking neurons.
  • Figure 4: Qualitative comparison on MVSEC. Disparity maps from different methods are shown for the same frames.
  • Figure 5: Search process of TPS on the (a) CIFAR100 and (b) MVSEC datasets. Each snapshot is retrained from scratch for fewer epochs and evaluated on the validation set. The results are averaged over multiple experiments.
  • ...and 3 more figures
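
The mixed operation described in the Figure 3 caption can be illustrated with a small forward-pass sketch. This is not the authors' implementation. It assumes a simple leaky integrate-and-fire update with a hard reset, candidate neurons that differ only in their time constant, and a softmax over the factors $a$ (as in DARTS-style search); none of these details are specified in the text above. The names `lif_spikes`, `mixed_neuron`, and `select_neuron` are hypothetical, and the gradient update of $a$ is omitted.

```python
import math


def lif_spikes(inputs, tau, v_th=1.0):
    """Run one leaky integrate-and-fire neuron over a sequence of input currents.

    Assumed dynamics (illustrative only): v <- v * (1 - 1/tau) + x,
    emit a spike when v >= v_th, then hard-reset v to 0.
    """
    v = 0.0
    spikes = []
    for x in inputs:
        v = v * (1.0 - 1.0 / tau) + x
        fired = v >= v_th
        spikes.append(1.0 if fired else 0.0)
        if fired:
            v = 0.0
    return spikes


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(a - m) for a in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_neuron(inputs, taus, alphas):
    """Weighted sum of candidate neuron outputs at every time step.

    Each candidate (one per time constant in `taus`) sees the same input;
    its output is scaled by softmax(alphas). The softmax is an assumption.
    """
    weights = softmax(alphas)
    outputs = [lif_spikes(inputs, tau) for tau in taus]
    return [
        sum(w * out[t] for w, out in zip(weights, outputs))
        for t in range(len(inputs))
    ]


def select_neuron(taus, alphas):
    """After search, keep the candidate with the largest factor a."""
    best = max(range(len(alphas)), key=lambda i: alphas[i])
    return taus[best]


if __name__ == "__main__":
    currents = [0.6, 0.6, 0.6, 0.6]
    candidate_taus = [2.0, 4.0]
    factors = [0.0, 0.0]  # equal weighting at the start of the search
    print(mixed_neuron(currents, candidate_taus, factors))
    print(select_neuron(candidate_taus, [0.1, 0.7]))
```

During search the mixed output is a soft blend of all candidates, so every factor receives a gradient signal; only the final selection step is discrete. Under the same assumptions, HNS would add an artificial (non-spiking) activation to the candidate list.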