Exploring the Limitations of Layer Synchronization in Spiking Neural Networks

Roel Koopman; Amirreza Yousefzadeh; Mahyar Shahsavari; Guangzhi Tang; Manolis Sifalakis

TL;DR

This paper investigates the mismatch between layer-synchronized training and end-to-end asynchronous inference in spiking neural networks (SNNs). It introduces unlayered backpropagation, a training approach that incorporates asynchronous neuron scheduling and vectorized execution to prepare models for truly event-driven processing. Across multiple spatio-temporal benchmarks, asynchronous training can recover or improve accuracy, reduce spike counts by up to $50\%$, and speed up inference by up to $2\times$, albeit with substantial training-time costs. The work highlights the need to co-design training algorithms and neuromorphic hardware to obtain energy- and latency-efficient AI systems.

Abstract

Neural-network processing in machine learning applications relies on layer synchronization. This is practiced even in artificial Spiking Neural Networks (SNNs), which are touted as consistent with neurobiology, even though processing in the brain is in fact asynchronous. A truly asynchronous system, however, would allow all neurons to concurrently evaluate their threshold and emit spikes upon receiving any presynaptic current. Omitting layer synchronization is potentially beneficial for latency and energy efficiency, but asynchronous execution of models previously trained with layer synchronization may entail a mismatch in network dynamics and performance. We present and quantify this problem, and show that models trained with layer synchronization either perform poorly in the absence of synchronization, or fail to benefit from any energy and latency reduction when such a mechanism is in place. We then explore a potential solution direction, based on a generalization of backpropagation-based training that integrates knowledge about an asynchronous execution scheduling strategy, for learning models suitable for asynchronous processing. We experiment with two asynchronous neuron execution scheduling strategies on datasets that encode spatial and temporal information, and we show the potential of asynchronous processing to use fewer spikes (up to 50%), complete inference faster (up to 2x), and achieve competitive or even better accuracy (up to 10% higher). Our exploration affirms that asynchronous event-based AI processing can indeed be more efficient, but we need to rethink how we train our SNN models to benefit from it. (Source code available at: https://github.com/RoelMK/asynctorch)
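
The difference between the two execution modes described in the abstract can be illustrated with a single integrate-and-fire neuron. The following is a minimal sketch, not the authors' implementation: the function names, the `>=` threshold test, and the reset-to-zero rule are assumptions made for illustration. With layer synchronization, all presynaptic currents are summed before the threshold is evaluated once; asynchronously, the threshold is evaluated after every incoming current.

```python
def spikes_layer_synchronized(currents, threshold=1.0):
    """Integrate all currents together, then evaluate the threshold once."""
    membrane = sum(currents)
    return 1 if membrane >= threshold else 0


def spikes_asynchronous(currents, threshold=1.0):
    """Evaluate the threshold after each current, in arrival order."""
    membrane = 0.0
    spike_count = 0
    for current in currents:
        membrane += current
        if membrane >= threshold:
            spike_count += 1
            membrane = 0.0  # assumed reset rule (reset to zero)
    return spike_count


# Hypothetical synaptic currents with opposite-sign weights.
currents = [1.0, -0.75, 1.0, -0.75]
sync_count = spikes_layer_synchronized(currents)
async_count = spikes_asynchronous(currents)
print(sync_count, async_count)
```

In this toy input the opposite-sign currents cancel under synchronization, so the neuron stays silent, while the asynchronous neuron fires on the first current. The same weights therefore produce different firing patterns depending on the execution mode, which is the mismatch the paper quantifies.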

Paper Structure (37 sections, 10 equations, 12 figures, 7 tables, 1 algorithm)

Introduction
Related work
Methods for simulating and training of asynchronous SNNs
Simulating asynchronous SNNs
Asynchronous inference with event-driven state updates
Two resolutions of time
Vectorized network asynchrony
Imitating neuromorphic accelerator hardware
Training asynchronous SNNs
Unlayered backpropagation
Regularization techniques
Results
Experimental setup
Network asynchrony increases neuron reactivity
Unlayered backpropagation recovers accuracy and increases sparsity
...and 22 more sections

Figures (12)

Figure 1: (A) Event-based processing. (B) Event-based processing with per-layer synchronization: latency is a function of the events in the system as well as the number of layers (synchronization barriers). (C) Asynchronous (end-to-end) event-based processing: event processing does not encounter synchronization barriers, and so inference latency is determined solely by the subset of the total number of events processed until a decision is made at the output layer. Color codes indicate at which layer events were fired. The x-axis of the plots represents the time/order in which events were processed. The vertical planes in (B) represent layer-synchronization barriers (also indicating bottlenecks in network routing and memory I/O). The difference between (B) and (C) can be understood through an analogy: bikes racing in a circuit versus on a public road with a sequence of traffic lights.
Figure 2: Layer synchronization ensures that activations from all neurons ($x_i$) in a layer are available simultaneously to any post-synaptic neuron ($y$) in the next layer. Hence they are integrated together and the threshold of the post-synaptic neuron is evaluated only once (see top-left graph). As a result the currents arriving from different synapses often cancel each other out given opposite sign weights, resulting in fewer spikes and a different firing pattern than when each synaptic current is processed independently (see bottom-left graph), where each input is more likely to trigger a spike, as is the case in asynchronous processing. Thus, in SNNs, layer synchronization leads to the volume of spikes being reduced and the activation flow dynamics being dulled. This is not an issue (at least not as grave) in ANNs because activations communicate relative magnitudes.
Figure 3: (a) The forward pass with layer synchronization, where layer $0 \leq j < L$ receives spike input $s_j$, creating currents that are integrated into membrane potentials $u_j$, which are carried over to the next timestep, and potentially emitting new spikes to the next layer if the spike threshold is exceeded. (b) The backward pass of a single timestep for layered backpropagation. The structure of the computational graph is fixed across timesteps for a network with constant depth. (c) The forward pass for unlayered backpropagation, where $s_{\text{selected}}^i$ represents a selection of spikes from anywhere in the network propagated during the $i$th forward step, and $U$ denotes the membrane potentials of all neurons in the network. The membrane potentials are carried over to the next timestep at the end of the forward pass. Spikes from the last layer are pulled out of the loop as output. (d) The backward pass for unlayered backpropagation of a single timestep that consisted of $n+1$ forward steps. Unlike layered backpropagation, the structure of the computational graph varies across timesteps.
Figure 4: (Top row) Number of currents integrated by a neuron before spiking, recorded per neuron and per forward pass for all samples and neurons (excluding the neurons in the input layer). The Y-axis shows the relative frequency of the number of currents integrated before spiking. (Bottom row) Mean number of spikes per neuron during inference of all samples in the test set. Error bars show the 25th and 75th percentiles. (Models in this figure were trained with layered backpropagation.)
Figure 5: Accuracy as a function of forward steps after the first spike in the output layer. Given that $F=8$, each extra forward step processes another 8 spikes, assuming enough spikes are available. Dashed lines show the accuracy after all spike activity has been "drained" out of the network.
...and 7 more figures
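
The forward loop described in the captions of Figure 3(c) and Figure 5 can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the asynctorch API: the function `run_unlayered`, the sparse `weights` dictionary, the first-in-first-out spike selection, the `>=` threshold test, and the reset-to-zero rule are all hypothetical choices. Each forward step selects up to $F$ pending spikes from anywhere in the network, integrates their currents into the shared membrane potentials $U$, and pulls output-layer spikes out of the loop.

```python
from collections import deque


def run_unlayered(weights, num_neurons, input_spikes, output_ids,
                  F=8, threshold=1.0, max_steps=1000):
    """Run forward steps until no spikes are pending (or max_steps is hit).

    weights[pre] maps each postsynaptic neuron id to a synaptic weight.
    Returns (output spikes as (step, neuron id), steps taken, final U).
    """
    U = [0.0] * num_neurons        # membrane potentials of all neurons
    pending = deque(input_spikes)  # spikes waiting to be propagated
    outputs = []
    step = 0
    while pending and step < max_steps:
        # Select up to F spikes; FIFO order is an assumed scheduling policy.
        selected = [pending.popleft() for _ in range(min(F, len(pending)))]
        # Integrate the currents of all selected spikes together.
        for pre in selected:
            for post, w in weights.get(pre, {}).items():
                U[post] += w
        # Evaluate thresholds once per forward step.
        for neuron in range(num_neurons):
            if U[neuron] >= threshold:
                U[neuron] = 0.0  # assumed reset rule
                if neuron in output_ids:
                    outputs.append((step, neuron))  # pulled out as output
                else:
                    pending.append(neuron)
        step += 1
    return outputs, step, U


# Toy network: inputs 0 and 1 feed hidden neuron 2, which feeds output 3.
toy_weights = {0: {2: 0.5}, 1: {2: 0.5}, 2: {3: 1.0}}
out_f1, steps_f1, _ = run_unlayered(toy_weights, 4, [0, 1], {3}, F=1)
out_f2, steps_f2, _ = run_unlayered(toy_weights, 4, [0, 1], {3}, F=2)
print(out_f1, steps_f1)
print(out_f2, steps_f2)
```

In this toy network a larger $F$ lets both input spikes be integrated in the same forward step, so the output spike appears one step earlier. Because the set of selected spikes changes from step to step, the sequence of operations is not fixed by the layer structure, which is why the caption notes that the computational graph varies across timesteps.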
