The Backpropagation Algorithm Implemented on Spiking Neuromorphic Hardware

Alpha Renner, Forrest Sheldon, Anatoly Zlotnik, Louis Tao, Andrew Sornborger

TL;DR

This study presents a neuromorphic, spiking backpropagation algorithm based on synfire-gated dynamical information coordination and processing implemented on Intel’s Loihi neuromorphic research processor and demonstrates a proof-of-principle three-layer circuit that learns to classify digits and clothing items from the MNIST and Fashion MNIST datasets.

Abstract

The capabilities of natural neural systems have inspired new generations of machine learning algorithms as well as neuromorphic very large-scale integrated (VLSI) circuits capable of fast, low-power information processing. However, it has been argued that most modern machine learning algorithms are not neurophysiologically plausible. In particular, the workhorse of modern deep learning, the backpropagation algorithm, has proven difficult to translate to neuromorphic hardware. In this study, we present a neuromorphic, spiking backpropagation algorithm based on synfire-gated dynamical information coordination and processing, implemented on Intel's Loihi neuromorphic research processor. We demonstrate a proof-of-principle three-layer circuit that learns to classify digits from the MNIST dataset. To our knowledge, this is the first work to show a Spiking Neural Network (SNN) implementation of the backpropagation algorithm that is fully on-chip, without a computer in the loop. It is competitive in accuracy with off-chip trained SNNs and achieves an energy-delay product suitable for edge computing. This implementation shows a path for using in-memory, massively parallel neuromorphic processors for low-power, low-latency implementation of modern deep learning applications.

Paper Structure

This paper contains 33 sections, 18 equations, 5 figures, 3 tables.

Figures (5)

  • Figure 1: Overview of conceptual circuit architecture. Feedforward activations of input (${x}$), hidden (${h}$) and output (${o}$) layers are calculated by a feedforward module. Errors (${e} = {t} - {o}$) are calculated from the output and the training signal (${t}$). Errors are backpropagated through a feedback module with the same weights $W_2$ for synapses between ${h}$ and ${o}$, but in the opposite direction (mathematically expressed as the transpose, $W_2^T$). Local gradients (${d}_{1},\,{d}_{2}$) are gated back into the feedforward circuit at appropriate times to accomplish potentiation or depression of appropriate weights.
  • Figure 2: Functional connectivity of the 2-layer backpropagation circuit. Layers are only shown when they are gated 'on' and synapses are only shown when their target is gated on. Plastic connections are all-to-all (fully connected), i.e. all neurons are connected to all neurons of the next layer. The gating connections from the gating chain are one-to-all, and all other connections are one-to-one, which means that a firing pattern is copied directly to the following layer. The names of the neuron layers are given on the left margin so that the row corresponds to layer identity. The columns correspond to time steps of the algorithm, which are the same as the time steps on Loihi. Table \ref{tableMNISTcircuit} shows the information contained in each layer in each respective time step. The red background in time steps 5 and 7 indicates that in these steps, the sign of the weight update is inverted (positive), as $r=1$ in Eq. \ref{eq:learning_rule_loihi}. A detailed step-by-step explanation of the algorithm is given in Section \ref{steps} and in Table \ref{tableMNISTcircuit} in the supplementary material. The plot in the top left corner illustrates our approach to approximate the activation function $f$ by a surrogate with the box function as derivative, $f_{\text{surr}}=H(x)H(1-x)$, where $f'$ is the rectified linear map (ReLU) (see Equations \ref{eq:binary_thr}, \ref{eq:trelu} and \ref{eq:box}).
  • Figure 3: Accuracy and loss (mean squared error) over epochs. Note separate axis scaling for accuracy (left) and loss (right).
  • Figure 4: Anatomical connectivity of the 2-layer backpropagation circuit. While in the Loihi implementation the $o$ layers are connected directly to the $d_2$ layers, here an intermediate fictional $b_o$ layer is added for easier understanding. Arrows that end on the border of a box that encompasses several layers go to each of the layers. The gating chain is not shown, but the small numbers on top of each layer indicate when it is gated on. Colors are the same as in Fig. \ref{fig:FuncConn}.
  • Figure 5: Example raster plot of the spikes over six gating cycles. All populations of the same size are plotted in the same plot and only the first 50 neurons are plotted per layer. To avoid occlusion, a small offset in time is added to the time step of some layers. Refer to Tab. \ref{tableMNISTcircuit} for a detailed explanation of the spike propagation. (1) error (target but no output spike) leads to potentiation of the $W_2$ synaptic weight and the positive transpose; (2) the same error leads to depression of the negative transpose via activity of $d_1$; (3) no error because $o$ and $t$ fire at the same location, i.e. there is no update in this iteration; (4) there is an error ($t$ fires at index 4, but $o$ at index 7), but the local gradient is 0: it is gated 'off' at index 7 because the derivative of the activation function is 0 there, i.e. both $o^<$ and $o^>$ fire, and it is not gated 'on' at index 4 because $o^<$ does not fire; (5) local gradient (output but not target) leads to potentiation of the weight of the synapses from $o^{T-}$ to $d_1$ (red), and (6) depression of $h-o$ and $d_1-o$ synaptic weights; (7) the orange spikes show the back-propagated local gradient from (1), which leads to potentiation of the $x-h$ weights. Note that for visualization purposes, the gating from $b_h$ is applied one time step later directly to $h, h^<$ and $h^>$. That is, the orange spikes in time step 7 are the full backpropagated error, but only the neurons that are also gated 'on' by the combination of $h^<$ and $h^>$ are actually active in the potentiation phase in time step 8; (8) same as (7), but the error from (2) leads to depression of the $x-h$ weights.
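
The conceptual circuit described in the Figure 1 caption (feedforward pass, error ${e} = {t} - {o}$, backpropagation through $W_2^T$, and local gradients ${d}_1,\,{d}_2$) can be illustrated with a small, non-spiking sketch in plain Python. This is not the Loihi implementation and omits the synfire gating entirely. The function names, layer sizes, learning rate, the truncated-ReLU form of the activation, and the boundary convention of the box function are all assumptions made for illustration.

```python
# Minimal, non-spiking sketch of the conceptual circuit in Figure 1.
# Assumptions (not taken from the paper): all function names, the learning
# rate, the truncated-ReLU activation, and the strict box boundaries.

def f(a):
    """Assumed activation: truncated ReLU, clipped to the interval [0, 1]."""
    return min(max(a, 0.0), 1.0)

def f_prime(a):
    """Box function H(a) * H(1 - a); boundary values are a convention here."""
    return 1.0 if 0.0 < a < 1.0 else 0.0

def matvec(W, v):
    """Compute W @ v for a matrix stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def matvec_T(W, v):
    """Compute W^T @ v without building the transpose explicitly."""
    return [sum(W[i][j] * v[i] for i in range(len(W)))
            for j in range(len(W[0]))]

def train_step(W1, W2, x, t, lr=0.1):
    """One forward/backward pass; updates W1 and W2 in place."""
    # Feedforward module: x -> h -> o
    a_h = matvec(W1, x)
    h = [f(a) for a in a_h]
    a_o = matvec(W2, h)
    o = [f(a) for a in a_o]

    # Error from the output and the training signal: e = t - o
    e = [ti - oi for ti, oi in zip(t, o)]

    # Local gradients: d2 at the output, d1 backpropagated through W2^T
    d2 = [ei * f_prime(a) for ei, a in zip(e, a_o)]
    back = matvec_T(W2, d2)
    d1 = [b * f_prime(a) for b, a in zip(back, a_h)]

    # With e = t - o, adding lr * d * (presynaptic activity) lowers the
    # squared error; positive d potentiates, negative d depresses.
    for i, di in enumerate(d2):
        for j, hj in enumerate(h):
            W2[i][j] += lr * di * hj
    for i, di in enumerate(d1):
        for j, xj in enumerate(x):
            W1[i][j] += lr * di * xj
    return o, e
```

Calling `train_step` repeatedly on the same input and target should shrink the error as long as the pre-activations stay inside the interval where the box function is 1. Outside that interval the local gradient is 0 and no update happens, which mirrors case (4) in the Figure 5 caption.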