Scalable Network Emulation on Analog Neuromorphic Hardware

Elias Arnold; Philipp Spilger; Jan V. Straub; Eric Müller; Dominik Dold; Gabriele Meoni; Johannes Schemmel

TL;DR

The ability to emulate and train networks larger than the substrate provides a pathway for accurate performance evaluation in planned or scaled systems, ultimately advancing the development and understanding of large-scale models and neuromorphic computing architectures.

Abstract

We present a novel software feature for the BrainScaleS-2 accelerated neuromorphic platform that facilitates the partitioned emulation of large-scale spiking neural networks. This approach is well suited for deep spiking neural networks and allows for sequential model emulation on undersized neuromorphic resources if the largest recurrent subnetwork and the required neuron fan-in fit on the substrate. The ability to emulate and train networks larger than the substrate provides a pathway for accurate performance evaluation in planned or scaled systems, ultimately advancing the development and understanding of large-scale models and neuromorphic computing architectures. We demonstrate the training of two deep spiking neural network models, using the MNIST and EuroSAT datasets, that exceed the physical size constraints of a single-chip BrainScaleS-2 system.

Paper Structure (11 sections, 2 equations, 4 figures, 3 tables)

Introduction
Methods
Training
MNIST
EuroSAT
Results
Software
Examples
MNIST
EuroSAT
Discussion

Figures (4)

Figure 1: (A) A photo of the chip with its schematic overlaid on top. (B) A larger-scale network, exceeding the size of a single substrate. To emulate the full network, it can be partitioned into smaller subnetworks that are either executed concurrently on a multi-chip setup, as displayed in (C), or emulated sequentially by reusing the same chip resource. The concept of sequential execution also applies to networks that exceed a scaled multi-chip system in size; the scaled system then becomes the largest sequentially allocatable entity. Dashed lines correspond to recurrent dependencies. (D) Upper: On BrainScaleS-2, convolutions need to be unrolled spatially, thereby demanding excessive hardware resources and partitioning. Here, $W$ and $H$ correspond to the width and height of the kernel, and $C^\text{i}$ and $C^\text{o}$ are the numbers of input and output feature planes, respectively. Lower: For sequential network emulation, recurrent dependencies need to fit on a single substrate, which reduces external fan-in. However, this limitation does not apply to concurrent network emulation. (E) Software representation of the explicitly partitioned network indicated by the dotted red line in (B) and (C).
Figure 2: (A) Schematic network topology for a network of 28×28 $\rightarrow$ 256 $\rightarrow$ 10 neurons. Partitions that can be run consecutively on hardware are marked. The four partitions in the first layer are interchangeable. (B) Data flow of the model from (A) using five partitions, where the additional need to record and play back events to/from the host computer in-between layers is visualized by dashed lines. (C) Measured spikes and membrane potentials of each hardware run. To run the fifth partition, the spikes from the first four partitions need to be known. On a multi-chip setup with at least five chips, all parts could be run in parallel.
Figure 3: (A) (left) Example image from the EuroSAT dataset. (middle) The image encoded. (B) Partitioning and placement of the network used to classify the EuroSAT dataset. The basic synapse and neuron layout of the ASIC is shown in each column: in the center, two rows of neuron circuits are located; each neuron row is fed from the adjacent synapse array (top/bottom rectangles). Neuron circuits can be combined to form larger logical neurons, supporting a larger fan-in. On the left of each hardware instance, the source and size of the fan-in are indicated. Each neuron in the first hidden layer has a receptive field of $3 \times 3 \times 3$ and can be mapped to one instance. To reduce the number of input spikes, we run it in multiple parts (indicated by the red box). The neurons in the second layer consist of four connected neuron circuits. This layer, as well as the readout layer, is executed in a single run each.
Figure 4: Accuracy (left) and loss (right) of the model on the EuroSAT dataset in simulation and/or on BrainScaleS-2. The solid lines correspond to the training set, the dotted lines to the validation set. Blue corresponds to a fully simulated network, green to the whole partitioned network emulated on BrainScaleS-2, and orange to mixed simulation/BrainScaleS-2 execution with only the first layer being emulated on BrainScaleS-2.
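
The sequential scheme described in the Figure 2 caption (four interchangeable first-layer partitions, followed by a fifth run that consumes their recorded spikes) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `run_partition` is a toy discrete-time leaky integrate-and-fire model standing in for one hardware run, not a BrainScaleS-2 software API, and the function names and the equal-size split of the hidden layer are hypothetical.

```python
import random


def run_partition(weights, in_spikes, leak=0.9, threshold=1.0):
    """Toy stand-in for one hardware run (hypothetical, not a BrainScaleS-2 API).

    weights:   list of rows; weights[i][j] connects input i to neuron j
    in_spikes: list of time steps, each a list of 0/1 input events
    Returns the recorded output spikes, one 0/1 list per time step.
    """
    n_out = len(weights[0])
    v = [0.0] * n_out  # membrane potentials of the neurons in this partition
    recorded = []
    for spikes_t in in_spikes:
        active = [i for i, s in enumerate(spikes_t) if s]
        out_t = []
        for j in range(n_out):
            v[j] = leak * v[j] + sum(weights[i][j] for i in active)
            if v[j] >= threshold:
                out_t.append(1)
                v[j] = 0.0  # reset after a spike
            else:
                out_t.append(0)
        recorded.append(out_t)
    return recorded


def split_columns(weights, n_partitions):
    """Split a layer's neurons (weight columns) into contiguous blocks."""
    n_out = len(weights[0])
    size = -(-n_out // n_partitions)  # ceiling division
    return [[row[s:s + size] for row in weights] for s in range(0, n_out, size)]


def run_partitioned(w_hidden, w_out, in_spikes, n_partitions=4):
    """Emulate hidden-layer partitions one after another, then the output layer."""
    # Each block is a separate sequential run that sees the same input spikes.
    recorded = [run_partition(block, in_spikes)
                for block in split_columns(w_hidden, n_partitions)]
    # Play the recorded spikes of all hidden partitions back as one input stream.
    hidden = [[s for rec in recorded for s in rec[t]]
              for t in range(len(in_spikes))]
    # Final run: it can only start once all hidden spikes are known.
    return run_partition(w_out, hidden)


if __name__ == "__main__":
    random.seed(1)
    x = [[1 if random.random() < 0.2 else 0 for _ in range(16)] for _ in range(5)]
    w_h = [[random.uniform(-0.5, 0.5) for _ in range(8)] for _ in range(16)]
    w_o = [[random.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(8)]
    out = run_partitioned(w_h, w_o, x, n_partitions=4)
    print(len(out), "time steps,", len(out[0]), "output neurons")
```

Because the neurons within a feed-forward layer do not depend on each other, the partitioned run of this toy model reproduces the unpartitioned result exactly. On analog hardware, run-to-run variability would be expected to break that exact equivalence.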
