
Training a multilayer dynamical spintronic network with standard machine learning tools to perform time series classification

Erwan Plouet, Dédalo Sanz-Hernández, Aymeric Vecchiola, Julie Grollier, Frank Mizrahi

TL;DR

This work addresses energy-efficient time-series processing by implementing a recurrent neural network in hardware using spintronic oscillators as dynamical neurons. The authors build and train a multilayer spintronic network via backpropagation through time (BPTT) using standard ML tools, achieving time-series classification performance comparable to a software CTRNN on the sequential digits task. They derive design guidelines linking neuron relaxation time and input time scales, demonstrate robustness across a fivefold range of time scales, and explore how sparsity affects performance, estimating low-energy operation (around 40 pJ per image) for a modest network. Overall, the results validate spintronic dynamical networks as scalable, energy-efficient candidates for real-time time-series processing and provide practical guidance for hardware-aware training and design.

Abstract

The ability to process time series at low energy cost is critical for many applications. Recurrent neural networks, which can perform such tasks, are computationally expensive when implemented in software on conventional computers. Here we propose to implement a recurrent neural network in hardware using spintronic oscillators as dynamical neurons. Using numerical simulations, we build a multilayer network and demonstrate that we can use backpropagation through time (BPTT) and standard machine learning tools to train this network. Leveraging the transient dynamics of the spintronic oscillators, we solve the sequential digits classification task with $89.83\pm2.91~\%$ accuracy, as good as the equivalent software network. We devise guidelines on how to choose the time constant of the oscillators as well as the hyperparameters of the network to adapt to different input time scales.

Paper Structure (6 sections, 11 equations, 5 figures)

Figures (5)

  • Figure 1: Architecture of the network. (a) Schematic of the network architecture with three layers of 32 neurons each. The neurons are represented by blue circles, the interlayer connections ($W_{ext}$) by purple boxes, the intralayer connections ($W_{int}$) by green boxes, the amplification factors and SoftMax by white boxes, and the high-pass filters by orange boxes. The dimensions of the connections are indicated in the corresponding boxes. At each time step the input is of size one. The output is of size 10. For simplicity, the fixed and trainable biases are not shown in this figure. (b-c-d) Normalized output RF power versus time, emitted respectively by neuron 0 of layer 0, neuron 0 of layer 1, and neuron 1 of layer 2. The colors (blue, orange, green, red, purple, brown) correspond to different inputs from different classes (respectively 5,6,9,3,3,8).
  • Figure 2: Performance on the sequential digits classification task. (a) Schematic of the task: the images are scanned row by row and the pixel intensity is used as input to the network. The violet and pink curves represent the input values versus time for an image of label 0 and label 1 respectively. (b) Accuracy of the network versus the time interval between two input points. The red curve corresponds to the simulated spintronic network while the green curve corresponds to a standard continuous-time recurrent neural network (CTRNN). Each accuracy point corresponds to the median accuracy over a series of 10 random initialisations; the shaded areas correspond to the range between the $25^{th}$ and $75^{th}$ percentiles. (c-d-e) Normalized output versus time for neuron 3, neuron 1, and neuron 2 of layer 2, respectively. The colors (blue, orange, green, red, purple, brown) correspond to different inputs from different classes (respectively 5,6,9,3,3,8). Each of the three panels corresponds to a different time interval between input points, shown by red circles and red dashed lines in panel (b). The neuron dynamics are respectively too slow, adapted, and too fast for the input time scale.
  • Figure 3: Time scale adaptation. (a) Accuracy versus the time between two input points $\Delta t$ for three neuron relaxation times $\tau$ (colors). (b) Accuracy versus the time between two input points $\Delta t$ for three learning rates (colors). For (a) and (b), each point corresponds to the median accuracy on a series of 10 random initialisations, the shaded areas correspond to the range between the $75^{th}$ and $25^{th}$ percentiles. (c) Accuracy versus the cumulative drive, for a wide range of parameters. Each point corresponds to the median accuracy on a series of 10 random initialisations. The points in green (resp. orange) have a neuron relaxation time $\tau$ larger (resp. smaller) than the input time scale $\Delta t$. The vertical red line marks where the cumulative drive is equal to 1.
  • Figure 4: Effect of the density of connections. The red curve shows the accuracy versus the density of the intralayer connections ($W_{int}$), for an interlayer density of 1. The green curve shows the accuracy versus the density of the interlayer connections ($W_{ext}$), for an intralayer density of 1. The black curve shows the accuracy versus the density of all connections. Each point corresponds to the median accuracy on a series of 10 random initialisations, the shaded areas correspond to the range between the $75^{th}$ and $25^{th}$ percentiles.
  • Figure 5: Right: Evolution of the accuracy (in % of correctly classified inputs) versus the number of epochs. Left: Evolution of the loss versus the number of epochs. The train dataset is represented with solid lines and the test dataset with dashed lines. The colors blue, green and red correspond to global connection densities of 50%, 25% and 10% respectively.
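
Three ingredients from the captions above can be made concrete in a few lines: the row-by-row scan that turns an image into a scalar time series (Figure 2a), a connection-density mask applied to a weight matrix (Figure 4), and the median with 25th and 75th percentiles over repeated runs (Figures 2 to 4). What follows is a minimal sketch and not the authors' code. The function names, the 8x8 image size, the unit time step, and the choice of a uniformly random mask with an exact number of kept connections are all assumptions made for illustration; the captions do not state how sparsity is imposed.

```python
import numpy as np


def image_to_sequence(image, dt):
    """Scan a 2D image row by row into a scalar time series.

    Returns (times, values): one pixel intensity per time step,
    with consecutive input points separated by dt.
    """
    image = np.asarray(image, dtype=float)
    values = image.reshape(-1)           # row-major flattening = row-by-row scan
    times = np.arange(values.size) * dt  # time stamp of each input point
    return times, values


def random_density_mask(shape, density, rng):
    """Binary mask keeping round(density * size) randomly chosen connections.

    Assumption: connections are removed uniformly at random.
    """
    if not 0.0 <= density <= 1.0:
        raise ValueError("density must be in [0, 1]")
    n_total = int(np.prod(shape))
    n_keep = int(round(density * n_total))
    flat = np.zeros(n_total)
    flat[rng.choice(n_total, size=n_keep, replace=False)] = 1.0
    return flat.reshape(shape)


def median_and_quartiles(accuracies):
    """Median, 25th and 75th percentiles of accuracies from repeated runs."""
    q25, q50, q75 = np.percentile(accuracies, [25, 50, 75])
    return q50, q25, q75


# Hypothetical usage: all sizes and values below are illustrative only.
rng = np.random.default_rng(0)
image = rng.random((8, 8))                 # assumed image size
times, values = image_to_sequence(image, dt=1.0)

w_int = rng.normal(size=(32, 32))          # intralayer weights, 32 neurons per layer
mask = random_density_mask(w_int.shape, 0.25, rng)
w_int_sparse = w_int * mask                # masked-out connections are set to zero
```

In a training loop the mask would typically be reapplied after each weight update so that removed connections stay at zero; whether the original work does this is not stated in the captions.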