Spiking mode-based neural networks

Zhanghan Lin, Haiping Huang

TL;DR

SMNN introduces a Hopfield-inspired mode decomposition of recurrent connections in spiking networks, enabling training in a low-dimensional mode-score space and projection of high-dimensional neural activity onto a few dominant modes. By factorizing the $N\times N$ recurrent weight matrix as $W^{\rm rec} = \bm{\xi}^{\rm in}\bm{\Sigma}(\bm{\xi}^{\rm out})^{\top}$ with $\bm{\Sigma}=\mathrm{diag}(\lambda_1,\dots,\lambda_P)$, the method reduces the space complexity of training from $\mathcal{O}(N^2)$ to $\mathcal{O}(2NP+P)$ and reveals attractor-like structures in a reduced mode space. The framework is validated on MNIST and a context-dependent sensory integration task, showing improved efficiency, interpretable low-dimensional dynamics, and robustness to pruning and ablation. These results suggest SMNN as a scalable, interpretable approach to spike-based computation, with potential extensions to sparse excitatory-inhibitory networks and to neuroscience questions about attractor dynamics.
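To make the factorization concrete, the following minimal sketch builds $W^{\rm rec}$ from the three factors and counts the trainable parameters. It is an illustration only, not the authors' implementation: the function names, the Gaussian initialization, and the random seed are assumptions, while $(P,N)=(3,100)$ is taken from the figure captions.

```python
import random


def build_recurrent_weights(xi_in, xi_out, lam):
    """Return W with W[i][j] = sum_mu lam[mu] * xi_in[i][mu] * xi_out[j][mu].

    xi_in and xi_out are N x P nested lists; lam holds the P diagonal scores.
    """
    n = len(xi_in)
    p = len(lam)
    return [
        [sum(lam[m] * xi_in[i][m] * xi_out[j][m] for m in range(p)) for j in range(n)]
        for i in range(n)
    ]


def trainable_parameter_count(n, p):
    """Entries of xi_in (N*P) plus xi_out (N*P) plus the P scores."""
    return 2 * n * p + p


# Hypothetical initialization: standard Gaussian entries (an assumption).
rng = random.Random(0)
N, P = 100, 3
xi_in = [[rng.gauss(0.0, 1.0) for _ in range(P)] for _ in range(N)]
xi_out = [[rng.gauss(0.0, 1.0) for _ in range(P)] for _ in range(N)]
lam = [rng.gauss(0.0, 1.0) for _ in range(P)]

W = build_recurrent_weights(xi_in, xi_out, lam)
print(trainable_parameter_count(N, P))  # 603 parameters in the mode-score space
print(N * N)                            # 10000 entries in the full weight matrix
```

With these sizes the mode-score parametrization holds 603 numbers instead of the 10000 entries of the full matrix, and the saving grows with $N$ as long as $P \ll N$.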

Abstract

Spiking neural networks play an important role in brain-like neuromorphic computation and in studying the working mechanisms of neural circuits. One drawback of training a large-scale spiking neural network is that updating all weights is expensive. Furthermore, after training, all information related to the computational task is hidden in the weight matrix, which prevents a transparent understanding of circuit mechanisms. In this work, we address these challenges by proposing a spiking mode-based training protocol, where the recurrent weight matrix is expressed as a Hopfield-like product of three matrices: input modes, output modes, and a score matrix. The first advantage is that the weight is interpreted through input and output modes and their associated scores, which characterize the importance of each decomposition term. The number of modes is adjustable, allowing more degrees of freedom for modeling experimental data. Because the space complexity of learning is much lower, the training cost is significantly reduced. Training spiking networks is thus carried out in the mode-score space. The second advantage is that one can project the high-dimensional neural activity (filtered spike train) in the state space onto the mode space, which is typically low-dimensional; e.g., a few modes are sufficient to capture the shape of the underlying neural manifolds. We successfully apply our framework to two computational tasks: digit classification and selective sensory integration. Our method accelerates the training of spiking neural networks through a Hopfield-like decomposition, and this training leads to low-dimensional attractor structures of high-dimensional neural dynamics.
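The projection of filtered spike trains onto the mode space can be sketched as follows. This is a hedged illustration: the single-exponential filter, its time constant `tau`, the plain inner product with each mode column, and the toy spike trains and mode values are all assumptions made for the example, and the paper's own filter and projection may differ.

```python
import math


def exponential_filter(spikes, dt=1.0, tau=5.0):
    """Low-pass filter a binary spike train: r[t] = decay * r[t-1] + s[t] / tau."""
    decay = math.exp(-dt / tau)
    r, filtered = 0.0, []
    for s in spikes:
        r = decay * r + s / tau
        filtered.append(r)
    return filtered


def project_onto_modes(rates, modes):
    """Project a length-N activity vector onto P modes (modes is N x P).

    Coefficient mu is the inner product of the mu-th mode column with rates.
    """
    p = len(modes[0])
    return [sum(modes[i][m] * rates[i] for i in range(len(rates))) for m in range(p)]


# Toy data (hypothetical): N = 4 neurons, T = 5 time steps, P = 2 modes.
spike_trains = [
    [1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [1, 1, 0, 0, 1],
]
modes = [[0.5, -1.0], [1.0, 0.0], [-0.5, 0.5], [0.0, 1.0]]

filtered = [exponential_filter(train) for train in spike_trains]
n_steps = len(spike_trains[0])
trajectory = [
    project_onto_modes([filtered[i][t] for i in range(len(filtered))], modes)
    for t in range(n_steps)
]
for t, coeffs in enumerate(trajectory):
    print(t, coeffs)  # one P-dimensional point per time step
```

Each time step yields a point with $P$ coordinates, so an $N$-dimensional trajectory becomes a curve in a $P$-dimensional space that can be plotted directly when $P \le 3$.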

Paper Structure (21 sections, 32 equations, 10 figures, 3 tables)

Figures (10)

  • Figure 1: Model structures for MNIST and contextual integration tasks. (a) Model structure for MNIST task. Each image is converted to a spiking activity input to the recurrent reservoir, by Poisson spiking neurons whose rate is determined by the pixel intensity (see details in the main text). $T=20\, {\rm{ms}}$ in our learning setting. (b) The activity profile of one readout unit for the MNIST task. Only the maximal value is taken for classification. (c) Model structure for contextual integration task. If the cued (context 1 or context 2) input signal is generated using a positive offset value, then the network is supervised to produce an output approaching $+1$ regardless of the irrelevant input signals (e.g., those coming from the other context).
  • Figure 2: Learning performance on the MNIST classification task. Five independent runs are used to estimate the standard deviation. (a) Test accuracy versus mode size. The spiking model without mode-decomposition learning (called SNN) is shown for comparison. SMNN indicates the mode-based learning of spiking networks. (b) Loss function as a function of training mini-batch. Each mini-batch is composed of $100$ digit images. The mode size varies, and the network size is $N=100$. (c) Comparison of the accuracies of rate and spiking models with the same fixed network size $N=200$. Rate networks with mode-decomposition learning (MDL RNN) and without MDL (RNN) are also considered. (d) Membrane potential traces for three typical reservoir neurons in response to spike train inputs after training. The triangle or square marks when the filtered spike train takes its maximum value. $(P,N)=(3,100)$. Note that the three displayed moments of maximal firing rate overlap with each other, but this is not the case for other neurons. (e) Spike trains of reservoir neurons with neuron firing rate (right) and population firing fraction (bottom) in response to an input image of digit $2$. $(P,N)=(3,100)$.
  • Figure 3: Learning performance on the contextual integration task. Five independent runs are used to estimate the standard deviation. (a) Mean squared error (MSE) versus mode size. The spiking model without mode-decomposition learning (called SNN) is shown for comparison. SMNN indicates the mode-based learning. (b) MSE as a function of training mini-batch for different mode sizes. The network size is $N=100$. (c) Average output activity in response to test inputs for $(P,N) = (3,100)$. The shaded region indicates the stimulus period. Sensory inputs are only shown during the stimulus period, followed by a response period. Before the response period, the target output is always set to zero. The fluctuation over $100$ random trials is also shown. Colored lines are the two target outputs. (d) Membrane potential traces for five typical reservoir neurons in response to a random input. (e) Spike raster of reservoir neurons with neuron firing rate (right) and population firing fraction (bottom). $(P,N) = (3,100)$ for both (d) and (e).
  • Figure 4: Connectivity importance $|\lambda_\mu|$ versus rank (in descending order). Both the MNIST classification and contextual integration tasks are considered, and the simulation conditions are the same as in Figure 2 for MNIST and Figure 3 for the context-dependent computation. We fix $P=30$. Three independent runs are used to estimate the standard deviation. We also define a more precise measure $\tau_\mu=\chi\Vert\bm{\xi}^{\rm in}_\mu\Vert_2 + \chi\Vert\bm{\xi}^{\rm out}_\mu\Vert_2 + |\lambda_\mu|$, where $\chi=\sum_\mu |\lambda_\mu|/\sum_\mu(\Vert\bm{\xi}^{\rm in}_\mu\Vert_2+\Vert\bm{\xi}^{\rm out}_\mu\Vert_2)$. (a,b) MNIST classification. (c,d) Contextual integration task. A piecewise power-law behavior appears for the $\tau$-measure in the log-log plots (b,d). The colored vertical dashed lines mark the fitting ranges for different network sizes.
  • Figure 5: The filtered spike train of the hidden layer projected onto the mode space for the contextual integration task. The mode size is $P=3$, and the network size is $N = 100$. (a) Projection onto the input mode space. Three hundred randomly generated trials were used. Different colors encode different offset signs and contextual cues. The color gets darker with time along the dynamics trajectory. (b) Context switching experiment. At $t = 30\,{\rm ms}$, the contextual cue is switched to the other one. The left y-axis encodes the input signals, while the right y-axis encodes the contextual information. (c) Activity projection for the context switching experiment in (b). (d) Projection coefficients in the input mode space for the context switching experiment in (b).
  • ...and 5 more figures
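
The $\tau$-measure defined in the Figure 4 caption can be computed directly from the mode matrices and scores. The sketch below follows the caption's formula, treating $\bm{\xi}_\mu$ as the $\mu$-th column of an $N\times P$ matrix; the function names and the small numerical example are hypothetical and chosen only so the arithmetic is easy to check by hand.

```python
import math


def column_norms(matrix):
    """Euclidean norm of each column of an N x P nested list."""
    n_cols = len(matrix[0])
    return [math.sqrt(sum(row[m] ** 2 for row in matrix)) for m in range(n_cols)]


def tau_measure(xi_in, xi_out, lam):
    """tau_mu = chi * ||xi_in_mu|| + chi * ||xi_out_mu|| + |lam_mu|.

    chi = sum_mu |lam_mu| / sum_mu (||xi_in_mu|| + ||xi_out_mu||).
    """
    norm_in = column_norms(xi_in)
    norm_out = column_norms(xi_out)
    chi = sum(abs(l) for l in lam) / sum(a + b for a, b in zip(norm_in, norm_out))
    return [chi * a + chi * b + abs(l) for a, b, l in zip(norm_in, norm_out, lam)]


# Hypothetical toy values: N = 2 neurons, P = 2 modes.
xi_in = [[3.0, 1.0], [4.0, 0.0]]   # column norms 5 and 1
xi_out = [[0.0, 2.0], [5.0, 0.0]]  # column norms 5 and 2
lam = [2.0, -1.0]

tau = tau_measure(xi_in, xi_out, lam)
ranked = sorted(range(len(tau)), key=lambda m: tau[m], reverse=True)
print(tau)     # importance of each mode
print(ranked)  # mode indices in descending order of tau
```

By construction of $\chi$, the $\tau_\mu$ values sum to $2\sum_\mu|\lambda_\mu|$, so the norm terms and the score terms contribute on the same overall scale.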