Learning Delays in Spiking Neural Networks using Dilated Convolutions with Learnable Spacings

Ilyass Hammouamri, Ismail Khalfaoui-Hassani, Timothée Masquelier

TL;DR

Spiking neural networks rely on delays to detect temporal spike coincidences, but learning these delays alongside weights has been challenging. The authors recast delays as learnable positions within a 1D Gaussian-interpolated convolution (DCLS) and train both weights and delays end-to-end offline, progressively narrowing the Gaussian width $\sigma$ to produce discrete delays $d_{ij}^{(l)}$. This approach yields state-of-the-art accuracy on temporal benchmarks SHD, SSC, and GSC-35 while using far fewer parameters and avoiding recurrent connections, making it attractive for neuromorphic implementation. The results, supported by ablations, demonstrate the value of jointly optimizing delays and weights and the effectiveness of gradually shrinking $\sigma$ to balance long-range dependencies and precision.

Abstract

Spiking Neural Networks (SNNs) are a promising research direction for building power-efficient information processing systems, especially for temporal tasks such as speech recognition. In SNNs, delays refer to the time needed for one spike to travel from one neuron to another. These delays matter because they influence the spike arrival times, and it is well-known that spiking neurons respond more strongly to coincident input spikes. More formally, it has been shown theoretically that plastic delays greatly increase the expressivity in SNNs. Yet, efficient algorithms to learn these delays have been lacking. Here, we propose a new discrete-time algorithm that addresses this issue in deep feedforward SNNs using backpropagation, in an offline manner. To simulate delays between consecutive layers, we use 1D convolutions across time. The kernels contain only a few non-zero weights - one per synapse - whose positions correspond to the delays. These positions are learned together with the weights using the recently proposed Dilated Convolution with Learnable Spacings (DCLS). We evaluated our method on three datasets: the Spiking Heidelberg Dataset (SHD), the Spiking Speech Commands (SSC) and its non-spiking version Google Speech Commands v0.02 (GSC) benchmarks, which require detecting temporal patterns. We used feedforward SNNs with two or three hidden fully connected layers, and vanilla leaky integrate-and-fire neurons. We showed that fixed random delays help and that learning them helps even more. Furthermore, our method outperformed the state-of-the-art in the three datasets without using recurrent connections and with substantially fewer parameters. Our work demonstrates the potential of delay learning in developing accurate and precise models for temporal data processing. Our code is based on PyTorch / SpikingJelly and available at: https://github.com/Thvnvtos/SNN-delays
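
The equivalence between synaptic delays and 1D convolutions across time can be checked on a toy example. The following is a minimal sketch in plain Python, not the authors' PyTorch / SpikingJelly implementation. It assumes the synaptic input has the form $I[t] = \sum_j w_j S_j[t - d_j]$, and all function names (`delayed_input_direct`, `delay_kernel`, `delayed_input_conv`) and the indexing convention (one non-zero tap at position `max_delay - delay`, input left-padded with `max_delay` zeros) are illustrative assumptions.

```python
def delayed_input_direct(spike_trains, weights, delays):
    """Reference: I[t] = sum_j w_j * S_j[t - d_j], with S_j[t] = 0 for t < 0."""
    T = len(spike_trains[0])
    current = [0.0] * T
    for s, w, d in zip(spike_trains, weights, delays):
        for t in range(d, T):
            current[t] += w * s[t - d]
    return current


def delay_kernel(weight, delay, max_delay):
    """Kernel of size max_delay + 1 with a single non-zero weight.

    The position of the weight encodes the delay (assumed convention:
    a larger delay sits further to the left of the kernel).
    """
    kernel = [0.0] * (max_delay + 1)
    kernel[max_delay - delay] = weight
    return kernel


def delayed_input_conv(spike_trains, weights, delays, max_delay):
    """Same quantity, computed as a sliding dot product over zero left-padded inputs."""
    T = len(spike_trains[0])
    current = [0.0] * T
    for s, w, d in zip(spike_trains, weights, delays):
        kernel = delay_kernel(w, d, max_delay)
        padded = [0] * max_delay + list(s)  # zero left-padding keeps the output causal
        for t in range(T):
            current[t] += sum(k * padded[t + n] for n, k in enumerate(kernel))
    return current


# Two afferent spike trains whose spikes are 3 time steps apart.
s1 = [1, 0, 0, 0, 0, 0]
s2 = [0, 0, 0, 1, 0, 0]
weights = [0.5, 0.5]
delays = [3, 0]  # delaying the first synapse by 3 steps aligns both spikes

direct = delayed_input_direct([s1, s2], weights, delays)
via_conv = delayed_input_conv([s1, s2], weights, delays, max_delay=4)
```

Both computations give the same input current, and with these delays the two spikes arrive in the same time step, which is the coincidence a spiking neuron responds to most strongly. Learning a delay then amounts to learning where the single non-zero weight sits inside its kernel.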

Paper Structure

This paper contains 14 sections, 11 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Coincidence detection: we consider two neurons $N_1$ and $N_2$ with the same positive synaptic weight values. $N_2$ has a delayed synaptic connection denoted $d_{21}$ of $8$ ms, thus both spikes from spike train $S_1$ and $S_2$ will reach $N_2$ quasi-simultaneously. As a result, the membrane potential of $N_2$ will reach the threshold $\vartheta$ and $N_2$ will emit a spike. On the other hand, $N_1$ will not react to these same input spike trains.
  • Figure 2: Example of one neuron with two afferent synaptic connections. Convolving $K1$ and $K2$ with the zero left-padded $S_1$ and $S_2$ is equivalent to applying Equation \ref{eq:Input_ff_delayd}.
  • Figure 3: Evolution of the same delay kernels for an example of eight synaptic connections of one neuron throughout training. The x-axis corresponds to time, with each kernel of size $T_d=25$, and the y-axis is the synapse id. (a) corresponds to the initial phase, where the standard deviation of the Gaussian $\sigma$ is large ($\frac{T_d}{2}$), which allows long temporal dependencies to be taken into account. (b) corresponds to the intermediate phase. (c) is taken from the final phase, where $\sigma$ is at its minimum value (0.5) and weight tuning is emphasized more. Finally, (d) represents the kernel after conversion to the discrete form with rounded positions.
  • Figure 4: Bar plots of test accuracies on the SHD and SSC datasets for different models, with (a) fully connected layers (FC) and (b) sparse synaptic connections (S), where the number of synaptic connections of each neuron is reduced to ten for both SHD and SSC.
  • Figure 5: Gaussian convolution kernels for $N$ synaptic connections. The Gaussians are centered on the delay positions, and the area under their curves corresponds to the synaptic weights $w_i$. On the right, we see the delayed spike trains after being convolved with the kernels. (The $-1$ was omitted for figure clarity.)
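
The kernel shapes described in Figures 3 and 5 can be reproduced with a short plain-Python sketch. It is an illustration under stated assumptions, not the authors' DCLS code: the Gaussian is normalized so that the kernel sums to the synaptic weight, the delay position follows the same `kernel_size - 1 - delay` convention as a kernel with a single non-zero weight, and the geometric decay in `sigma_schedule` is a hypothetical schedule (the captions only state that $\sigma$ goes from $\frac{T_d}{2}$ down to 0.5). All function names are illustrative.

```python
import math


def gaussian_delay_kernel(weight, delay, kernel_size, sigma):
    """Gaussian bump centered on the delay position, scaled so it sums to `weight`."""
    center = kernel_size - 1 - delay  # assumed position convention
    bumps = [math.exp(-0.5 * ((n - center) / sigma) ** 2) for n in range(kernel_size)]
    norm = sum(bumps)
    return [weight * b / norm for b in bumps]


def discrete_delay_kernel(weight, delay, kernel_size):
    """Final form: a single non-zero weight at the rounded delay position."""
    position = kernel_size - 1 - round(delay)
    position = min(max(position, 0), kernel_size - 1)  # keep the tap inside the kernel
    kernel = [0.0] * kernel_size
    kernel[position] = weight
    return kernel


def sigma_schedule(sigma_start, sigma_end, steps):
    """Hypothetical geometric decay of sigma from sigma_start to sigma_end."""
    ratio = (sigma_end / sigma_start) ** (1.0 / (steps - 1))
    return [sigma_start * ratio ** i for i in range(steps)]


T_d = 25                      # kernel size, as in Figure 3
weight, delay = 0.8, 7.3      # a continuous (not yet rounded) delay
sigmas = sigma_schedule(T_d / 2, 0.5, steps=4)
kernels = [gaussian_delay_kernel(weight, delay, T_d, s) for s in sigmas]
final_kernel = discrete_delay_kernel(weight, delay, T_d)

for s, k in zip(sigmas, kernels):
    print(f"sigma={s:6.3f}  peak position={k.index(max(k))}  peak value={max(k):.3f}")
```

With a wide Gaussian, the weight is spread over most of the kernel, so the gradient with respect to the position can see spikes far from the current delay. As $\sigma$ shrinks, the mass concentrates around the delay position, and in this sketch the peak of the narrowest kernel coincides with the single non-zero entry of the rounded, discrete kernel.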