Deep activity propagation via weight initialization in spiking neural networks
Aurora Micheli, Olaf Booij, Jan van Gemert, Nergis Tömen
TL;DR
The paper addresses the difficulty of training deep spiking neural networks (SNNs) by deriving an SNN-specific weight initialization that preserves activity across layers. A variance-flow analysis that accounts for the spike-threshold activation yields a closed-form weight variance, $\mathrm{Var}[w_l] = \frac{1}{n_l P(u_{l-1} > \theta)}$, which keeps the membrane-potential variance $\mathrm{Var}[u_l]$ constant from layer to layer so that spikes keep propagating through deep networks. Simulations of networks with up to 100 layers over multiple time steps show that this initialization maintains spike propagation where standard ANN-based initializations do not. In training experiments on MNIST-family datasets and CIFAR-10, it yields faster convergence and higher accuracy, and it is robust to changes in network width and neuron hyperparameters. The approach is presented as dataset-agnostic and architecture-agnostic. The analysis currently abstracts away temporal leakage effects and could be extended to explicit temporal dynamics. Overall, the work provides a theory-grounded initialization that improves the trainability and training efficiency of deep SNNs.
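The variance relation behind this expression can be sketched in a few lines. This is a reconstruction under assumptions that the summary does not state: the weights $w_{l,i}$ are i.i.d. with zero mean and independent of the incoming spikes, $s_{l-1,i} = \mathbb{1}[u_{l-1,i} > \theta] \in \{0, 1\}$ denotes the binary spike output of the previous layer, $n_l$ is read as the number of inputs per neuron, and a single time step without leak is considered.

$$
\begin{aligned}
u_l &= \sum_{i=1}^{n_l} w_{l,i}\, s_{l-1,i} \\
\mathrm{Var}[u_l] &= n_l\, \mathrm{Var}[w_l]\, \mathbb{E}\!\left[s_{l-1}^2\right] \\
&= n_l\, \mathrm{Var}[w_l]\, P(u_{l-1} > \theta) \qquad \text{since } s_{l-1}^2 = s_{l-1} \\
\mathrm{Var}[u_l] = 1 \;&\Longrightarrow\; \mathrm{Var}[w_l] = \frac{1}{n_l\, P(u_{l-1} > \theta)}
\end{aligned}
$$

Under this reading, the closed-form expression corresponds to a target membrane-potential variance normalized to one. A different target variance would appear as a factor in the numerator.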
Abstract
Spiking Neural Networks (SNNs) and neuromorphic computing offer bio-inspired advantages such as sparsity and ultra-low power consumption, providing a promising alternative to conventional networks. However, training deep SNNs from scratch remains a challenge, as SNNs process and transmit information by quantizing real-valued membrane potentials into binary spikes. This can lead to information loss and vanishing spikes in deeper layers, impeding effective training. While weight initialization is known to be critical for training deep neural networks, what constitutes an effective initial state for a deep SNN is not well understood. Existing weight initialization methods designed for conventional artificial neural networks (ANNs) are often applied to SNNs without accounting for their distinct computational properties. In this work, we derive an optimal weight initialization method specifically tailored to SNNs, taking into account the quantization operation. We show theoretically that, unlike standard approaches, this method enables the propagation of activity in deep SNNs without loss of spikes. We demonstrate this behavior in numerical simulations of SNNs with up to 100 layers across multiple time steps. We present an in-depth analysis of the numerical conditions, regarding layer width and neuron hyperparameters, that are necessary to accurately apply our theoretical findings. Furthermore, our experiments on MNIST demonstrate higher accuracy and faster convergence when using the proposed weight initialization scheme. Finally, we show that the newly introduced weight initialization is robust against variations in several network and neuron hyperparameters.
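The vanishing-spike effect described above can be illustrated with a small numerical experiment. The sketch below is not the authors' simulation code. It assumes a fully connected network evaluated for a single time step with no leak and no reset, Gaussian weights, a threshold of 1, and unit-variance Gaussian membrane potentials when evaluating $P(u > \theta)$. The names `firing_rates`, `he_var`, and `snn_var` are hypothetical. It compares an ANN-style variance of $2/n$ with the spike-aware variance $1 / (n P(u > \theta))$.

```python
import math

import numpy as np


def firing_rates(weight_var, n=1000, depth=30, theta=1.0, seed=0):
    """Return the fraction of spiking neurons at the input and after each layer.

    Simplified model (an assumption of this sketch): one time step,
    no leak, no reset, dense layers of equal width n.
    """
    rng = np.random.default_rng(seed)
    # Input spikes: threshold unit-variance Gaussian potentials.
    s = (rng.standard_normal(n) > theta).astype(np.float64)
    rates = [s.mean()]
    for _ in range(depth):
        # Zero-mean Gaussian weights with the requested variance.
        w = rng.normal(0.0, math.sqrt(weight_var), size=(n, n))
        u = w @ s                                # membrane potentials
        s = (u > theta).astype(np.float64)       # quantize to binary spikes
        rates.append(s.mean())
    return rates


n, theta = 1000, 1.0
# P(u > theta) for u ~ N(0, 1), via the complementary error function.
p_spike = 0.5 * math.erfc(theta / math.sqrt(2.0))

he_var = 2.0 / n                  # ANN-style (ReLU-oriented) variance
snn_var = 1.0 / (n * p_spike)     # spike-aware variance from the formula

he_rates = firing_rates(he_var, n=n, theta=theta)
snn_rates = firing_rates(snn_var, n=n, theta=theta)

print("ANN-style init, firing rate at the last layer:", he_rates[-1])
print("Spike-aware init, firing rate at the last layer:", snn_rates[-1])
```

In this toy setting the ANN-style variance shrinks the membrane-potential variance at every layer, so activity dies out after a few layers, while the spike-aware variance keeps the firing rate close to `p_spike`. Once temporal dynamics such as leak and reset are included, the behaviour may differ from this single-step picture.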
