
P-SpikeSSM: Harnessing Probabilistic Spiking State Space Models for Long-Range Dependency Tasks

Malyaban Bal, Abhronil Sengupta

TL;DR

This work tackles long-range dependency tasks with spiking neural networks by introducing P-SpikeSSM, a probabilistic spiking state-space model that uses an $n$-dimensional hidden state and stochastic spike generation. The architecture couples a SpikeSampler layer for parallel spike generation with SpikeMixer and ClampFuse layers, which provide inter-neuron communication and residual-style aggregation and allow the SNN to scale in depth. In experiments on the Long Range Arena, psMNIST, and Speech Command datasets, the approach attains state-of-the-art performance among SNNs while exhibiting sparse spiking; a detailed energy analysis supports substantial reductions in computation compared with non-spiking baselines. Overall, P-SpikeSSM demonstrates that probabilistic spiking dynamics can deliver competitive accuracy on long-context tasks with hardware-friendly parallelism, paving the way for deployment on neuromorphic hardware and edge devices.

Abstract

Spiking neural networks (SNNs) are posited as a computationally efficient and biologically plausible alternative to conventional neural architectures, with their core computational framework primarily using the leaky integrate-and-fire (LIF) neuron model. However, the limited hidden state representation of LIF neurons, characterized by a scalar membrane potential, and their sequential spike generation process pose challenges for developing scalable spiking models that address long-range dependencies in sequence learning tasks. In this study, we develop a scalable probabilistic spiking learning framework for long-range dependency tasks leveraging the fundamentals of state space models (SSMs). Unlike LIF neurons, which rely on the deterministic Heaviside function for a sequential process of spike generation, we introduce a SpikeSampler layer that samples spikes stochastically based on an SSM-based neuronal model while allowing parallel computations. To address the non-differentiability of the spiking operation and enable effective training, we also propose a surrogate function tailored to the stochastic nature of the SpikeSampler layer. To enhance inter-neuron communication, we introduce the SpikeMixer block, which integrates spikes from neuron populations in each layer. This is followed by a ClampFuse layer, which incorporates a residual connection to capture complex dependencies and enables the model to scale. Our models attain state-of-the-art performance among SNN models across diverse long-range dependency tasks, encompassing the Long Range Arena benchmark, permuted sequential MNIST, and the Speech Command dataset, and demonstrate sparse spiking patterns, highlighting their computational efficiency.
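The contrast between sequential LIF spike generation and parallel stochastic sampling can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a single neuron with a diagonal linear SSM, a sigmoid that maps the SSM output to a spike probability, and Bernoulli sampling; the function names (`ssm_kernel`, `spike_probabilities`, `sample_spikes`, `lif_spikes`) and the parameters `a`, `b`, `c`, `beta`, and `threshold` are hypothetical. The surrogate function used for training is not covered here.

```python
import numpy as np

def ssm_kernel(a, b, c, L):
    """Convolution kernel k[j] = sum_i c_i * a_i**j * b_i of a diagonal SSM (assumed form)."""
    powers = a[None, :] ** np.arange(L)[:, None]  # shape (L, n)
    return powers @ (b * c)                       # shape (L,)

def spike_probabilities(s_in, a, b, c):
    """Spike probability at every time step, computed without a time loop.

    Assumed dynamics: x[t] = a * x[t-1] + b * s_in[t], y[t] = c . x[t].
    Because the recurrence is linear and has no spike-dependent reset,
    y is a causal convolution of the input with the SSM kernel.
    """
    L = s_in.shape[0]
    k = ssm_kernel(a, b, c, L)
    y = np.convolve(s_in, k)[:L]        # keep the causal part only
    return 1.0 / (1.0 + np.exp(-y))     # sigmoid link (an assumption)

def sample_spikes(p, rng):
    """Draw all spikes at once: s[t] ~ Bernoulli(p[t])."""
    return (rng.random(p.shape) < p).astype(np.float64)

def lif_spikes(s_in, beta=0.9, threshold=1.0):
    """Deterministic LIF neuron with hard reset, for contrast.

    The reset makes u[t] depend on the spike emitted at t-1,
    so the time steps must be processed one after another.
    """
    u = 0.0
    out = np.zeros_like(s_in, dtype=np.float64)
    for t in range(len(s_in)):
        u = beta * u + s_in[t]
        if u >= threshold:   # Heaviside threshold
            out[t] = 1.0
            u = 0.0
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    L, n = 64, 8
    a = rng.uniform(0.1, 0.9, n)                 # stable diagonal state matrix
    b, c = rng.normal(size=n), rng.normal(size=n)
    s_in = (rng.random(L) < 0.3).astype(np.float64)
    p = spike_probabilities(s_in, a, b, c)
    print("sampled spike count:", int(sample_spikes(p, rng).sum()))
    print("LIF spike count:", int(lif_spikes(s_in).sum()))
```

In this sketch the probabilities for all time steps come from one convolution, so sampling needs no loop over time, whereas the LIF loop cannot be removed because of the reset.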

Paper Structure

This paper contains 20 sections, 16 equations, 4 figures, 6 tables.

Figures (4)

  • Figure 1: (a) High-level overview of the P-SpikeSSM-based spiking architecture for LRA tasks. (b) Sparsity of the spiking events generated by a single P-SpikeSSM neuron over the input sequence length (ListOps dataset). (c) Layer-wise active neuron ratio (i.e., the proportion of neurons generating spikes within a layer per time step) against operating time steps, for a randomly sampled input from the ListOps dataset. The layer-wise spiking behavior illustrates the model-wide sparsity in spiking activity, which contributes to computational efficiency.
  • Figure 2: Computational flow of the LIF-based SSM model compared to the SpikeSampler-driven P-SpikeSSM neuronal model. Here, $L$ is the sequence length, $u[t]$ is the membrane potential of the LIF neuron, and $s[t]$ is the spike output (either 1 or 0) at time $t$. Unlike the LIF-based approach, which is constrained by a sequential bottleneck, our probabilistic approach supports parallel processing.
  • Figure 3: Results on the test set of the ps-MNIST dataset. This experiment uses two P-SpikeSSM neuronal layers, each containing $N$ neurons; $N$ is shown on the x-axis and the accuracy achieved on the y-axis.
  • Figure 4: Results obtained after passing randomly sampled inputs from the ListOps dataset through our model. The figure consists of a histogram of the number of neurons at each mean spiking probability (averaged over the sequence length $L$) and a Kernel Density Estimation (KDE) plot of the data using an exponential kernel. Over the entire sequence, the majority of neurons ($\approx 90\%$) have a spiking probability close to $0$, indicating a sparse spiking pattern.
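The statistics named in the captions of Figures 1(c) and 4 can be computed as in the following sketch. It is illustrative only: the array layout `(L, N)`, the function names, the bandwidth, the `0.05` cutoff for "close to 0", and the synthetic input are all assumptions, and the synthetic numbers are placeholders that do not reproduce the paper's results. The exponential kernel is taken here to be $K(u) = \tfrac{1}{2} e^{-|u|}$.

```python
import numpy as np

def mean_spike_probability(p):
    """Average each neuron's spiking probability over the sequence.

    p has shape (L, N): L time steps, N neurons (assumed layout).
    """
    return p.mean(axis=0)

def active_neuron_ratio(spikes):
    """Fraction of neurons in a layer that spike at each time step.

    spikes is a binary array of shape (L, N); the result has shape (L,).
    """
    return spikes.mean(axis=1)

def exponential_kde(samples, grid, bandwidth):
    """1-D kernel density estimate with kernel K(u) = 0.5 * exp(-|u|)."""
    u = np.abs(grid[:, None] - samples[None, :]) / bandwidth
    return (0.5 * np.exp(-u)).sum(axis=1) / (len(samples) * bandwidth)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    L, N = 200, 500
    # Synthetic placeholder: one base spiking probability per neuron,
    # skewed toward 0, repeated over all time steps.
    p = np.tile(rng.beta(0.2, 4.0, size=N), (L, 1))
    spikes = (rng.random((L, N)) < p).astype(np.float64)

    mean_p = mean_spike_probability(p)
    counts, edges = np.histogram(mean_p, bins=20, range=(0.0, 1.0))
    density = exponential_kde(mean_p, np.linspace(0.0, 1.0, 101), bandwidth=0.05)

    print("neurons with mean probability below 0.05:", int((mean_p < 0.05).sum()))
    print("mean active neuron ratio:", float(active_neuron_ratio(spikes).mean()))
```

Passing `counts`, `edges`, and `density` to a plotting library would give a histogram with a KDE overlay of the kind the Figure 4 caption describes.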