Exact Gradients for Stochastic Spiking Neural Networks Driven by Rough Signals

Christian Holberg; Cristopher Salvi

TL;DR

This work addresses gradient-based training of stochastic spiking neural networks (SSNNs) whose dynamics and spike timings are both driven by rough, potentially discontinuous noise. Using rough path theory, it models SSNNs as Event SDEs, extends the framework to Marcus RDEs, and derives exact pathwise gradient formulas for both trajectories and spike times. A differentiable solver implemented in $\texttt{diffrax}$ enables end-to-end automatic differentiation. A novel Marcus signature kernel defines a loss on càdlàg paths, yielding a maximum mean discrepancy (MMD) objective for training SSNNs as generative models. The approach supports online gradient updates and provides practical tools for input and weight estimation, making it feasible to train SSNNs in which noise influences both spike timing and network dynamics, with implications for biologically plausible learning and neuromorphic hardware.

Abstract

We introduce a mathematically rigorous framework based on rough path theory to model stochastic spiking neural networks (SSNNs) as stochastic differential equations with event discontinuities (Event SDEs) driven by càdlàg rough paths. Our formalism is general enough to allow for jumps both in the solution trajectories and in the driving noise. We then identify a set of sufficient conditions ensuring the existence of pathwise gradients of solution trajectories and event times with respect to the network's parameters, and show that these gradients satisfy a recursive relation. Furthermore, we introduce a general-purpose loss function defined by means of a new class of signature kernels indexed on càdlàg rough paths and use it to train SSNNs as generative models. We provide an end-to-end autodifferentiable solver for Event SDEs and make its implementation available as part of the $\texttt{diffrax}$ library. Our framework is, to our knowledge, the first to enable gradient-based training of SSNNs with noise affecting both the spike timing and the network's dynamics.
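To make the notion of an SDE with event discontinuities concrete, the following is a minimal sketch in plain Python, not the paper's solver and not the $\texttt{diffrax}$ API. It assumes a single noisy leaky integrate-and-fire neuron with a fixed threshold, a fixed-step Euler-Maruyama scheme between events, and linear interpolation for the crossing time. The function name `simulate_event_sde` and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def simulate_event_sde(mu, sigma, event, transition, y0, t1, dt, seed=0):
    """Euler-Maruyama between events; apply `transition` when `event` crosses zero.

    Assumption: `event(y) > 0` holds away from events, and an event fires
    when `event` becomes non-positive within a step.
    """
    rng = random.Random(seed)
    t, y = 0.0, y0
    ts, ys, event_times = [t], [y], []
    for _ in range(int(round(t1 / dt))):
        dB = rng.gauss(0.0, math.sqrt(dt))           # Brownian increment
        y_new = y + mu(y) * dt + sigma(y) * dB       # Euler-Maruyama step
        e0, e1 = event(y), event(y_new)
        if e0 > 0 and e1 <= 0:
            # Locate the crossing inside the step by linear interpolation.
            frac = e0 / (e0 - e1)
            event_times.append(t + frac * dt)
            y_new = transition(y_new)                # jump in the trajectory
        t += dt
        y = y_new
        ts.append(t)
        ys.append(y)
    return ts, ys, event_times


# Hypothetical neuron: membrane time constant 20 ms, constant input 1.5,
# threshold 1.0, reset to 0.0, additive noise of unit intensity.
drift = lambda v: (1.5 - v) / 0.02
diffusion = lambda v: 1.0
threshold_gap = lambda v: 1.0 - v
reset = lambda v: 0.0

ts, vs, spike_times = simulate_event_sde(
    drift, diffusion, threshold_gap, reset, y0=0.0, t1=0.2, dt=1e-4, seed=1
)
```

Because the noise enters the membrane potential directly, the spike times themselves are random. A fixed-step scheme like this one only approximates the event times, whereas the abstract concerns exact pathwise gradients of trajectories and event times.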

Paper Structure (31 sections, 11 theorems, 81 equations, 2 figures, 1 algorithm)

Introduction
Contributions
Related work
Neural stochastic differential equations (NSDEs)
Training techniques for NSDEs
Backpropagation through NSDEs
Differential equations with events
Training techniques for SNNs
Stochastic spiking neural networks as Event SDEs
Stochastic spiking neural networks
Model definition
Backpropagation
Numerical solvers
Training stochastic spiking neural networks
A loss function based on signature kernels for càdlàg paths
...and 16 more sections

Key Result

Theorem 3.1

Under Assumptions [ass: event_finite]-[ass: event_domain_trans] and with $\mu\in\textnormal{Lip}^1$ and $\sigma\in\textnormal{Lip}^\gamma$ for $\gamma > 2$, there exists a unique solution $(y, (\tau_n)_{n=1}^N)$ to the Event SDE of Definition 3.1.
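For orientation, an SDE with event discontinuities of the kind this theorem refers to can be sketched as follows. This is an assumed generic form, not a reproduction of the paper's definition: the symbols $\mathcal{E}$ (event function), $\mathcal{T}$ (transition map) and $B$ (driving Brownian motion) are introduced here for illustration, and the stochastic integration convention is left unspecified.

$$
\begin{aligned}
dy_t &= \mu(y_t)\,dt + \sigma(y_t)\,dB_t, && \tau_{n-1} \le t < \tau_n, \\
\tau_n &= \inf\{\, t > \tau_{n-1} : \mathcal{E}(y_t) = 0 \,\}, \\
y_{\tau_n} &= \mathcal{T}(y_{\tau_n^-}), && n = 1, \dots, N.
\end{aligned}
$$

Under this reading, a solution is the pair $(y, (\tau_n)_{n=1}^N)$: a trajectory that follows the SDE between events, together with the times at which the event function vanishes and the transition map is applied.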

Figures (2)

Figure 1: Test loss and $c$ estimate across four sample sizes and for two levels of noise $\sigma$. On the left: MAE for the first three average spike times on a held-out test set. On the right: estimated value of $c$ at the current step.
Figure 2: We estimate the synaptic weights $w$ across three different sample sizes using the signature kernel MMD truncated at depth 3 and stochastic gradient descent with a batch size of 128. On the left we report the loss on a held-out test set. On the right is the mean absolute error between the entries of the currently estimated weight matrix $\hat{w}_{\text{step}}$ and the true weight matrix $w_{\text{true}}$.
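The idea behind a signature kernel MMD truncated at depth 3, as mentioned in the Figure 2 caption, can be illustrated with a small plain-Python sketch. It is not the paper's Marcus signature kernel. It assumes paths given as lists of points that are joined by straight lines, computes the ordinary truncated signature with Chen's identity, uses the plain inner product of signatures as the kernel, and uses the biased (V-statistic) MMD estimator. All function names are hypothetical.

```python
def outer(a, b):
    """Tensor product of two flattened tensors."""
    return [x * y for x in a for y in b]


def add(*tensors):
    """Entrywise sum of flattened tensors of equal length."""
    return [sum(vals) for vals in zip(*tensors)]


def segment_signature(delta):
    """Depth-3 signature of a straight segment: the truncated tensor exponential."""
    l1 = list(delta)
    l2 = [v / 2.0 for v in outer(l1, l1)]
    l3 = [v / 6.0 for v in outer(outer(l1, l1), l1)]
    return l1, l2, l3


def chen(a, b):
    """Chen's identity: signature of a concatenation, truncated at depth 3."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (
        add(a1, b1),
        add(a2, outer(a1, b1), b2),
        add(a3, outer(a2, b1), outer(a1, b2), b3),
    )


def signature(path):
    """Levels 1 to 3 of the signature of a piecewise-linear path in R^d."""
    d = len(path[0])
    sig = ([0.0] * d, [0.0] * d**2, [0.0] * d**3)
    for p, q in zip(path, path[1:]):
        delta = [qi - pi for pi, qi in zip(p, q)]
        sig = chen(sig, segment_signature(delta))
    return sig


def sig_kernel(x, y):
    """Inner product of truncated signatures (the level-0 term contributes 1)."""
    sx, sy = signature(x), signature(y)
    return 1.0 + sum(u * v for lx, ly in zip(sx, sy) for u, v in zip(lx, ly))


def mean_kernel(xs, ys):
    return sum(sig_kernel(x, y) for x in xs for y in ys) / (len(xs) * len(ys))


def mmd2(xs, ys):
    """Biased estimator of the squared MMD between two samples of paths."""
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Two small hypothetical samples of paths in R^2.
sample_a = [[(0.0, 0.0), (1.0, 1.0)], [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]]
sample_b = [[(0.0, 0.0), (1.0, -1.0)], [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]]
loss = mmd2(sample_a, sample_b)
```

With the biased estimator, the loss is the squared distance between the mean truncated signatures of the two samples, so it is zero when the samples coincide. Joining the points by straight lines is a simplifying assumption here; handling genuine jumps is what the kernel indexed on càdlàg rough paths is introduced for.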

Theorems & Definitions (40)

Definition 3.1: Event SDE
Theorem 3.1: Theorem 5.2 of [krystul2005generalised]
Theorem 3.2
Remark 3.1
Remark 3.2
Remark 3.3
Remark 3.4
Definition A.1: Marcus interpolation
Definition A.2: Marcus RDE
Definition A.3
...and 30 more
