StochEP: Stochastic Equilibrium Propagation for Spiking Convergent Recurrent Neural Networks

Jiaqi Lin, Yi Jiang, Abhronil Sengupta

TL;DR

This paper introduces Stochastic Equilibrium Propagation (StochEP), a framework that trains spiking neural networks with probabilistic spiking neurons inside EP to stabilize learning and enable deep, convolutional CRNNs. By proving a mean-field equivalence between the stochastic energy and the deterministic EP energy, StochEP inherits convergence guarantees while smoothing the optimization landscape through stochasticity. Empirically, StochEP achieves competitive performance against BPTT-trained SNNs and EP-trained non-spiking networks on MNIST, CIFAR-10, and DVS Gesture, while delivering substantial memory and energy savings and enabling processing of time-varying inputs. The work highlights stochasticity as both biologically plausible and practically advantageous for neuromorphic, on-chip learning, with clear directions for scaling and hardware evaluation.

Abstract

Spiking Neural Networks (SNNs) promise energy-efficient, sparse, biologically inspired computation. Training them with Backpropagation Through Time (BPTT) and surrogate gradients achieves strong performance but remains biologically implausible. Equilibrium Propagation (EP) provides a more local and biologically grounded alternative. However, existing EP frameworks, primarily based on deterministic neurons, either require complex mechanisms to handle discontinuities in spiking dynamics or fail to scale beyond simple visual tasks. Inspired by the stochastic nature of biological spiking mechanisms and recent hardware trends, we propose a stochastic EP framework that integrates probabilistic spiking neurons into the EP paradigm. This formulation smooths the optimization landscape, stabilizes training, and enables scalable learning in deep convolutional spiking convergent recurrent neural networks (CRNNs). We provide a theoretical analysis showing that the proposed stochastic EP dynamics approximate deterministic EP under mean-field theory, thereby inheriting its underlying theoretical guarantees. The proposed framework narrows the gap to both BPTT-trained SNNs and EP-trained non-spiking CRNNs in vision benchmarks while preserving locality, highlighting stochastic EP as a promising direction for neuromorphic and on-chip learning.

Paper Structure

This paper contains 32 sections, 2 theorems, 21 equations, 6 figures, 3 tables.

Key Result

Theorem 3.1

Suppose the energy function (Equation eq:base_engr and Equation eq:sto_engr) has symmetric weights. The mean firing rate satisfies: Under a mean-field independence assumption for sufficiently large networks [sompolinsky1988chaos, buice2010systematic], we assume the units are independent across neuronal indices ($j \neq k$). Then, given that neuron states $\xi$ are deterministic, and $\sigma(\cdot) =

Figures (6)

  • Figure 1: Equilibrium Propagation (EP) optimizes neural networks through two phases. With inputs $x$ clamped, the network state $\xi$ relaxes to a fixed point $\xi^{*}$ after $T_{\mathrm{free}}$ time steps. A weak teaching signal then nudges only the output units $\xi_{\mathrm{out}}$ toward their target labels $\hat{y}$ by adding a small perturbation term to the dynamics, producing a nearby fixed point $\xi^{\beta}$ for a small $\beta>0$ after $T_{\mathrm{nudge}}$ time steps. Each synapse updates based on the contrast between its equilibrium states across the two phases. Circles with outlines represent unsaturated neurons, while filled circles denote saturated ones. Purple and green correspond to the free and nudge phases, respectively.
  • Figure 2: Illustration of stochastic spiking neuron dynamics. The membrane potential integrates weighted inputs from both forward $\mathbb{I}_f$ and backward passes $\mathbb{I}_b$ with decay factor $\lambda$, which is mapped to a firing probability scaled by factor $\kappa$, and generates spikes through Bernoulli sampling.
  • Figure 3: Comparison of membrane potential stability across different spiking neuron models. Heatmaps show changes in membrane potential over time for a network with one hidden layer of 512 neurons trained on 100 random MNIST samples. (a) The proposed stochastic model exhibits stable dynamics, rapidly converging to a smooth equilibrium. (b) A deterministic LIF model with a low-pass filter [martin2021eqspike] shows pronounced instability. (c) A deterministic LIF model stabilized using predictive coding and step-size scheduling [o2019training, lin2024scaling] achieves convergence. (d) A deterministic LIF model with only step-size scheduling, simplified from [o2019training, lin2024scaling], exhibits fluctuations. The yellow vertical dashed line indicates the transition from the free phase to the nudge phase.
  • Figure 4: Spike raster plots of hidden-layer neurons trained on MNIST for different scaling factors $\kappa$. As $\kappa$ increases from 0.5 to 4.0, the firing density rises from 0.04 to 0.26, indicating more frequent spiking activity. Each plot shows neuron firing patterns during the free and nudge phases.
  • Figure 5: Summed magnitude of error signals at the output layer, averaged over 128 samples from the CIFAR-10 dataset using the 5C architecture. OD denotes the total number of output neurons (10, 100, and 1,000) corresponding to $N_{\rm perclass}=1, 10,$ and $100$, respectively. Increasing $N_{\rm perclass}$ amplifies the output-layer error signals, confirming stronger gradient propagation during training.
  • ...and 1 more figure
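
The neuron dynamics described in the Figure 2 caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `stochastic_neuron_step`, the additive leaky update `lam * v + i_f + i_b`, the logistic sigmoid, and the choice to multiply the probability by `kappa` and clip it to [0, 1] are all assumptions made here to give one plausible reading of the caption. Any reset after a spike is omitted because the caption does not mention one.

```python
import numpy as np


def stochastic_neuron_step(v, i_f, i_b, lam=0.9, kappa=1.0, rng=None):
    """One update of a layer of stochastic spiking neurons (illustrative only).

    v    : membrane potentials from the previous step
    i_f  : weighted input from the forward pass
    i_b  : weighted input from the backward pass
    lam  : decay factor (lambda in the Figure 2 caption)
    kappa: scaling factor applied to the firing probability
    """
    rng = np.random.default_rng() if rng is None else rng

    # Leaky integration: decay the old potential, then add both inputs.
    # ASSUMPTION: the exact update rule is not given in this summary.
    v_new = lam * v + i_f + i_b

    # Map the potential to a firing probability.
    # ASSUMPTION: logistic sigmoid, multiplied by kappa and clipped to [0, 1].
    p_fire = np.clip(kappa / (1.0 + np.exp(-v_new)), 0.0, 1.0)

    # Bernoulli sampling: each neuron spikes with probability p_fire.
    spikes = (rng.random(v_new.shape) < p_fire).astype(np.float64)
    return v_new, p_fire, spikes


# Usage: run a few steps with a constant forward input and no feedback.
rng = np.random.default_rng(0)
v = np.zeros(4)
for _ in range(5):
    v, p, s = stochastic_neuron_step(v, np.full(4, 0.2), np.zeros(4), rng=rng)
```

Under these assumptions, a larger `kappa` raises the firing probability for the same membrane potential, which matches the direction of the trend reported in the Figure 4 caption. Averaging `spikes` over many neurons or time steps with a fixed input approaches `p_fire`, which is the sense in which a mean firing rate can stand in for the sampled spikes.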

Theorems & Definitions (3)

  • Theorem 3.1
  • Theorem 3.2
  • proof