Synchrony-Gated Plasticity with Dopamine Modulation for Spiking Neural Networks

Yuchen Tian, Samuel Tensingh, Jason Eshraghian, Nhan Duy Truong, Omid Kavehei

TL;DR

DA-SSDP is a scalable, dopamine-modulated local learning rule that uses batch-level spike synchrony to guide updates in deep spiking transformers. By combining instantaneous co-firing, a Gaussian latency kernel, and a gate derived during warm-up, it aligns local plasticity with the supervised objective without architectural changes. Empirical results show consistent accuracy gains on CIFAR-10/100 and ImageNet-1K, with modest training-time overhead and strong robustness, especially in mid-to-late representations. The method offers a practical path to integrating biologically inspired plasticity into large-scale SNNs and neuromorphic-friendly training pipelines.

Abstract

While surrogate backpropagation has proven useful for training deep spiking neural networks (SNNs), incorporating biologically inspired local signals at scale remains challenging. The difficulty stems primarily from the high memory cost of maintaining accurate spike-timing logs and from the potential for purely local plasticity updates to conflict with the supervised learning objective. To leverage local signals derived from spiking neuron dynamics, we introduce Dopamine-Modulated Spike-Synchrony-Dependent Plasticity (DA-SSDP), a loss-sensitive, synchrony-based local learning rule. DA-SSDP condenses spike patterns into a batch-level synchrony metric. A brief initial warm-up phase assesses the metric's relationship to the task loss and sets a fixed gate that subsequently scales the magnitude of the local update. When synchrony proves unrelated to the task, the gate settles at one, reducing DA-SSDP to a basic two-factor synchrony mechanism that delivers small weight adjustments driven by concurrent spike firing and a Gaussian latency function. These small updates are applied only to the network's deeper layers, after the backpropagation step; in our tests this simplified version did not degrade performance and sometimes gave a small accuracy boost, acting as a regularizer during training. The rule stores only binary spike indicators and first-spike latencies, which are passed through a Gaussian kernel. Without altering the model structure or optimization routine, evaluations on benchmarks including CIFAR-10 (+0.42%), CIFAR-100 (+0.99%), CIFAR10-DVS (+0.1%), and ImageNet-1K (+0.73%) showed consistent accuracy gains with a minor increase in computational overhead. Our code is available at https://github.com/NeuroSyd/DA-SSDP.
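
The abstract describes three ingredients: a co-firing sign, a Gaussian latency kernel, and a gate fixed after warm-up. A minimal Python sketch of how they might compose for a single pre/post pair is given below. It is not the authors' implementation. The amplitudes `a_plus` and `a_minus`, the linear gate form, its sign convention, and the clipping bounds `lo`/`hi` are all illustrative assumptions; the only properties taken from the text are that the gate equals one when synchrony is unrelated to the loss and that the kernel depends on the latency difference through a Gaussian.

```python
import math


def gaussian_kernel(dt, sigma):
    """Gaussian latency window g = exp(-dt^2 / (2 sigma^2))."""
    return math.exp(-(dt * dt) / (2.0 * sigma * sigma))


def fit_warmup_slope(synchrony, losses):
    """Least-squares slope of loss against batch synchrony over warm-up.

    Returns (slope, mean synchrony). Using a plain regression slope is an
    assumption; the text only says a single scalar slope is fit.
    """
    n = len(synchrony)
    mean_s = sum(synchrony) / n
    mean_l = sum(losses) / n
    var_s = sum((s - mean_s) ** 2 for s in synchrony)
    if var_s == 0.0:
        return 0.0, mean_s  # no variation in synchrony: treat as unrelated
    cov = sum((s - mean_s) * (l - mean_l) for s, l in zip(synchrony, losses))
    return cov / var_s, mean_s


def gate(s_b, slope, mean_s, lo=0.5, hi=1.5):
    """Hypothetical clipped linear gate; equals 1.0 when slope is zero.

    Sign convention (assumed): if higher synchrony went with lower loss
    during warm-up (negative slope), above-average synchrony gives gate > 1.
    """
    raw = 1.0 - slope * (s_b - mean_s)
    return min(hi, max(lo, raw))


def ssdp_update(pre_fired, post_fired, t_pre, t_post,
                gate_value=1.0, a_plus=1e-3, a_minus=5e-4, sigma=1.0):
    """Gated two-factor update for one pre/post pair in one sample.

    Co-firing potentiates, otherwise the pair is depressed; the Gaussian of
    the first-spike lag scales the magnitude only, so swapping t_pre and
    t_post leaves the result unchanged. Latencies of silent neurons must be
    supplied by the caller (an assumption of this sketch).
    """
    amplitude = a_plus if (pre_fired and post_fired) else -a_minus
    return gate_value * gaussian_kernel(t_post - t_pre, sigma) * amplitude
```

With `slope = 0.0` the gate is exactly one and `ssdp_update` reduces to the ungated two-factor rule, matching the fallback behaviour described in the abstract.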

Paper Structure

This paper contains 61 sections, 29 equations, 7 figures, 5 tables, 1 algorithm.

Figures (7)

  • Figure 1: Overview of DA-SSDP. (a) Pre–post layer and binary activity indicators from the current mini-batch. (b) Synchrony gate $\lambda$: a pre/post pair that co-fires in the sample sets $\lambda=1$ and is potentiated (red); no co-firing sets $\lambda=0$ and yields depression (cyan). The first-spike lag $|\Delta t|$ is passed through a Gaussian window $g=\exp\!(-\Delta t^{2}/(2\sigma^{2}))$, which scales the magnitude only, so the rule is order-invariant (synchrony decides the sign, timing sets the strength). (c) Dopamine modulation: during a short warm-up, a single scalar slope is fit from the empirical synchrony–loss correlation; after warm-up, the slope is frozen. Thereafter, the gate depends only on batch synchrony and simply rescales the local SSDP update (solid blue: DA-SSDP; dashed: ungated SSDP).
  • Figure 2: DA-SSDP integration points in SpikingResformer. The DA-SSDP module is inserted at two locations: (1) the $1{\times}1$ projection convolution of the last DSSA block in Stage 3, and (2) the linear classifier following global average pooling. During training, each hook records pre-/post-spike activity and applies the DA-SSDP update after the warm-up phase, operating alongside standard back-propagation.
  • Figure 3: Spatial effect of DA-SSDP in the last DSSA stage. (a) Baseline attention maps (averaged over heads; $t\!=\!1{\sim}4$) on the same test image. (b) Per-patch feature gain $\Delta(u,v)$, computed as the absolute change of the post-projection activation between the DA-SSDP and baseline checkpoints and averaged over time and channels (brighter = larger change). Hotspots in $\Delta$ align with high-attention areas from (a), suggesting a re-weighting of already attended tokens rather than a wholesale redirection.
  • Figure 4: Temporal spike patterns shaped by DA-SSDP. Raster plots (top) and per-neuron spike counts (bottom) for the first 500 LIF/PLIF channels over $T=4$ time steps, evaluated on a representative CIFAR-100 sample. Baseline activity is dense and widely spread, with many neurons firing on several of the four time steps and a small subset reaching the maximum count of four spikes. After DA-SSDP training, a compact synchronous burst emerges around indices 30–120, whereas the majority of channels emit at most a single spike or remain silent, yielding a markedly sparser profile.
  • Figure 5: Batch-level synchrony boxplot. Each dot is one mini-batch. For batch $b$, we mark a channel active if it fires at least once within the window $T$, and define the batch synchrony score $S_b$ as the fraction of channel pairs that spike in the same time bin. Relative to the vanilla baseline, DA-SSDP shifts the distribution upward and tightens its spread (median $3{\times}10^{-4}\to1{\times}10^{-2}$, $\approx$33$\times$). This pattern is consistent with the loss-aware gate $G_b$, which promotes task-aligned spike coincidences and suppresses asynchronous activity, leading to more coordinated and stable population activity.
  • ...and 2 more figures
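
The batch synchrony score $S_b$ defined in the Figure 5 caption can be illustrated with a short Python sketch. This is one possible reading of the caption, not the authors' code: a channel pair counts as synchronous if both channels spike in at least one common time bin, the per-sample score is the fraction of such pairs among all unordered pairs, and the batch score is the mean over samples. The function names and the averaging over samples are assumptions.

```python
from itertools import combinations


def sample_synchrony(spikes):
    """Fraction of channel pairs that spike together in some time bin.

    spikes: list of T time bins, each a list of C binary spike indicators.
    """
    num_channels = len(spikes[0])
    total_pairs = num_channels * (num_channels - 1) // 2
    if total_pairs == 0:
        return 0.0
    coincident = set()
    for frame in spikes:
        # Channels that fired in this time bin, in ascending index order.
        active = [c for c, s in enumerate(frame) if s]
        # Every unordered pair of co-active channels coincides in this bin.
        coincident.update(combinations(active, 2))
    return len(coincident) / total_pairs


def batch_synchrony(batch):
    """Mean per-sample synchrony over a mini-batch (assumed aggregation)."""
    return sum(sample_synchrony(sample) for sample in batch) / len(batch)
```

For example, with three channels over two time bins, `sample_synchrony([[1, 1, 0], [0, 0, 1]])` counts one coincident pair out of three possible pairs.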