Reward-Modulated Local Learning in Spiking Encoders: Controlled Benchmarks with STDP and Hybrid Rate Readouts

Debjyoti Chakraborty

TL;DR

A descriptive 2x2 analysis shows reward-shaping effects can reverse sign across stabilization regimes, so reward-shaping conclusions should be reported jointly with normalization settings.

Abstract

This paper presents a controlled empirical study of biologically motivated local learning for handwritten digit recognition. We evaluate an STDP-inspired competitive proxy and a practical hybrid benchmark built on the same spiking population encoder. The proxy is motivated by leaky integrate-and-fire E/I circuit models with three-factor delayed reward modulation. The hybrid update is local in pre x post rates but uses supervised labels and no timing-based credit assignment. On sklearn digits, fixed-seed evaluation shows classical pixel baselines from 98.06 to 98.22% accuracy, while local spike-based models reach 86.39 +/- 4.75% (hybrid default) and 87.17 +/- 3.74% (STDP-style competitive proxy). Ablations identify normalization and reward-shaping settings as the strongest observed levers, with a best hybrid ablation of 95.52 +/- 1.11%. A network-free synthetic temporal benchmark supports the same timing-versus-rate interpretation under matched local-update training. A descriptive 2x2 analysis further shows reward-shaping effects can reverse sign across stabilization regimes, so reward-shaping conclusions should be reported jointly with normalization settings.
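To make the phrase "local in pre x post rates but uses supervised labels" concrete, the following is a minimal sketch of one such update with optional row normalization. It is not the paper's implementation: the function names, the rectified-linear readout, the +1/-1 label-derived modulator, the learning rate, and the unit-L2 row normalization are all assumptions chosen for illustration.

```python
import math


def readout_rates(weights, pre):
    """Rectified linear rate of each class unit given presynaptic rates (assumed readout)."""
    return [max(0.0, sum(w * x for w, x in zip(row, pre))) for row in weights]


def local_update(weights, pre, label, lr=0.1, normalize=True):
    """Return new weights after one label-modulated pre x post rate update.

    weights[c][i] connects input unit i to class unit c. No spike timing is
    used, so this is a rate rule rather than STDP.
    """
    post = readout_rates(weights, pre)
    updated = []
    for c, row in enumerate(weights):
        # Third factor (assumed form): +1 for the labeled class, -1 otherwise.
        modulator = 1.0 if c == label else -1.0
        # Each weight change depends only on its own pre rate, its own post
        # rate, and the shared modulator.
        new_row = [w + lr * modulator * x * post[c] for w, x in zip(row, pre)]
        if normalize:
            # Hypothetical stabilization step: rescale each class row to unit L2 norm.
            norm = math.sqrt(sum(w * w for w in new_row))
            if norm > 0.0:
                new_row = [w / norm for w in new_row]
        updated.append(new_row)
    return updated


weights = [[0.5, 0.5], [0.5, 0.5]]
pre = [1.0, 0.0]
new_weights = local_update(weights, pre, label=0)
raw_weights = local_update(weights, pre, label=0, normalize=False)
```

In this toy case the weight from the active input to the labeled class grows while the same weight for the other class shrinks. Toggling `normalize` changes the scale at which the modulator acts, which is the kind of interaction between reward shaping and normalization that the abstract says should be reported jointly.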

Paper Structure

This paper contains 41 sections, 8 equations, 6 figures, 10 tables, and 1 algorithm.

Figures (6)

  • Figure 1: Schematic of the shared spiking encoder with two evaluated branches: STDP-inspired competitive proxy pathway and practical local rate-readout benchmark pathway (evaluated via proxy abstraction; no circuit simulation).
  • Figure 2: Accuracy trajectories for the population-coded hybrid learner, overlaying default norm-on and norm-off settings (representative model seed 23, split seed 2026).
  • Figure 3: Seed-level diagnostic summary for normalization-heuristic and reward-shaping ablations (bars: mean ± std; points: per-seed values for seeds {11,23,37,41,53,67,79,83,97}).
  • Figure 4: Normalization-schedule diagnostic: mean class-row norm trajectory across training epochs (mean ± std over seeds).
  • Figure 5: Normalized confusion matrix on the held-out test set (representative model seed 23, split seed 2026).
  • ...and 1 more figures