Frozen Backpropagation: Relaxing Weight Symmetry in Temporally-Coded Deep Spiking Neural Networks

Gaspard Goupy, Pierre Tirilly, Ioan Marius Bilasco

TL;DR

This work tackles the weight transport bottleneck in dual-network, BP-based training of temporally coded deep SNNs on neuromorphic hardware. It introduces Frozen Backpropagation (fBP), which freezes the feedback weights for an interval of $\Phi$ iterations to reduce transport, then realigns them with the forward weights, and adds three partial transport schemes to further cut data movement. Empirical results on Fashion-MNIST, CIFAR-10, and CIFAR-100 show that fBP achieves accuracy comparable to BP while reducing transport by up to $10^3$–$10^4\times$, with modest accuracy losses depending on the strategy. The findings offer practical guidance for energy-efficient on-chip learning in neuromorphic hardware and establish a baseline for BP-based deep SNN training with relaxed weight symmetry.

Abstract

Direct training of Spiking Neural Networks (SNNs) on neuromorphic hardware can greatly reduce energy costs compared to GPU-based training. However, implementing Backpropagation (BP) on such hardware is challenging because forward and backward passes are typically performed by separate networks with distinct weights. To compute correct gradients, forward and feedback weights must remain symmetric during training, necessitating weight transport between the two networks. This symmetry requirement imposes hardware overhead and increases energy costs. To address this issue, we introduce Frozen Backpropagation (fBP), a BP-based training algorithm relaxing weight symmetry in settings with separate networks. fBP updates forward weights by computing gradients with periodically frozen feedback weights, reducing weight transports during training and minimizing synchronization overhead. To further improve transport efficiency, we propose three partial weight transport schemes of varying computational complexity, where only a subset of weights is transported at a time. We evaluate our methods on image recognition tasks and compare them to existing approaches addressing the weight symmetry requirement. Our results show that fBP outperforms these methods and achieves accuracy comparable to BP. With partial weight transport, fBP can lower transport costs by 1,000x with an accuracy drop of only 0.5pp on CIFAR-10 and 1.1pp on CIFAR-100, or by up to 10,000x at the expense of a moderate accuracy loss. This work provides insights for guiding the design of neuromorphic hardware incorporating BP-based on-chip learning.
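The freeze-then-transport idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy two-layer ReLU network trained on a synthetic regression task instead of a temporally coded SNN, and the names `train`, `phi`, `W1`, `W2`, and `B2` are hypothetical. It only shows how copying the forward weights into the feedback weights every `phi` iterations, rather than at every iteration, reduces the number of transported weights.

```python
import numpy as np

def train(phi, steps=200, seed=0, lr=0.02):
    """Toy dual-network training loop (illustrative stand-in, not an SNN).

    phi=1 transports W2 into B2 at every step (symmetric weights, plain BP);
    phi>1 keeps B2 frozen between transports (the fBP idea).
    Returns (final loss, total number of weights transported).
    """
    rng = np.random.default_rng(seed)
    # Hypothetical synthetic regression task, not taken from the paper.
    X = rng.normal(size=(64, 8))
    Y = X @ rng.normal(size=(8, 2))
    W1 = rng.normal(scale=0.3, size=(8, 16))  # forward weights, layer 1
    W2 = rng.normal(scale=0.3, size=(16, 2))  # forward weights, layer 2
    B2 = W2.copy()                            # feedback weights for layer 2
    transported = 0
    for step in range(steps):
        if step % phi == 0:
            # Weight transport: realign feedback weights with forward weights.
            B2 = W2.copy()
            transported += W2.size
        # Forward pass uses forward weights only.
        H = np.maximum(X @ W1, 0.0)
        err = (H @ W2 - Y) / len(X)
        # Backward pass uses the (possibly stale) feedback weights B2.
        delta_h = (err @ B2.T) * (H > 0)
        # Only forward weights are updated.
        W2 -= lr * (H.T @ err)
        W1 -= lr * (X.T @ delta_h)
    out = np.maximum(X @ W1, 0.0) @ W2
    return float(np.mean((out - Y) ** 2)), transported

for phi in (1, 10, 100):
    loss, cost = train(phi)
    print(f"phi={phi:3d}  loss={loss:.4f}  weights transported={cost}")
```

In this sketch the transport cost shrinks by a factor of `phi` relative to the `phi=1` case. The partial transport schemes mentioned in the abstract would reduce it further by copying only a subset of `W2` at each transport, which is not modeled here.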

Paper Structure

This paper contains 38 sections, 10 equations, 6 figures, 1 table, 1 algorithm.

Figures (6)

  • Figure 1: BP-based training in a dual-network configuration, consisting of a forward and a feedback network with distinct, unidirectional synapses. The forward network uses weights $W$ to compute neuron spike times $t$. The feedback network uses weights $B$ to compute neuron errors $\delta$. Neuron errors act as feedback signals triggering weight updates only in the forward network. For correct gradient computation, weight transport is needed to keep $B$ symmetric with $W$ during training.
  • Figure 2: Accuracy drop versus weight transport reduction factor of fBP relative to BP, across different partial weight transport strategies and various numbers of frozen iterations $\Phi$. Each strategy is evaluated by varying its specific hyperparameter. The x-axis uses a logarithmic scale, and y-axis ranges are adapted for each dataset to improve visualization. Best seen in color.
  • Figure 3: Cosine similarity between true and actual weight changes during training on CIFAR-100. Best seen in color.
  • Figure B.1: Accuracy drop of various methods with frozen feedback weights relative to BP, evaluated on VGG-11 for varying numbers of frozen iterations $\Phi$. The y-axis range of the CIFAR-100 plot is larger to accommodate greater accuracy drops. Best seen in color.
  • Figure B.2: Accuracy drop versus weight transport reduction factor of fBP relative to BP on VGG-11, for different partial weight transport strategies and varying numbers of frozen iterations $\Phi$. The top x-axis is logarithmic, and the y-axis ranges differ for each dataset to enhance visualization. Best seen in color.
  • ...and 1 more figure
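The alignment metric named in the Figure 3 caption can be sketched as follows. This assumes that the "true" weight change is the update computed with symmetric feedback weights and the "actual" one is the update computed with the feedback weights in use during training; that reading is an interpretation of the caption, and the function and argument names are hypothetical.

```python
import numpy as np

def cosine_similarity(dw_true, dw_actual, eps=1e-12):
    """Cosine similarity between two weight-change tensors of equal shape.

    Both tensors are flattened; eps guards against division by zero.
    A value near 1 means the two updates point in nearly the same direction.
    """
    a = np.ravel(np.asarray(dw_true, dtype=float))
    b = np.ravel(np.asarray(dw_actual, dtype=float))
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))
```

Values close to 1 would indicate that updates computed with relaxed symmetry still follow the direction of the symmetric-weight updates.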