Exploiting heterogeneous delays for efficient computation in low-bit neural networks

Pengfei Sun, Jascha Achterberg, Zhe Su, Dan F. M. Goodman, Danyal Akarca

TL;DR

The paper investigates whether delay heterogeneity can be learned as a computational resource alongside synaptic weights in spiking neural networks. It demonstrates that delay heterogeneity enables state-of-the-art performance on temporally complex neuromorphic tasks and permits drastic reductions in weight precision, matching the accuracy of weight-only networks with weights compressed by more than an order of magnitude. Through ablations and analyses, the authors show that the importance of long delays is task-dependent, that learned delays compensate for weight quantization, and that neuronal time constants shape the learned delay distribution. They present a compact architecture that optimizes delays and weights together to minimize memory while preserving high accuracy, suggesting temporal structure as a key principle for energy-efficient embodied AI and neuromorphic hardware.

Abstract

Neural networks rely on learning synaptic weights. However, this overlooks other neural parameters that can also be learned and may be utilized by the brain. One such parameter is the delay: the brain exhibits complex temporal dynamics with heterogeneous delays, where signals are transmitted asynchronously between neurons. It has been theorized that this delay heterogeneity, rather than a cost to be minimized, can be exploited in embodied contexts where task-relevant information naturally sits contextually in the time domain. We test this hypothesis by training spiking neural networks to modify not only their weights but also their delays at different levels of precision. We find that delay heterogeneity enables state-of-the-art performance on temporally complex neuromorphic problems and can be achieved even when weights are extremely imprecise (1.58-bit ternary precision: just positive, negative, or absent). By enabling high performance with extremely low-precision weights, delay heterogeneity allows memory-efficient solutions that maintain state-of-the-art accuracy even when weights are compressed over an order of magnitude more aggressively than in typically studied weight-only networks. We show how delays and time constants adaptively trade off, and reveal through ablation that task performance depends on task-appropriate delay distributions, with temporally complex tasks requiring longer delays. Our results suggest temporal heterogeneity is an important principle for efficient computation, particularly when task-relevant information is temporal, as in the physical world, with implications for embodied intelligent systems and neuromorphic hardware.
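
The "1.58-bit" figure quoted above is the information content of a three-valued weight, $\log_2 3 \approx 1.585$ bits. The following is a minimal sketch of ternarization under stated assumptions: the function name `ternarize`, the magnitude-threshold rule, the `threshold_scale` value of 0.7, and the single shared scale factor are all illustrative choices and are not taken from the paper, which may use a different quantizer.

```python
import math


def ternarize(weights, threshold_scale=0.7):
    """Map real-valued weights to codes in {-1, 0, +1} plus one shared scale.

    Assumption (not from the paper): a weight is zeroed when its magnitude
    falls below threshold_scale * mean(|w|); the rest keep only their sign.
    """
    if not weights:
        return [], 0.0
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    delta = threshold_scale * mean_abs
    codes = [1 if w > delta else -1 if w < -delta else 0 for w in weights]
    # One shared scale: the mean magnitude of the weights that survived.
    kept = [abs(w) for w, c in zip(weights, codes) if c != 0]
    scale = sum(kept) / len(kept) if kept else 0.0
    return codes, scale


# Information needed to store one of three states.
BITS_PER_TERNARY_WEIGHT = math.log2(3)

codes, scale = ternarize([0.9, -0.8, 0.05, -0.02])
print(codes, scale, round(BITS_PER_TERNARY_WEIGHT, 3))
```

Each stored code records only whether a connection is positive, negative, or absent, which matches the abstract's description of ternary precision; the shared scale is one extra number per weight group and is an addition of this sketch.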

Paper Structure

This paper contains 9 sections, 16 equations, 12 figures, 2 tables.

Figures (12)

  • Figure 1: Training delays provide parameter-efficient performance improvements according to task temporal complexity. a. Each neuron maintains one learned delay parameter shared across all outgoing connections ($O(n)$ scaling), while synaptic weights require individual parameters for each connection ($O(n^2)$ scaling). Circles represent neurons, lines represent connections. b. Classification accuracy learning curves across four neuromorphic datasets comparing networks with weights only (dashed line) versus weights and axonal delays (solid line). Performance deviation represents standard error across $n=10$ independent runs. Delays consistently improve performance, with benefits varying by task temporal complexity. c. Per-parameter performance analysis shows the efficiency advantage of delays relative to weights. The y-axis shows accuracy improvement per additional parameter. Delays (solid) achieve significantly greater performance gains per parameter than equivalent weight additions (dashed) across all tasks. The size of the circles on the lines corresponds to the number of neurons.
  • Figure 2: Weight-delay trade-offs enable extreme model compression through complementary parameter roles. a. Bit-budget landscape showing classification accuracy as a function of weight bits (y-axis) and delay bits (x-axis) for SHD, b. SSC, and c. NTIDIGITS. For each dataset, the left sub-panel shows the memory footprint of the hidden layers, illustrating that delay bits contribute a small fraction of the total. The landscapes indicate accuracy levels across the combinations of quantization levels, with yellow representing higher performance. Dashed contour lines show iso-performance boundaries. Black circles mark two operating points: Solution I, an accuracy-matched solution with a markedly smaller bit budget, and Solution II, which shows peak accuracy. d. Parameter distribution histograms comparing high-efficiency (Solution I, top panels) versus high-precision (Solution II, bottom panels) approaches for SHD, e. SSC, and f. NTIDIGITS. Left panels show synaptic weight distributions and right panels show axonal delay distributions across both first and second layers. g. Memory-performance Pareto frontier. Each point represents a different quantization configuration, with the x-axis showing memory footprint (log scale) and the y-axis showing classification accuracy. Arrows highlight state-of-the-art performance points achieved with 20$\times$, 20$\times$, and 19$\times$ memory reductions for SHD (left), SSC (middle), and NTIDIGITS (right), respectively.
  • Figure 3: Small numbers of long delay connections dominate performance and can be partially traded off with neural time constants and regularization. a. Weight connection-pruning ablation on SHD, SSC, NTIDIGITS and DVS Gesture. Test accuracy is plotted against the fraction of delays removed either Short$\rightarrow$Long (dark solid line) or Long$\rightarrow$Short (light dashed line). Performance deviation represents standard error across $n=5$ independent runs. Pruning many neurons with short delays has relatively little effect, whereas removing a small set of the longest delays causes a sharp drop in accuracy for the well-controlled datasets (SHD and NTIDIGITS). DVS Gesture shows minimal sensitivity to the removal of long delays, consistent with the lower temporal complexity of the task. b. Time-constant trade-off. With plastic delays, increasing the time constant ($\tau$) systematically shifts the learned delay distributions towards shorter values and reduces dispersion. A non-trivial long tail persists, especially in deeper layers, indicating that longer $\tau$ only partially substitutes for long-range delays. c. Relationship between delay and $L_{2}$ regularization (strength $\lambda$). As $\lambda$ increases, the mean delay magnitude and the long-delay fraction decrease while test accuracy remains high. Delays in earlier layers decrease first, whereas later-layer long delays are preserved.
  • Figure 4: Delay learning, quantization, regularization, and time constants jointly optimize resource efficiency on the SHD dataset. (Top) Test accuracy (%) with error bars representing 3$\sigma$. (Middle) Memory footprint (Mb) demonstrates parameter-efficient gains from delay heterogeneity. (Bottom) Learned axonal delay distributions per layer under distinct configurations. Starting from weight-only feedforward baselines, direct weight quantization reduces memory but degrades accuracy. Adding learned delays with regularization and longer time constants ($\tau$) restores high accuracy while achieving a substantially lower memory footprint. The optimal solution combines 1.58-bit weight quantization, 5-step delay quantization, delay regularization, and fixed $\tau=2$, achieving 90% accuracy with only 0.174 Mbits and sparse long-delay distributions.
  • ...and 7 more figures
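
The scaling argument in Figure 1a and the bit-budget breakdown in Figure 2 can be illustrated with a small memory estimate. This is a sketch under stated assumptions, not the authors' accounting: the function names, the 128-neuron layer size, the 32-bit baseline, and the 3-bit delay code are all hypothetical, and each presynaptic neuron is assumed to carry one delay shared across its outgoing connections, as the Figure 1a caption describes.

```python
import math


def layer_parameter_counts(n_pre, n_post):
    """Count parameters for one fully connected layer.

    Weights: one per connection (n_pre * n_post).
    Delays: one per presynaptic neuron, shared by its outgoing connections.
    """
    return {"weights": n_pre * n_post, "delays": n_pre}


def layer_memory_bits(n_pre, n_post, weight_bits, delay_bits=0.0):
    """Estimate storage in bits, given bits per weight and bits per delay."""
    counts = layer_parameter_counts(n_pre, n_post)
    weight_mem = counts["weights"] * weight_bits
    delay_mem = counts["delays"] * delay_bits
    return {
        "weight_bits": weight_mem,
        "delay_bits": delay_mem,
        "total_bits": weight_mem + delay_mem,
    }


# Hypothetical 128 -> 128 hidden layer (sizes chosen for illustration only).
baseline = layer_memory_bits(128, 128, weight_bits=32)  # full precision, no delays
compact = layer_memory_bits(128, 128, weight_bits=math.log2(3), delay_bits=3)

reduction = baseline["total_bits"] / compact["total_bits"]
delay_share = compact["delay_bits"] / compact["total_bits"]
print(f"memory reduction: {reduction:.1f}x, delay share: {delay_share:.1%}")
```

Because the delay count grows linearly with layer width while the weight count grows quadratically, the delay term stays a small fraction of the total in this toy layer, which is the qualitative pattern the Figure 2 sub-panels describe. The numbers printed here depend entirely on the assumed sizes and bit widths and should not be read as the paper's reported footprints.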