Exploiting heterogeneous delays for efficient computation in low-bit neural networks
Pengfei Sun, Jascha Achterberg, Zhe Su, Dan F. M. Goodman, Danyal Akarca
TL;DR
The paper investigates whether delay heterogeneity can be learned as a computational resource alongside synaptic weights in spiking neural networks. It demonstrates that delay heterogeneity enables state-of-the-art performance on temporally complex neuromorphic tasks and permits drastic reductions in weight precision, maintaining that accuracy with weights compressed over an order of magnitude more aggressively than in typically studied weight-only networks. Through ablations and analyses, the authors show that the importance of delay length is task-dependent, that delays compensate for the effects of weight quantization, and that delays and time constants trade off against each other. They present a compact architecture that optimizes delays and weights together to minimize memory while preserving high accuracy, suggesting temporal structure as a key principle for energy-efficient embodied AI and neuromorphic hardware.
Abstract
Neural networks rely on learning synaptic weights. However, this overlooks other neural parameters that can also be learned and may be utilized by the brain. One such parameter is the delay: the brain exhibits complex temporal dynamics with heterogeneous delays, where signals are transmitted asynchronously between neurons. It has been theorized that this delay heterogeneity, rather than being a cost to be minimized, can be exploited in embodied contexts where task-relevant information is naturally embedded in the time domain. We test this hypothesis by training spiking neural networks to modify not only their weights but also their delays at different levels of precision. We find that delay heterogeneity enables state-of-the-art performance on temporally complex neuromorphic problems, and that this performance can be achieved even when weights are extremely imprecise (1.58-bit ternary precision: just positive, negative, or absent). By enabling high performance with extremely low-precision weights, delay heterogeneity allows memory-efficient solutions that maintain state-of-the-art accuracy even when weights are compressed over an order of magnitude more aggressively than in typically studied weight-only networks. We show how delays and time constants adaptively trade off, and reveal through ablation that task performance depends on task-appropriate delay distributions, with temporally complex tasks requiring longer delays. Our results suggest that temporal heterogeneity is an important principle for efficient computation, particularly when task-relevant information is temporal, as in the physical world, with implications for embodied intelligent systems and neuromorphic hardware.
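To make the two ingredients named in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a fixed magnitude threshold for ternarization and integer delays measured in simulation time steps; the function names (`ternarize`, `apply_delays`, `synaptic_input`) and the `threshold` parameter are hypothetical. The "1.58-bit" figure corresponds to log2(3), the information needed to store one of three weight states.

```python
import math
import numpy as np

# Three possible weight states (positive, negative, absent) need log2(3) bits.
BITS_PER_TERNARY_WEIGHT = math.log2(3)  # about 1.585


def ternarize(w, threshold=0.05):
    """Map real-valued weights to {-1, 0, +1}.

    `threshold` is a hypothetical cutoff chosen for illustration only.
    """
    return np.sign(w) * (np.abs(w) > threshold)


def apply_delays(spikes, delays):
    """Shift each presynaptic spike train by a per-synapse integer delay.

    spikes: (n_pre, T) binary array of presynaptic spikes.
    delays: (n_post, n_pre) array of non-negative integer delays (time steps).
    Returns an (n_post, n_pre, T) array of delayed spike trains.
    """
    n_pre, T = spikes.shape
    n_post = delays.shape[0]
    delayed = np.zeros((n_post, n_pre, T), dtype=spikes.dtype)
    for j in range(n_post):
        for i in range(n_pre):
            d = int(delays[j, i])
            if d < T:
                # A spike emitted at time t arrives at time t + d.
                delayed[j, i, d:] = spikes[i, : T - d]
    return delayed


def synaptic_input(spikes, weights, delays):
    """Weighted sum of delayed spikes arriving at each postsynaptic neuron."""
    delayed = apply_delays(spikes, delays)
    return np.einsum("ji,jit->jt", weights, delayed)
```

In this sketch, two presynaptic spikes emitted at different times can be made to arrive at a postsynaptic neuron simultaneously by choosing different delays, so the timing pattern can be detected even when each weight carries only a sign. This is one intuition for why delays might compensate for coarse weights; the paper's own training procedure and delay parameterization may differ.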
