TDE-3: An improved prior for optical flow computation in spiking neural networks

Matthew Yedutenko, Federico Paredes-Valles, Lyes Khacef, Guido C. H. E. De Croon

TL;DR

Problem: direction-selectivity of TDE-2 degrades in textured scenes, hindering robust optical flow in spiking neural networks. Approach: augment the detector with a third, inhibitory input (TDE-3) that resets the gain, and train the detector with Backpropagation Through Time and surrogate gradients to linearly map input velocity $v$ to output spike count or inter-spike interval (ISI). Contributions: (i) robust direction-selectivity at the single-detector level; (ii) a training framework for spike-count and ISI-based velocity coding, including a novel ISI-based training method; (iii) quantitative comparisons showing angular precision ($\approx 20^{\circ}$) comparable to model-based methods with a 2–4x energy reduction due to fewer spikes; (iv) real-world data validation on translating boxes and a rotating disk; (v) demonstration of real-world applicability in neuromorphic hardware. Significance: enables efficient, neuromorphic-compatible motion estimation with strong robustness to texture, noise, and spatial-frequency variations.

Abstract

Motion detection is a primary task required for robotic systems to perceive and navigate in their environment. The bioinspired neuromorphic Time-Difference Encoder (TDE-2) proposed in the literature combines event-based sensors and processors with spiking neural networks to provide real-time and energy-efficient motion detection by extracting temporal correlations between two points in space. However, on the algorithmic level, this design leads to a loss of direction-selectivity of individual TDEs in textured environments. Here we propose an augmented 3-point TDE (TDE-3) with an additional inhibitory input that makes its direction-selectivity robust in textured environments. We developed a procedure to train the new TDE-3 using backpropagation through time and surrogate gradients to linearly map input velocities into an output spike count or an Inter-Spike Interval (ISI). Our work is the first instance of training a spiking neuron to have a specific ISI. Using synthetic data, we compared training and inference with spike count and ISI with respect to changes in stimulus dynamic range, spatial frequency, and noise level. ISI turns out to be more robust to variation in spatial frequency, whereas spike count is a more reliable training signal in the presence of noise. We performed the first in-depth quantitative investigation of optical flow coding with TDEs and compared TDE-2 and TDE-3 in terms of energy efficiency and coding precision. Results show that on the network level both detectors achieve similar precision (20 degree angular error, 88% correlation with ground truth). Yet, due to the more robust direction-selectivity of individual TDEs, TDE-3-based networks spike less and hence are more energy-efficient. The reported precision is on par with model-based methods, but the spike-based processing of the TDEs allows more energy-efficient inference with neuromorphic hardware.
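
The gain-reset mechanism described in the abstract can be illustrated with a minimal discrete-time sketch in Python. This is not the authors' model: the time constants, threshold, event timings, and the facilitator, trigger, inhibitor pixel order along the preferred direction are all assumptions made for illustration, and the names `run_tde` and `edge_events` are hypothetical.

```python
import math


def edge_events(direction, n_edges, dt, gap):
    """Build input events for edges crossing three neighbouring pixels.

    Assumed pixel order along the preferred direction: fac, trig, inh.
    `dt` is the travel time between pixels, `gap` the delay between edges.
    """
    order = ["fac", "trig", "inh"]
    if direction == "null":  # motion against the preferred direction
        order = order[::-1]
    events = {}
    for k in range(n_edges):
        for i, name in enumerate(order):
            events.setdefault(k * gap + i * dt, []).append(name)
    return events


def run_tde(events, n_steps, use_inhibitor,
            tau_gain=20.0, tau_cur=5.0, tau_mem=5.0, threshold=0.5):
    """Toy TDE: returns the output spike count (all parameters are assumed)."""
    decay_g = math.exp(-1.0 / tau_gain)
    decay_i = math.exp(-1.0 / tau_cur)
    decay_v = math.exp(-1.0 / tau_mem)
    gain = current = v = 0.0
    spikes = 0
    for t in range(n_steps):
        gain *= decay_g
        current *= decay_i
        for name in events.get(t, ()):
            if name == "fac":
                gain = 1.0            # facilitator opens the gain
            elif name == "trig":
                current += gain       # trigger input is scaled by the gain
            elif name == "inh" and use_inhibitor:
                gain = 0.0            # TDE-3 only: inhibitor resets the gain
        v = v * decay_v + current     # leaky integration of the input current
        if v >= threshold:
            spikes += 1
            v = 0.0                   # reset membrane after a spike
    return spikes


if __name__ == "__main__":
    for direction in ("preferred", "null"):
        ev = edge_events(direction, n_edges=2, dt=5, gap=12)
        print(direction,
              "TDE-2:", run_tde(ev, 60, use_inhibitor=False),
              "TDE-3:", run_tde(ev, 60, use_inhibitor=True))
```

In this toy setting, both variants fire for two edges moving in the preferred direction. For two edges moving in the null direction, the TDE-2 variant still fires because the gain set by the first edge has not decayed to zero when the second edge reaches the trigger, whereas the TDE-3 variant stays silent because the second edge hits the inhibitor first and resets the gain.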

Paper Structure (28 sections, 8 equations, 13 figures, 3 tables)

Figures (13)

  • Figure 1: TDE-2 and TDE-3. A. Left-to-right tuned 2-point TDE. It has two compartments: the facilitator and the trigger. When a stimulus moves to the right, the trigger is activated after the facilitator and the neuron fires. However, residual activity remains in the gain. Thus, when multiple textures move left (or orthogonally), the detector loses direction-selectivity. B. Left-to-right tuned 3-point TDE. It has three compartments: the facilitator, the trigger, and the inhibitor. Input to the inhibitor resets the gain to zero and removes residual activity. Therefore, direction-selectivity is retained.
  • Figure 2: Augmented 3-point TDE retains direction-selectivity in a textured environment. A. Visual stimulus composed of vertical bars. The bars had three light intensity levels: white, grey, and black. The size of the texture was 80 pixels along the motion axis and 3 pixels along the orthogonal direction. We employed 5 velocities: 0.1, 0.2, 0.33, 0.5, and 1 px/timestep. The motion direction (left-right, right-left, top-bottom, bottom-top) and velocity were randomly chosen for each stimulus example (2000 examples per testing round, 400 testing rounds). To vary the "amount" of texture in the stimuli, we randomly varied the fraction of grey bars from 0% to 80%. B. TDE-2 and TDE-3 and their responses to stimuli moving in the 4 cardinal directions. C. Direction-selectivity index (fraction of spikes fired upon stimulus motion in the preferred direction, PD).
  • Figure 3: Training of the TDE-3: Wide dynamic range, low resolution, one moving edge. (A) Loss function for spike count-based inference, (B) loss function for ISI-based inference, (C) comparison of the velocity tuning curves. Blue - training with spike count, inference with spike count during the test; Orange - training with spike count, inference with ISI during the test; Green - training with ISI, inference with ISI during the test; Red - training with ISI, inference with spike count during the test.
  • Figure 4: Training of the TDE-3: Narrow dynamic range, high resolution, one moving edge. (A) Loss function for spike count-based inference, (B) loss function for ISI-based inference, and (C) comparison of the velocity tuning curves. Blue - training with spike count, inference with spike count during the test; Orange - training with spike count, inference with ISI during the test; Green - training with ISI, inference with ISI during the test; Red - training with ISI, inference with spike count during the test.
  • Figure 5: Training of the TDE-3: Robustness to variation in spatial frequency. Wide dynamic range, low resolution, two moving edges at randomly assigned distances. (A) Loss function for spike count-based inference, (B) loss function for ISI-based inference, (C) comparison of the velocity tuning curves. Blue - training with spike count, inference with spike count during the test; Orange - training with spike count, inference with ISI during the test; Green - training with ISI, inference with ISI during the test; Red - training with ISI, inference with spike count during the test. As there was no noise, all of the variation in TDE response (shading) originates from the variation in stimulus spatial frequency (spacing between the edges).
  • ...and 8 more figures
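
The direction-selectivity index named in the caption of Figure 2C (the fraction of spikes fired upon stimulus motion in the preferred direction) can be sketched in Python as follows. The function name, the dictionary layout, and the spike counts in the usage example are hypothetical and are not taken from the paper; returning NaN for a silent detector is also an assumption.

```python
def direction_selectivity_index(spike_counts, preferred):
    """Fraction of all spikes that were fired for motion in the preferred direction.

    spike_counts: dict mapping a direction label to the number of output spikes
                  recorded for stimuli moving in that direction.
    preferred:    label of the detector's preferred direction (PD).
    """
    total = sum(spike_counts.values())
    if total == 0:
        return float("nan")  # assumed convention: undefined for a silent detector
    return spike_counts[preferred] / total


if __name__ == "__main__":
    # Made-up spike counts for a left-to-right tuned detector.
    counts = {"left-right": 30, "right-left": 10, "top-bottom": 5, "bottom-top": 5}
    print(direction_selectivity_index(counts, "left-right"))
```

Under this definition, an index of 1 means the detector fires only for motion in its preferred direction, while an index of 0.25 across the four cardinal directions means it fires equally often in every direction.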