Best of Both Worlds: Hybrid SNN-ANN Architecture for Event-based Optical Flow Estimation

Shubham Negi, Deepika Sharma, Adarsh Kumar Kosta, Kaushik Roy

TL;DR

A novel SNN-ANN hybrid architecture that uses the asynchronous compute capabilities of SNN layers to extract temporal information from the input, while ANN layers ease training. An extensive experimental analysis of which layers should be spiking or analog yields a network configuration optimized for performance and ease of training.

Abstract

In the field of robotics, event-based cameras are emerging as a promising low-power alternative to traditional frame-based cameras for capturing high-speed motion and high dynamic range scenes. This is due to their sparse and asynchronous event outputs. Spiking Neural Networks (SNNs), with their asynchronous event-driven compute, show great potential for extracting spatio-temporal features from these event streams. In contrast, standard Analog Neural Networks (ANNs) fail to process event data effectively. However, training SNNs is difficult due to additional trainable parameters (thresholds and leaks), vanishing spikes at deeper layers, and a non-differentiable binary activation function. Furthermore, an additional data structure, the membrane potential, which keeps track of temporal information, must be fetched and updated at every timestep in SNNs. To overcome these challenges, we propose a novel SNN-ANN hybrid architecture that combines the strengths of both. Specifically, we leverage the asynchronous compute capabilities of SNN layers to effectively extract the input temporal information. Concurrently, the ANN layers facilitate training and efficient deployment on traditional machine learning hardware such as GPUs. We provide extensive experimental analysis for assigning each layer to be spiking or analog, leading to a network configuration optimized for performance and ease of training. We evaluate our hybrid architecture for optical flow estimation on the DSEC-flow and Multi-Vehicle Stereo Event-Camera (MVSEC) datasets. On the DSEC-flow dataset, the hybrid SNN-ANN architecture achieves a 40% reduction in average endpoint error (AEE) with 22% lower energy consumption compared to Full-SNN, and 48% lower AEE compared to Full-ANN, while maintaining comparable energy usage.
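To make the membrane-potential bookkeeping mentioned in the abstract concrete, here is a minimal sketch of a leaky integrate-and-fire (LIF) layer in NumPy. It is not the authors' implementation: the function names `lif_step` and `run_lif`, the fixed `leak` and `threshold` values, and the hard reset after a spike are all illustrative assumptions (the abstract states that thresholds and leaks are trainable, which this sketch does not model).

```python
import numpy as np

def lif_step(v, x, leak=0.9, threshold=1.0):
    """One timestep of an illustrative LIF layer (assumed hard reset)."""
    # Decay the stored membrane potential, then integrate the new input.
    v = leak * v + x
    # Binary spike output: a step function, hence non-differentiable.
    spikes = (v >= threshold).astype(v.dtype)
    # Assumption: neurons that fired are reset to zero (hard reset).
    v = v * (1.0 - spikes)
    return v, spikes

def run_lif(inputs, leak=0.9, threshold=1.0):
    """Run the layer over inputs of shape (T, ...), one step per timestep."""
    v = np.zeros_like(inputs[0], dtype=np.float64)
    out = []
    for x in inputs:
        # The membrane potential is state that is read and written every timestep.
        v, s = lif_step(v, x, leak, threshold)
        out.append(s)
    return np.stack(out), v

if __name__ == "__main__":
    spikes, v_final = run_lif(np.array([[0.6], [0.6], [0.6]]))
    print(spikes.ravel(), v_final)
```

The loop shows why the extra state matters for hardware: each timestep needs both a read and a write of `v` for every neuron, in addition to the usual weight and activation traffic.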

Paper Structure (16 sections, 7 equations, 7 figures, 2 tables)


Figures (7)

  • Figure 1: AEE (lower is better) vs. inference energy for different versions of EV-FlowNet [evflow] on the DSEC-flow and MVSEC datasets, deployed on Eyeriss hardware [chen2016eyeriss]. The proposed hybrid SNN-ANN shows lower AEE and energy than its Full-ANN, SNN, and RNN counterparts. The shaded blue region shows the preferred region.
  • Figure 2: Event stream between two grayscale images. The event streams are binned and fed to the networks directly.
  • Figure 3: Neuronal Dynamics of a LIF neuron.
  • Figure 4: Hybrid SNN-ANN architecture based on (a) EV-FlowNet [evflow] and (b) Fire-FlowNet [fireflownet]. The activation layers can be ReLU or LIF, depending on whether the neural network is an ANN, SNN, or hybrid SNN-ANN. Numbers below the layers show the number of output channels in each layer. In the input dimension, $T$ is the timestep and $2$ is the number of input channels, one each for positive and negative polarity. (c) Conv-RNN based recurrent layer [hagenaars2021self].
  • Figure 5: Ablation study results on the DSEC-flow dataset for (a) number of spiking layers, (b) spiking layer position, and (c) spiking layer position with 2 spiking layers in the model. L1, L2, R1, R2, L3 are the layers in the Fire-FlowNet architecture from Fig. 4(b).
  • ...and 2 more figures
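
The AEE metric reported in the abstract and in Figure 1 is, in its standard form, the mean Euclidean distance between predicted and ground-truth flow vectors. The sketch below shows that standard definition in NumPy; the function name `average_endpoint_error`, the `(H, W, 2)` array layout, and the optional boolean validity mask are assumptions for illustration, not details taken from the paper's evaluation code.

```python
import numpy as np

def average_endpoint_error(pred, gt, mask=None):
    """Mean Euclidean distance between predicted and ground-truth flow.

    pred, gt: arrays of shape (H, W, 2) holding (u, v) flow per pixel (assumed layout).
    mask: optional boolean array of shape (H, W) selecting valid pixels (assumption).
    """
    # Per-pixel endpoint error: L2 norm of the flow difference.
    epe = np.linalg.norm(pred - gt, axis=-1)
    if mask is not None:
        # Keep only pixels with valid ground truth.
        epe = epe[mask]
    return float(epe.mean())

if __name__ == "__main__":
    pred = np.zeros((2, 2, 2))
    gt = np.zeros((2, 2, 2))
    gt[0, 0] = (3.0, 4.0)  # one pixel is off by a 3-4-5 triangle
    print(average_endpoint_error(pred, gt))
```

Event-camera ground truth is often sparse, which is why a validity mask is included here as an optional argument.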