Event-Stream Super Resolution using Sigma-Delta Neural Network

Waseem Shariff, Joe Lemley, Peter Corcoran

TL;DR

This work tackles the limited spatial resolution of neuromorphic event cameras by introducing an end-to-end Sigma-Delta Neural Network (SDNN) that fuses binary spikes with sigma-delta modulation to perform event-stream super-resolution. The method leverages temporal difference and integration (ΔT, ΣT) to learn spatio-temporal distributions while operating on sparse, continuous-time representations, aided by a convolutional encoder–decoder architecture and a PSTH-based loss that combines $\mathrm{Loss}^{\mathrm{Temporal}}$ and $\mathrm{Loss}^{\mathrm{Spatial}}$. Across N-MNIST, CIFAR10-DVS, ASL-DVS, and E-NFS, SDNN achieves superior RMSE and PSNR while improving computational efficiency, with up to 17.04× higher event sparsity and 32.28× fewer synaptic operations than an equivalent ANN, and roughly 2× better performance than SNNs. The approach shows strong potential for real-time, energy-efficient event-based SR and for downstream tasks such as object recognition, with demonstrated gains on multiple benchmarks and clear directions for hardware deployment and further optimization.
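The temporal difference and integration steps (ΔT, ΣT) mentioned above can be illustrated with a minimal NumPy sketch. This is a generic sigma-delta encode/decode loop under stated assumptions, not the authors' network or the Lava implementation: the function names `delta_encode` and `sigma_decode` and the fixed `threshold` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np


def delta_encode(x, threshold):
    """Delta step (illustrative): send a graded message only when the input
    has drifted from the receiver's current estimate by at least `threshold`.

    x has time on axis 0, e.g. shape (T,) or (T, H, W).
    """
    x = np.asarray(x, dtype=float)
    ref = np.zeros_like(x[0])      # what the receiver currently believes
    messages = np.zeros_like(x)    # sparse output, mostly zeros
    for t in range(x.shape[0]):
        diff = x[t] - ref
        send = np.abs(diff) >= threshold
        messages[t] = np.where(send, diff, 0.0)
        ref = ref + messages[t]    # the estimate moves only when a message is sent
    return messages


def sigma_decode(messages):
    """Sigma step (illustrative): integrate the sparse messages over time."""
    return np.cumsum(messages, axis=0)


# Usage on a slowly varying test signal (an assumption, not data from the paper).
t = np.linspace(0.0, 2.0 * np.pi, 200)
signal = np.sin(t)
msgs = delta_encode(signal, threshold=0.1)
recon = sigma_decode(msgs)
sparsity = 1.0 - np.count_nonzero(msgs) / msgs.size
```

In this sketch the reconstruction error stays below the threshold at every time step, while most entries of `msgs` are zero. That trade between message sparsity and fidelity is the property that the reported sparsity and synaptic-operation figures quantify.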

Abstract

This study introduces a novel approach to enhancing the spatio-temporal resolution of the time-event pixels that event cameras generate in response to luminance changes. These cameras present unique challenges due to their low resolution and the sparse, asynchronous nature of the data they collect. Current event super-resolution algorithms are not fully optimized for the distinct data structure produced by event cameras, resulting in inefficiencies in capturing the full dynamism and detail of visual scenes at low computational cost. To bridge this gap, our research proposes a method that integrates binary spikes with Sigma-Delta Neural Networks (SDNNs), leveraging a spatiotemporal constraint learning mechanism designed to simultaneously learn the spatial and temporal distributions of the event stream. The proposed network is evaluated on widely recognized benchmark datasets, including N-MNIST, CIFAR10-DVS, ASL-DVS, and Event-NFS. A comprehensive evaluation framework is employed, assessing both the accuracy, through root mean square error (RMSE), and the computational efficiency of our model. The findings demonstrate significant improvements over existing state-of-the-art methods. Specifically, the proposed method surpasses the state of the art in computational efficiency, achieving a 17.04-fold improvement in event sparsity and a 32.28-fold improvement in synaptic operation efficiency over traditional artificial neural networks, alongside two-fold better performance than spiking neural networks.
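The evaluation quantities named above can be sketched as follows. This is a minimal illustration using the standard definitions of RMSE and PSNR, not the authors' evaluation code. The helper names, the `peak` argument, and all numbers in the usage lines are assumptions made for the example. `fold_improvement` expresses a "k-fold" figure as a plain ratio of a baseline cost (for instance, a synaptic-operation count) to the model's cost.

```python
import numpy as np


def rmse(pred, target):
    """Root mean square error between two arrays of the same shape."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.sqrt(np.mean((pred - target) ** 2)))


def psnr(pred, target, peak):
    """Peak signal-to-noise ratio in dB; `peak` is the maximum possible value."""
    err = rmse(pred, target)
    if err == 0.0:
        return float("inf")  # identical arrays
    return float(20.0 * np.log10(peak / err))


def fold_improvement(baseline_cost, model_cost):
    """Ratio of baseline cost to model cost (larger means a cheaper model)."""
    return baseline_cost / model_cost


# Toy event-count frames, invented for illustration only.
target = np.array([[0, 2], [4, 2]])
pred = np.array([[0, 1], [4, 3]])
frame_rmse = rmse(pred, target)
frame_psnr = psnr(pred, target, peak=4.0)
ops_ratio = fold_improvement(baseline_cost=3228.0, model_cost=100.0)
```

The choice of `peak` depends on how the event frames are normalized, which this excerpt does not specify.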

Paper Structure (21 sections, 6 equations, 4 figures, 5 tables)

Figures (4)

  • Figure 1: Overview of the proposed system, which combines binary spike inputs with the SDNN's temporal difference mechanism to further enhance event-stream resolution.
  • Figure 2: A sigma-delta neural network dynamically processing event signals [lava].
  • Figure 3: Sigma-Delta Neural Network architecture for enhancing event-stream data: The input event-stream (left) is processed through the proposed neural network, which includes convolution and deconvolution layers, resulting in a predicted event-stream (right) with improved spatial and temporal resolution.
  • Figure 4: Qualitative analysis on the ASL-DVS dataset, comparing the SNN approach [snn] with the proposed Sigma-Delta Neural Network. Each row presents reconstructed images of the LR input, the HR ground truth, the SNN-predicted HR output, and the HR output predicted by the proposed SDNN, allowing for a direct comparison. (Best viewed at 2× zoom.)