Using CSNNs to Perform Event-based Data Processing & Classification on ASL-DVS

Ria Patel, Sujit Tripathy, Zachary Sublett, Seoyoung An, Riya Patel

TL;DR

This work demonstrates a convolutional spiking neural network (CSNN) approach to classifying gestures in the event-based ASL-DVS dataset. It details a pipeline that takes neuromorphic data from AEDAT formats through DV Processing to a CSNN with three convolutional layers and a 24-neuron fully connected layer, trained with Adam and surrogate-gradient backpropagation through time. The study reports 100% training accuracy on a subset and 81% validation accuracy, highlighting the potential of CSNNs for asynchronous, sparse event streams while acknowledging overfitting. The results underscore the practicality of neuromorphic methods for hand-gesture recognition and point to regularization as an avenue for improving generalization in future work.

Abstract

Recent advancements in bio-inspired visual sensing and neuromorphic computing have led to the development of various highly efficient solutions with real-world applications. One notable application integrates event-based cameras with spiking neural networks (SNNs) to process event-based sequences, which are asynchronous and sparse and therefore difficult to handle. In this project, we develop a convolutional spiking neural network (CSNN) architecture that leverages convolutional operations and the recurrent properties of a spiking neuron to learn the spatial and temporal relations in the ASL-DVS gesture dataset. The ASL-DVS gesture dataset is a neuromorphic dataset containing hand gestures for 24 letters of American Sign Language (ASL) (A to Y, excluding J and Z due to the nature of their symbols). We performed classification on a pre-processed subset of the full ASL-DVS dataset to identify letter signs and achieved 100% training accuracy. Specifically, this was achieved by training on the Google Cloud compute platform with a learning rate of 0.0005, a batch size of 25 (20 batches in total), 200 iterations, and 10 epochs.
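
The "recurrent properties of a spiking neuron" mentioned above can be illustrated with a minimal leaky integrate-and-fire (LIF) sketch. This is not the paper's implementation: the decay factor `beta`, the threshold, the reset-by-subtraction rule, and the function names `lif_step` and `run_lif` are all assumptions chosen for illustration. The point it shows is that the membrane potential is carried from one time step to the next, so the neuron's output depends on the history of its input.

```python
def lif_step(mem, inp, beta=0.9, threshold=1.0):
    """One discrete-time LIF update (illustrative parameters, not from the paper)."""
    mem = beta * mem + inp                 # leak the previous potential, integrate input
    spike = 1 if mem >= threshold else 0   # fire when the threshold is reached
    mem -= spike * threshold               # assumed reset rule: subtract the threshold
    return spike, mem


def run_lif(inputs, beta=0.9, threshold=1.0):
    """Run one neuron over a sequence of inputs, carrying its state across steps."""
    mem = 0.0
    spikes = []
    for x in inputs:
        spike, mem = lif_step(mem, x, beta, threshold)
        spikes.append(spike)
    return spikes


# A constant sub-threshold input still produces spikes once enough charge accumulates.
print(run_lif([0.5] * 6))  # [0, 0, 1, 0, 1, 0]
```

Because the spike decision is a hard threshold, its derivative is zero almost everywhere; surrogate-gradient training replaces that derivative with a smooth approximation during the backward pass, which is what makes backpropagation through time usable for networks of such neurons.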

Paper Structure

This paper contains 24 sections, 3 equations, 13 figures, and 1 table.

Figures (13)

  • Figure 1: LIF operation in SNN
  • Figure 2: Pipeline of our approach
  • Figure 3: Model of CSNN Network Architecture
  • Figure 4: Calculation of Size of CSNN Layers
  • Figure 5: ASL-DVS dataset components
  • ...and 8 more figures