Using CSNNs to Perform Event-based Data Processing & Classification on ASL-DVS
Ria Patel, Sujit Tripathy, Zachary Sublett, Seoyoung An, Riya Patel
TL;DR
This work demonstrates a convolutional spiking neural network (CSNN) approach to processing event-based ASL-DVS data for gesture classification. It details a pipeline that reads neuromorphic recordings in AEDAT format through DV Processing and feeds them to a CSNN with three convolutional layers and a 24-neuron fully connected layer, trained with Adam using backpropagation through time with surrogate gradients. The study reports 100% training accuracy and 81% validation accuracy on a subset of the dataset, highlighting the potential of CSNNs for asynchronous, sparse event streams while acknowledging overfitting. The results underscore the practicality of neuromorphic methods for hand-gesture recognition and point to regularization and improved generalization as directions for future work.
Abstract
Recent advancements in bio-inspired visual sensing and neuromorphic computing have led to a range of highly efficient bio-inspired solutions with real-world applications. One notable application pairs event-based cameras with spiking neural networks (SNNs) to process event-based sequences, which are asynchronous and sparse and therefore difficult to handle with conventional frame-based methods. In this project, we develop a convolutional spiking neural network (CSNN) architecture that leverages convolutional operations and the recurrent properties of a spiking neuron to learn the spatial and temporal relations in the ASL-DVS gesture dataset. ASL-DVS is a neuromorphic dataset of hand gestures for 24 letters of American Sign Language (ASL): A to Y, excluding J and Z due to the nature of their symbols. We performed classification on a pre-processed subset of the full ASL-DVS dataset to identify letter signs and achieved 100% training accuracy. Specifically, this was achieved by training on the Google Cloud compute platform with a learning rate of 0.0005, a batch size of 25 (20 batches in total), 200 iterations, and 10 epochs.
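To make the "recurrent properties of a spiking neuron" concrete, the following is a minimal sketch of a single leaky integrate-and-fire (LIF) neuron in plain Python. It is an illustration only: the LIF model, the decay factor `beta`, the firing `threshold`, and the reset-by-subtraction rule are assumptions chosen for clarity, not the neuron model or parameters reported for this network. The point it shows is that the membrane potential carries state from one time step to the next, which is how a spiking layer can integrate sparse events over time.

```python
def lif_forward(inputs, beta=0.9, threshold=1.0):
    """Simulate one leaky integrate-and-fire neuron over a list of input currents.

    beta and threshold are illustrative assumptions, not values from the paper.
    Returns the binary spike train and the membrane potential trace.
    """
    mem = 0.0          # membrane potential, the neuron's recurrent state
    prev_spike = 0     # spike emitted at the previous time step
    spikes, trace = [], []
    for current in inputs:
        # Leak the previous potential, add the new input, and subtract the
        # threshold if the neuron fired on the previous step (soft reset).
        mem = beta * mem + current - prev_spike * threshold
        prev_spike = 1 if mem > threshold else 0
        spikes.append(prev_spike)
        trace.append(mem)
    return spikes, trace


if __name__ == "__main__":
    # Three weak inputs accumulate until the threshold is crossed once.
    spikes, _ = lif_forward([0.6, 0.6, 0.6, 0.0, 0.0], beta=0.5)
    print(spikes)  # [0, 0, 1, 0, 0]
```

No single input of 0.6 exceeds the threshold in this example; the neuron fires only because earlier inputs are still partly retained in the membrane potential. During training, the non-differentiable threshold step is typically replaced by a smooth surrogate in the backward pass, which is what allows backpropagation through time to be applied to such a neuron.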
