Pushing the boundaries of event subsampling in event-based video classification using CNNs
Hesam Araghi, Jan van Gemert, Nergis Tomen
TL;DR
The paper addresses how to reduce the data burden of event cameras for CNN-based video classification by studying random event subsampling and its impact on accuracy. It uses the EST algorithm to convert events to 18-channel frames and trains a ResNet34 across multiple neuromorphic datasets, redrawing the random subsample at every training epoch and averaging test results over 20 random draws. Key findings are that event counts can be reduced by an order of magnitude with minimal accuracy loss, and that training becomes more sensitive to hyperparameters in sparse regimes. The paper introduces a novel hyperparameter sensitivity metric to quantify this effect and analyzes the diversity of the network's weight gradients to gain insight into it. The work offers practical guidance for edge AI deployments, notes limitations on certain datasets, and suggests future extensions to other architectures and sparsity mitigation approaches.
Abstract
Event cameras offer low-power visual sensing capabilities ideal for edge-device applications. However, their high event rate, driven by fine temporal detail, can be restrictive in terms of bandwidth and computational resources. In edge AI applications, determining the minimum number of events needed for a specific task allows the event rate to be reduced, improving bandwidth, memory, and processing efficiency. In this paper, we study the effect of event subsampling on the accuracy of event data classification using convolutional neural network (CNN) models. Surprisingly, across various datasets, the number of events per video can be reduced by an order of magnitude with little drop in accuracy, revealing the extent to which we can push the boundaries in the accuracy vs. event rate trade-off. We also find that lower classification accuracy at high subsampling rates is not solely attributable to information loss due to the subsampling of the events: training CNNs can itself be challenging in highly subsampled scenarios, where the sensitivity to hyperparameters increases. We quantify training instability across multiple event-based classification datasets using a novel metric for evaluating the hyperparameter sensitivity of CNNs in different subsampling settings. Finally, we analyze the weight gradients of the network to gain insight into this instability.
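
The random event subsampling and repeated-draw evaluation described above can be illustrated with a minimal sketch. This is not the authors' code. It assumes each video is a NumPy array of events with hypothetical columns (x, y, t, polarity), that subsampling is uniform without replacement, and that `classify` is a placeholder for any trained model mapping an event array to a label (for example, an event-to-frame conversion followed by a CNN). The names `subsample_events` and `mean_test_accuracy` are introduced here for illustration only.

```python
import numpy as np


def subsample_events(events, keep_fraction, rng):
    """Keep a uniformly random subset of events, preserving their order.

    events: array of shape (N, 4); the column layout (x, y, t, polarity)
            is an assumption made for this sketch.
    keep_fraction: fraction of events to keep, e.g. 0.1 for a 10x reduction.
    """
    n = len(events)
    if n == 0:
        return events
    n_keep = min(n, max(1, int(round(n * keep_fraction))))
    idx = rng.choice(n, size=n_keep, replace=False)
    idx.sort()  # keep the surviving events in their original (temporal) order
    return events[idx]


def mean_test_accuracy(classify, videos, labels, keep_fraction,
                       n_draws=20, seed=0):
    """Average test accuracy over several independent subsampling draws.

    classify: hypothetical callable mapping an event array to a class label.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    accuracies = []
    for _ in range(n_draws):
        preds = np.array(
            [classify(subsample_events(v, keep_fraction, rng)) for v in videos]
        )
        accuracies.append(np.mean(preds == labels))
    return float(np.mean(accuracies))
```

During training, the same `subsample_events` call could be made inside the data loader so that every epoch sees a fresh random draw of each video, while `mean_test_accuracy` averages out the randomness of a single draw at test time.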
