Enhanced Neuromorphic Semantic Segmentation Latency through Stream Event
D. Hareb, J. Martinet, B. Miramond
TL;DR
This work addresses real-time semantic segmentation for resource-constrained environments by leveraging event streams from neuromorphic cameras and a Spiking Neural Network (SNN). A dynamic, region-based strategy compares the mean event count between successive frames to a threshold $\theta$: regions with significant changes are processed by a lightweight SegSNNnet, while the keyframe segmentation is reused for the remaining regions, where little has changed. The SegSNNnet backbone, built from Spike-Element-Wise (SEW) blocks and leaky integrate-and-fire (LIF) neurons trained with surrogate gradients, achieves low energy consumption and is suited to neuromorphic hardware such as Loihi and SPLEAT. On the DSEC-semantic dataset, the method delivers substantial throughput gains (up to $5\times$–$10\times$ in FPS) with a modest mIoU loss of about $2$–$3\%$, demonstrating a practical balance between latency, accuracy, and energy efficiency for dynamic, embedded perception tasks.
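The region-based gating described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that events between two successive frames have already been accumulated into a per-pixel count map, that regions are square tiles of a fixed size, and that a region is re-segmented when its mean event count per pixel exceeds $\theta$. The names `region_means`, `update_segmentation`, and `segment_region` are hypothetical; `segment_region` is a placeholder for a call to a network such as SegSNNnet.

```python
from typing import Callable, List, Tuple

Grid = List[List[int]]  # 2D map stored as rows of pixel values


def region_means(event_counts: Grid, region: int) -> List[List[float]]:
    """Mean event count per pixel for each region x region tile (assumed tiling)."""
    height, width = len(event_counts), len(event_counts[0])
    means = []
    for top in range(0, height, region):
        row = []
        for left in range(0, width, region):
            tile = [
                event_counts[y][x]
                for y in range(top, min(top + region, height))
                for x in range(left, min(left + region, width))
            ]
            row.append(sum(tile) / len(tile))
        means.append(row)
    return means


def update_segmentation(
    keyframe_labels: Grid,
    event_counts: Grid,
    region: int,
    theta: float,
    segment_region: Callable[[int, int, int, int], Grid],
) -> Tuple[Grid, int]:
    """Re-segment only tiles whose mean event count exceeds theta.

    segment_region(top, left, bottom, right) is a hypothetical stand-in for
    the segmentation network; it must return a label patch of that size.
    Returns the updated label map and the number of tiles that were processed.
    """
    height, width = len(event_counts), len(event_counts[0])
    labels = [row[:] for row in keyframe_labels]  # start from the keyframe result
    processed = 0
    for i, row in enumerate(region_means(event_counts, region)):
        for j, mean in enumerate(row):
            if mean <= theta:
                continue  # few events: keep the keyframe labels for this tile
            top, left = i * region, j * region
            bottom, right = min(top + region, height), min(left + region, width)
            patch = segment_region(top, left, bottom, right)
            for y in range(top, bottom):
                labels[y][left:right] = patch[y - top]
            processed += 1
    return labels, processed
```

In this sketch, the returned tile count gives a rough proxy for the work saved: the fewer tiles exceed $\theta$, the fewer network calls are needed per frame. That is the mechanism behind the latency and accuracy trade-off, since a larger $\theta$ reuses more of the keyframe segmentation.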
Abstract
Achieving optimal semantic segmentation with frame-based vision sensors poses significant challenges for real-time systems such as UAVs and self-driving cars, which require rapid and precise processing. Traditional frame-based methods often struggle to balance latency, accuracy, and energy efficiency. To address these challenges, we leverage event streams from event-based cameras, bio-inspired sensors that trigger events in response to changes in the scene. Specifically, we analyze the number of events triggered between successive frames: a high number indicates significant changes and a low number indicates minimal changes. We exploit this event information to solve the semantic segmentation task with a Spiking Neural Network (SNN), a bio-inspired computing paradigm known for its low energy consumption. Our experiments on the DSEC dataset show that our approach significantly reduces latency with only a limited drop in accuracy. Additionally, by using SNNs, we achieve low power consumption, making our method suitable for energy-constrained real-time applications. To the best of our knowledge, our approach is the first to effectively balance reduced latency, minimal accuracy loss, and energy efficiency by using event streams to enhance semantic segmentation in dynamic and resource-limited environments.
