Flow-Based Visual Stream Compression for Event Cameras
Daniel C. Stumpp, Himanshu Akolkar, Alan D. George, Ryad Benosman
TL;DR
This work tackles the challenge of high data rates from neuromorphic, event-based vision sensors by proposing Flow-Based Compression (FBC), which uses real-time optical-flow estimates to predict future events and reduce transmission needs in an asynchronous, stream-to-stream framework. The method achieves an average compression ratio of $CR\approx 2.81$ with about $68\%$ of events removed and a median temporal error of $0.48\ \mathrm{ms}$, while maintaining a low spatiotemporal distance of $3.07$ when reconstructions are compared to originals. Furthermore, cascading FBC with LZMA yields substantial gains (up to $CR\approx 29.16$ on some datasets), approaching state-of-the-art compression for non-real-time applications. Real-time prediction is demonstrated on both desktop and embedded platforms, supporting practical deployment in bandwidth- and power-constrained environments, with room for further improvements as optical-flow estimates improve.
Abstract
As the use of neuromorphic, event-based vision sensors expands, the need for compression of their output streams has increased. While their operational principle ensures event streams are spatially sparse, the high temporal resolution of the sensors can result in high data rates, depending on scene dynamics. For systems operating in communication-bandwidth-constrained and power-constrained environments, it is essential to compress these streams before transmitting them to a remote receiver. Therefore, we introduce a flow-based method for the real-time asynchronous compression of event streams as they are generated. This method leverages real-time optical flow estimates to predict future events without needing to transmit them, thereby drastically reducing the amount of data transmitted. The flow-based compression introduced is evaluated using a variety of methods, including the spatiotemporal distance between event streams. The introduced method itself is shown to achieve an average compression ratio of 2.81 on a variety of event-camera datasets with the evaluation configuration used. That compression is achieved with a median temporal error of 0.48 ms and an average spatiotemporal event-stream distance of 3.07. When combined with LZMA compression for non-real-time applications, our method can achieve state-of-the-art average compression ratios ranging from 10.45 to 17.24. Additionally, we demonstrate that the proposed prediction algorithm is capable of performing real-time, low-latency event prediction.
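The core idea, predicting future events from optical flow so that they need not be transmitted, can be illustrated with a toy sketch in Python. This is not the authors' algorithm: the `Event` type, the functions `predict_next_event` and `compress`, the single constant flow vector, the one-pixel prediction step, and the tolerance `tol_s` are all assumptions made for illustration. The ratio computed at the end counts events, which may differ from how the paper measures compression ratio.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    x: int      # pixel column
    y: int      # pixel row
    t: float    # timestamp in seconds
    p: int      # polarity (+1 or -1)


def predict_next_event(ev, vx, vy):
    """Predict the event an edge moving with flow (vx, vy) in px/s triggers next.

    Hypothetical helper: steps one pixel along the rounded flow direction and
    returns None when there is no motion to extrapolate.
    """
    speed = math.hypot(vx, vy)
    if speed == 0:
        return None
    dx = round(vx / speed)
    dy = round(vy / speed)
    dt = math.hypot(dx, dy) / speed  # time for the edge to reach that pixel
    return Event(ev.x + dx, ev.y + dy, ev.t + dt, ev.p)


def compress(events, flow, tol_s=0.5e-3):
    """Return only the events that were not predicted within tol_s seconds.

    Simplification: predictions are made from the true events; a real codec
    would have to mirror exactly what the receiver can reconstruct.
    """
    pending = {}  # (x, y, p) -> predicted timestamp
    sent = []
    for ev in events:
        t_pred = pending.pop((ev.x, ev.y, ev.p), None)
        if t_pred is None or abs(ev.t - t_pred) > tol_s:
            sent.append(ev)  # unpredicted event: must be transmitted
        nxt = predict_next_event(ev, *flow)
        if nxt is not None:
            pending[(nxt.x, nxt.y, nxt.p)] = nxt.t
    return sent


# Synthetic edge moving right at 1000 px/s: one event per pixel every 1 ms.
events = [Event(x=i, y=0, t=i * 0.001, p=1) for i in range(5)]
sent = compress(events, flow=(1000.0, 0.0))
count_ratio = len(events) / len(sent)
```

In this idealized case only the first event needs to be sent, because every later event is predicted exactly. Real streams contain noise and imperfect flow estimates, so fewer events are predictable and the reconstructed timestamps carry some error, which is what the temporal-error and spatiotemporal-distance metrics in the abstract quantify.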
