Lightweight Event-based Optical Flow Estimation via Iterative Deblurring
Yilun Wu, Federico Paredes-Vallés, Guido C. H. E. de Croon
TL;DR
Event-based optical flow methods are often bottlenecked by correlation-volume computations that incur high latency and memory usage. IDNet introduces a correlation-volume-free approach that estimates flow from continuous event traces using iterative deblurring with a ConvGRU backbone. It offers two update schemes: ID, which iterates over the same batch of events, and TID, which iterates over time on streaming events. The method achieves near state-of-the-art accuracy on DSEC-Flow with far fewer parameters and much lower memory use, and TID enables real-time operation on embedded hardware while still maintaining strong performance at higher resolutions. This work demonstrates that iterative deblurring and temporal priors can yield highly efficient, scalable flow estimation suitable for resource-constrained robotic systems.
Abstract
Inspired by frame-based methods, state-of-the-art event-based optical flow networks rely on the explicit construction of correlation volumes, which are expensive to compute and store, rendering them unsuitable for robotic applications with limited compute and energy budgets. Moreover, correlation volumes scale poorly with resolution, prohibiting them from estimating high-resolution flow. We observe that the spatiotemporally continuous traces of events provide a natural search direction for seeking pixel correspondences, obviating the need to rely on gradients of explicit correlation volumes as such search directions. We introduce IDNet (Iterative Deblurring Network), a lightweight yet high-performing event-based optical flow network that estimates flow directly from event traces without using correlation volumes. We further propose two iterative update schemes: "ID", which iterates over the same batch of events, and "TID", which iterates over time with streaming events in an online fashion. Our top-performing ID model sets a new state of the art on the DSEC benchmark. Meanwhile, the base ID model is competitive with prior art while using 80% fewer parameters, having a 20x smaller memory footprint, and running 40% faster on the NVIDIA Jetson Xavier NX. Furthermore, the TID model is even more efficient, offering a further 5x inference speedup and an ultra-low latency of 8 ms at the cost of only a 9% performance drop, making it the only model in the current literature capable of real-time operation while maintaining decent performance.
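The difference between the two update schemes can be illustrated with a toy sketch. This is not the IDNet network. The learned update is replaced by a simple least-squares fit, and the names `deblur`, `residual_motion`, `blur`, `make_window`, and the damping factor `step` are all hypothetical, chosen only to show the control flow: ID refines the flow repeatedly on one batch of events, while TID applies one update per incoming window and carries the estimate forward.

```python
import numpy as np

def deblur(xy, t, flow, t_ref=0.0):
    """Move each event back along the flow to where it would be at t_ref."""
    return xy - (t - t_ref)[:, None] * flow

def residual_motion(xy, t):
    """Stand-in for the learned update: least-squares slope of position vs. time."""
    return np.polyfit(t, xy, 1)[0]

def blur(xy):
    """Crude blur measure: total variance of the event positions."""
    return float(np.var(xy, axis=0).sum())

def make_window(start, true_flow, n=50):
    """Toy events of one point moving at constant velocity during one window."""
    t = np.linspace(0.0, 1.0, n)
    return start + t[:, None] * true_flow, t

true_flow = np.array([4.0, -2.0])  # pixels per window (toy value)
step = 0.5                          # damping, so several iterations are needed

# ID-style: iterate several times over the same batch of events.
xy, t = make_window(np.array([10.0, 20.0]), true_flow)
flow_id = np.zeros(2)
for _ in range(4):
    flow_id = flow_id + step * residual_motion(deblur(xy, t, flow_id), t)

# TID-style: one update per incoming window, warm-started from the previous flow.
flow_tid = np.zeros(2)
start = np.array([10.0, 20.0])
for _ in range(4):
    xy_k, t_k = make_window(start, true_flow)
    flow_tid = flow_tid + step * residual_motion(deblur(xy_k, t_k, flow_tid), t_k)
    start = start + true_flow  # the point keeps moving into the next window

print("blur before:", blur(xy), "after:", blur(deblur(xy, t, flow_id)))
```

In this toy setting the motion is constant, so both loops reach the same estimate. The point of the TID-style loop is that each window costs a single update, which is the kind of saving the abstract attributes to iterating over time instead of over a fixed batch.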
