Learning Efficient Meshflow and Optical Flow from Event Cameras

Xinglong Luo, Ao Luo, Kunming Luo, Zhengning Wang, Ping Tan, Bing Zeng, Shuaicheng Liu

TL;DR

This work tackles the lack of high-resolution datasets and methods for event-based meshflow estimation. It introduces the High-Resolution Event Meshflow (HREM) dataset and its multi-density extension HREM+, which provide supervision for both meshflow and optical flow from events. The Efficient Event-based MeshFlow (EEMFlow) network delivers state-of-the-art meshflow estimation while running 30x faster than the recent state-of-the-art flow method, and EEMFlow+ extends it to dense optical flow with a Confidence-induced Detail Completion (CDC) module that preserves sharp motion boundaries. An Adaptive Density Module (ADM) adjusts the density of the input events to improve generalization, improving EEMFlow and EEMFlow+ by 8% and 10%, respectively. Together, these contributions advance efficient, density-robust event-based motion estimation for real-time vision tasks.

Abstract

In this paper, we explore event-based meshflow estimation, a novel task that involves predicting a spatially smooth, sparse motion field from event cameras. We first review the state of the art in event-based flow estimation and highlight two key areas for further research: i) the lack of meshflow-specific event datasets and methods, and ii) the underexplored challenge of event data density. To address the first, we generate a large-scale High-Resolution Event Meshflow (HREM) dataset, which offers a high resolution of 1280x720, covers dynamic objects and complex motion patterns, and provides both optical flow and meshflow labels. These aspects have not been fully explored in previous works. We also propose the Efficient Event-based MeshFlow (EEMFlow) network, a lightweight model with a specially crafted encoder-decoder architecture for fast and accurate meshflow estimation. Furthermore, we upgrade the EEMFlow network to support dense event optical flow (EEMFlow+), proposing a Confidence-induced Detail Completion (CDC) module to preserve sharp motion boundaries. Comprehensive experiments show the strong accuracy and runtime efficiency (30x faster) of our EEMFlow model compared to the recent state-of-the-art flow method. As an extension, we expand HREM into HREM+, a multi-density event dataset that supports a thorough study of how robust existing methods are to data of varying densities, and we propose an Adaptive Density Module (ADM) that adjusts the density of input event data to a more suitable range, enhancing the model's generalization ability. We empirically demonstrate that ADM improves the performance of EEMFlow and EEMFlow+ by 8% and 10%, respectively. Code and dataset are released at https://github.com/boomluo02/EEMFlowPlus.

Paper Structure

This paper contains 40 sections, 16 equations, 21 figures, 10 tables.

Figures (21)

  • Figure 1: Line chart of the event data density for four real-world datasets: 240C mueggler2017event, MVSEC zhu2018multivehicle, DSEC gehrig2021dsec, and EventVOT wang2024event. These datasets were captured with four different event cameras; their density ranges differ significantly and show little overlap, and their resolutions also vary.
  • Figure 2: Comparison of computational overhead and accuracy metrics. The x-axis represents inference time, while the y-axis corresponds to the end-point error. The size of each circle indicates the number of model parameters. Lower values for all metrics are considered better.
  • Figure 3: Our data generation pipeline. We generate high-frame-rate video and dense optical flow from a given 3D scene and camera parameters. We then employ three event data simulators to generate events and select as source data the output with the highest contrast of IWEs. Finally, we rasterize and median-filter the dense optical flow to obtain the meshflow ground truth.
  • Figure 4: The process of generating meshflow from dense optical flow. (a) Propagate the motion vector of each grid center to the grid vertices. (b) Apply median filter $f_1$ to multiple motion vectors of each vertex to select the most appropriate motion for that vertex. (c) Use median filtering $f_2$ to smooth the motion field in the mesh grid. For ease of visualization, we present the $8 \times 8$ grid mesh in this paper.
  • Figure 5: Illustration of two dynamic scenes from the HREM+ dataset. The first column shows the reference image, while columns two to five visualize event data at different densities, with density increasing from left to right.
  • ...and 16 more figures
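
The three steps in the Figure 4 caption (propagate cell-center motion to the vertices, median filter $f_1$ per vertex, median filter $f_2$ over the mesh) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `dense_flow_to_meshflow`, the choice to sample one motion vector at each cell center, the per-component median, and the 3x3 window for $f_2$ are all assumptions made for illustration, and the default 8x8 grid only mirrors the visualization size mentioned in the caption.

```python
import numpy as np


def dense_flow_to_meshflow(flow, grid_h=8, grid_w=8, smooth_ksize=3):
    """Hypothetical sketch: dense flow (H, W, 2) -> vertex motions (grid_h+1, grid_w+1, 2).

    Assumes an odd smooth_ksize and per-component medians; the paper's exact
    filters f1 and f2 may differ.
    """
    H, W, _ = flow.shape

    # (a) Sample the motion at each grid-cell center (assumption: one sample per cell).
    ys = ((np.arange(grid_h) + 0.5) * H / grid_h).astype(int)
    xs = ((np.arange(grid_w) + 0.5) * W / grid_w).astype(int)
    centers = flow[np.ix_(ys, xs)]  # (grid_h, grid_w, 2)

    # Propagate each cell's motion to its four corner vertices. NaN padding marks
    # the missing neighbours of border vertices.
    padded = np.full((grid_h + 2, grid_w + 2, 2), np.nan)
    padded[1:-1, 1:-1] = centers
    candidates = np.stack(
        [padded[:-1, :-1], padded[:-1, 1:], padded[1:, :-1], padded[1:, 1:]],
        axis=0,
    )  # (4, grid_h + 1, grid_w + 1, 2)

    # (b) f1: median over the candidate motions collected at each vertex.
    vertices = np.nanmedian(candidates, axis=0)

    # (c) f2: spatial median filter over the vertex grid to smooth the motion field.
    r = smooth_ksize // 2
    vp = np.pad(vertices, ((r, r), (r, r), (0, 0)), mode="edge")
    windows = np.stack(
        [
            vp[dy:dy + grid_h + 1, dx:dx + grid_w + 1]
            for dy in range(smooth_ksize)
            for dx in range(smooth_ksize)
        ],
        axis=0,
    )
    return np.median(windows, axis=0)


if __name__ == "__main__":
    # Uniform motion with one outlier cell; the medians should suppress the outlier.
    flow = np.zeros((64, 64, 2))
    flow[..., 0], flow[..., 1] = 1.5, -2.0
    flow[32:40, 32:40] = 100.0
    print(dense_flow_to_meshflow(flow).shape)  # (9, 9, 2)
```

Medians rather than means are used in both stages so that a single cell dominated by a dynamic object or a flow outlier does not drag the neighbouring vertices with it, which matches the caption's description of selecting "the most appropriate motion" for each vertex.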