Real-time Motion Segmentation with Event-based Normal Flow

Sheng Zhong, Zhongyang Ren, Xiya Zhu, Dehao Yuan, Cornelia Fermuller, Yi Zhou

TL;DR

This work proposes a normal flow-based motion segmentation framework for event-based vision that significantly reduces computational complexity and runs in real time, achieving a nearly 800x speedup over the open-source state-of-the-art method.

Abstract

Event-based cameras are bio-inspired sensors whose pixels respond independently and asynchronously to brightness changes at microsecond resolution, offering the potential to handle visual tasks in challenging scenarios. However, because individual events carry sparse information, directly processing the raw event data is highly inefficient, which severely limits the applicability of state-of-the-art methods to real-time tasks such as motion segmentation, a fundamental task for dynamic scene understanding. Using normal flow as an intermediate representation that compresses the motion information of event clusters within a local region provides a more effective solution. In this work, we propose a normal flow-based motion segmentation framework for event-based vision. Taking as input the dense normal flow learned directly from event neighborhoods, we formulate motion segmentation as an energy minimization problem solved via graph cuts, and optimize it iteratively by alternating between normal flow clustering and motion model fitting. By using a normal flow-based motion model initialization and fitting method, the proposed system efficiently estimates the motion models of independently moving objects with only a limited number of candidate models, which significantly reduces the computational complexity and ensures real-time performance, achieving a nearly 800x speedup over the open-source state-of-the-art method. Extensive evaluations on multiple public datasets demonstrate the accuracy and efficiency of our framework.
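To make the alternation between clustering and model fitting more concrete, the following is a minimal sketch and not the authors' implementation. It assumes the simplest possible motion model, a constant 2D flow u per object, and relies on the standard normal flow constraint n · u = |n|^2, which holds when the normal flow n is the projection of the full flow u onto the local gradient direction. The function names (fit_translation, residual, label_flows, project) are hypothetical, and the labeling step uses only a per-vector data term; the graph-cut energy with a spatial smoothness term described in the abstract is not reproduced here.

```python
import math


def fit_translation(normal_flows):
    """Least-squares fit of a constant flow u = (ux, uy) to normal flow vectors.

    Minimizes sum_i (n_i . u - |n_i|^2)^2, whose normal equations are
    (sum n n^T) u = sum |n|^2 n. Assumption: a purely translational model.
    """
    sxx = sxy = syy = bx = by = 0.0
    for nx, ny in normal_flows:
        m2 = nx * nx + ny * ny
        sxx += nx * nx
        sxy += nx * ny
        syy += ny * ny
        bx += m2 * nx
        by += m2 * ny
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        # All normal flow vectors are (nearly) parallel: the aperture problem.
        raise ValueError("degenerate normal flow directions")
    ux = (syy * bx - sxy * by) / det
    uy = (sxx * by - sxy * bx) / det
    return ux, uy


def residual(model, n):
    """Violation of the normal flow constraint n . u = |n|^2."""
    ux, uy = model
    nx, ny = n
    return abs(nx * ux + ny * uy - (nx * nx + ny * ny))


def label_flows(models, normal_flows):
    """Assign each vector to the model with the smallest residual.

    Data term only; a stand-in for the graph-cut labeling step.
    """
    return [
        min(range(len(models)), key=lambda k: residual(models[k], n))
        for n in normal_flows
    ]


def project(u, angle):
    """Synthesize a normal flow vector: project u onto direction `angle`."""
    dx, dy = math.cos(angle), math.sin(angle)
    s = u[0] * dx + u[1] * dy
    return (s * dx, s * dy)


# Synthetic example: a background motion and one independently moving object.
background = [project((2.0, -1.0), a) for a in (0.1, 0.9, 1.7, 2.5)]
mover = [project((-3.0, 0.5), a) for a in (0.3, 1.1, 2.0, 2.8)]

model_bg = fit_translation(background)
model_imo = fit_translation(mover)
labels = label_flows([model_bg, model_imo], background + mover)
```

In a full pipeline the two steps would alternate: refit each model on the vectors currently assigned to it, then relabel, until the labels stop changing. Because each fit is a 2x2 linear solve over the normal flow vectors, it illustrates why fitting on normal flow can be cheap compared with working on raw events.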

Paper Structure (18 sections, 13 equations, 6 figures, 4 tables)

Figures (6)

  • Figure 1: The proposed system takes the normal flow generated by VecKM_Flow [yuan2024learning] as input and performs motion segmentation in real time based on the normal flow constraint. (a) and (b) show the orientation and magnitude of the normal flow, respectively, with a circular color diagram in (a) indicating the angle-color correspondence. (c) presents the motion segmentation results of our system, with different colors representing distinct motion models. (d) compares the average runtime of our system and EMSGC [zhou2021emsgc] under an identical setup. The mean operating frequency of each method is displayed above the corresponding box, with our system achieving a speedup of nearly 800$\times$ over EMSGC.
  • Figure 2: Flowchart of the proposed system. The proposed system comprises two independently operating modules. The data pre-processing module downsamples the input dense normal flow and constructs a spatial graph via Delaunay triangulation [shewchuk2009general]. The motion segmentation module alternates iteratively between normal flow clustering (Labeling) and motion model fitting to segment the normal flow associated with IMOs.
  • Figure 3: Procedure of motion prediction. (a) Segmentation result at $t-1$, with the black solid box indicating the system-generated region containing an IMO. (b) Normal flow input at $t$, with the blue dashed box indicating the predicted IMO region after motion prediction. Normal flow within this region is used to initialize a candidate motion model. (c) Segmentation result at $t$, where the IMO primarily resides within the predicted box.
  • Figure 4: Segmentation results on the EED dataset [mitrokhin2018iros]. Time runs from left to right. The ground truth bounding boxes are denoted by red rectangles. Because the boxes are manually annotated on the grayscale images and their timestamps cannot be perfectly aligned with the segmentation results, offsets are visible, especially for fast-moving IMOs.
  • Figure 5: Segmentation results on the EVIMO dataset [zhou2021emsgc], on sequences Table (rows 1-2) and Boxed (rows 3-4). Time runs from left to right. The grayscale images are used for visualization only. The color of each label is determined by the number of normal flow vectors associated with it, so the label color of the same IMO may vary over time. The ground truth mask is not perfectly aligned with the IMOs in the grayscale image because of motion blur.
  • ...and 1 more figure