Iterative Event-based Motion Segmentation by Variational Contrast Maximization

Ryo Yamaki, Shintaro Shiba, Guillermo Gallego, Yoshimitsu Aoki

TL;DR

This work tackles motion segmentation for event cameras by introducing an iterative, variational extension of the Contrast Maximization framework. At each step, it estimates a dominant motion and classifies events via the per-event first variation of the CMax loss, recursively handling residual events to uncover multiple motions without heavy initialization. The approach yields sharp, motion-compensated edge-like images and achieves state-of-the-art moving-object detection on benchmarks, including a reported >30% IoU improvement, while remaining applicable to simple and real-world scenes. Although not real-time due to its iterative nature, the method broadens CMax’s applicability to multi-motion scenarios and noisy data, with demonstrated robustness and visualization via the Mean Variation Image.

Abstract

Event cameras provide rich signals that are suitable for motion estimation since they respond to changes in the scene. Because any visual change in the scene produces event data, it is paramount to classify the data into different motions (i.e., motion segmentation), which is useful for various tasks such as object detection and visual servoing. We propose an iterative motion segmentation method that classifies events into background (e.g., dominant motion hypothesis) and foreground (independent motion residuals), thus extending the Contrast Maximization framework. Experimental results demonstrate that the proposed method successfully classifies event clusters for both public and self-recorded datasets, producing sharp, motion-compensated edge-like images. The proposed method achieves state-of-the-art accuracy on moving object detection benchmarks with an improvement of over 30%, and shows potential for application to more complex and noisy real-world scenes. We hope this work broadens the sensitivity of Contrast Maximization with respect to both motion parameters and input events, thus contributing to theoretical advancements in event-based motion segmentation. https://github.com/aoki-media-lab/event_based_segmentation_vcmax

Paper Structure

This paper contains 18 sections, 9 equations, 9 figures, 3 tables, 1 algorithm.

Figures (9)

  • Figure 1: Overview. The proposed method relies on event data and achieves motion segmentation by iteratively running Contrast Maximization with respect to the parameters and with respect to the event data. The variation can be visualized as a Mean Variation Image. As a result, we obtain the segmented event stream and motion-compensated images (i.e., images of warped events). The scene (from Zhou21tnnls) shows two pedestrians walking in a corridor while the camera moves slightly.
  • Figure 2: Block diagram of the proposed method. First, we use Contrast Maximization to estimate the dominant motion in the scene. Using the estimated motion $\boldsymbol{\theta}$, we calculate the first variation of the contrast function with respect to the event coordinates (Eq. \ref{eq:variation}). The magnitude of the first variation can be visualized as a heat map (Eq. \ref{eq:gradientMap}; here, from yellow to blue): the higher it is, the greater the likelihood that the corresponding events do not conform to the estimated motion, i.e., they belong to an independently moving object (IMO). Thresholding the first variation classifies events into "fit" events and "residual" events with respect to the currently estimated motion. The "fit" events are removed (defining a segmented object or "cluster") and the above steps are repeated on the residual events until the final segmentation is obtained.
  • Figure 3: Iterative clustering on self-recorded data. (b)-(c): the mean variation images (MVIs) during two iterations show that the variation (Eq. \ref{eq:gradientMap}) becomes high (i.e., blue) for the remaining IMO (color bar in Fig. \ref{fig:method}). We use these heat map values to produce the segmentation results (d).
  • Figure 4: Segmentation results on our dataset. As baselines, we compare with Zhou21tnnls and Stoffregen19iccv using different initialization strategies.
  • Figure 5: Motion segmentation and moving-object detection results on the EVIMO2 dataset. Instead of estimating the motion parameter $\boldsymbol{\theta}$, we first use the GT depth and IMU to obtain the warp for the static parts of the scene. As the MVI (c) clearly shows, the "residual" events agree with the IMOs. Our segmentation results successfully detect the IMOs (e), compared with the baseline method Stoffregen19iccv.
  • ...and 4 more figures
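
The fit/residual loop described in the Figure 2 caption can be illustrated with a minimal Python sketch on synthetic events. Everything here is an assumption made for illustration and is not taken from the authors' repository: the function names are hypothetical, the motion model is a constant 2D image velocity, Contrast Maximization is replaced by an exhaustive search over a small velocity grid, and the per-event first variation is replaced by a simpler density-based misfit proxy (events that land on sparsely populated pixels of the image of warped events are treated as "residual").

```python
import itertools
import numpy as np

def warp(xy, t, theta):
    """Warp events to t_ref = 0 along a constant image velocity theta (px/s)."""
    return xy - t[:, None] * np.asarray(theta, dtype=float)

def pixel_votes(xy_warped, shape):
    """Nearest-pixel indices of warped events, plus an in-bounds mask."""
    cols = np.rint(xy_warped[:, 0]).astype(int)
    rows = np.rint(xy_warped[:, 1]).astype(int)
    ok = (rows >= 0) & (rows < shape[0]) & (cols >= 0) & (cols < shape[1])
    return rows, cols, ok

def image_of_warped_events(xy, t, theta, shape):
    """Accumulate warped events into a per-pixel count image (IWE)."""
    rows, cols, ok = pixel_votes(warp(xy, t, theta), shape)
    img = np.zeros(shape)
    np.add.at(img, (rows[ok], cols[ok]), 1.0)
    return img

def dominant_motion(xy, t, shape, candidates):
    """Grid-search stand-in for CMax: pick the velocity with the highest IWE variance."""
    scores = [image_of_warped_events(xy, t, th, shape).var() for th in candidates]
    return candidates[int(np.argmax(scores))]

def misfit_score(xy, t, theta, shape):
    """PROXY (assumption), not the paper's first variation:
    1 - normalized IWE count at each event's warped pixel (high = poor fit)."""
    img = image_of_warped_events(xy, t, theta, shape)
    rows, cols, ok = pixel_votes(warp(xy, t, theta), shape)
    score = np.ones(len(t))  # out-of-bounds events count as full misfit
    score[ok] = 1.0 - img[rows[ok], cols[ok]] / max(img.max(), 1.0)
    return score

def iterative_segmentation(xy, t, shape, candidates,
                           threshold=0.5, max_clusters=5, min_events=10):
    """Estimate a dominant motion, peel off the 'fit' events, repeat on the residual."""
    labels = np.full(len(t), -1)        # -1 marks events not assigned to any cluster
    remaining = np.arange(len(t))
    motions = []
    for cluster_id in range(max_clusters):
        if len(remaining) < min_events:
            break
        theta = dominant_motion(xy[remaining], t[remaining], shape, candidates)
        fit = misfit_score(xy[remaining], t[remaining], theta, shape) < threshold
        if not fit.any():
            break
        labels[remaining[fit]] = cluster_id
        motions.append(theta)
        remaining = remaining[~fit]     # residual events go to the next iteration
    return labels, motions

def moving_points(starts, velocity, n=50):
    """Synthetic events: each start point emits n events while moving at `velocity`."""
    t = np.linspace(0.0, 1.0, n)
    v = np.asarray(velocity, dtype=float)
    xy = np.concatenate([np.asarray(s, dtype=float) + t[:, None] * v for s in starts])
    return xy, np.tile(t, len(starts))

# Toy scene: 300 "background" events moving right, 100 "foreground" events moving down.
bg_xy, bg_t = moving_points([(10, 10), (10, 20), (10, 30), (15, 40), (20, 50), (25, 15)], (20, 0))
fg_xy, fg_t = moving_points([(50, 10), (55, 25)], (0, 20))
xy = np.concatenate([bg_xy, fg_xy])
t = np.concatenate([bg_t, fg_t])
candidates = list(itertools.product([-20, -10, 0, 10, 20], repeat=2))

labels, motions = iterative_segmentation(xy, t, (64, 64), candidates)
print(motions)
print(np.bincount(labels))
```

In this toy scene the first iteration picks the background velocity because it explains the most events, and the second iteration runs only on what is left. A real implementation would replace the grid search with a continuous optimizer and the density proxy with the per-event first variation referred to in the caption.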