Moving Object Detection from Moving Camera Using Focus of Expansion Likelihood and Segmentation
Masahiro Ogawa, Qi An, Atsushi Yamashita
TL;DR
This work tackles moving object detection from a moving camera by combining optical flow with segmentation through Focus of Expansion (FoE) analysis. The proposed method, Focus of Expansion Likelihood and Segmentation (FoELS), probabilistically fuses an FoE-derived moving likelihood with panoptic segmentation priors. It adds a flow-length term to handle parallel motion and a macroscopic object-level refinement to recover entire objects. The approach uses UniMatch for dense flow and OneFormer for panoptic segmentation, with the FoE computed via RANSAC and enriched by angular and length-based cues. It achieves state-of-the-art IoU on DAVIS 2016 and FBMS-59 while remaining robust to camera motion, rotation, and zoom. Despite strong performance, FoELS is computationally intensive and temporally unstable, motivating future work on efficiency and tracking enhancements.
Abstract
Separating moving and static objects from a moving camera viewpoint is essential for 3D reconstruction, autonomous navigation, and scene understanding in robotics. Existing approaches often rely primarily on optical flow, which struggles to detect moving objects in complex, structured scenes involving camera motion. To address this limitation, we propose Focus of Expansion Likelihood and Segmentation (FoELS), a method based on the core idea of integrating both optical flow and texture information. FoELS computes the focus of expansion (FoE) from optical flow and derives an initial motion likelihood from the outliers of the FoE computation. This likelihood is then fused with a segmentation-based prior to estimate the final moving probability. The method effectively handles challenges including complex structured scenes, rotational camera motion, and parallel motion. Comprehensive evaluations on the DAVIS 2016 dataset and real-world traffic videos demonstrate its effectiveness and state-of-the-art performance.
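To make the pipeline described above more concrete, the following is a minimal sketch, not the authors' implementation. It assumes sparse flow samples given as NumPy arrays, a camera moving forward so that static-scene flow points away from the FoE, a Gaussian-shaped mapping from angular deviation to likelihood, and a simple Bayesian fusion rule. The function names, the `sigma` and `thresh` parameters, and the fusion formula are all hypothetical choices made for illustration; the flow-length term and the object-level refinement are omitted.

```python
import numpy as np

def line_intersection(p1, v1, p2, v2):
    """Intersect the lines p1 + s*v1 and p2 + t*v2; None if near-parallel."""
    A = np.array([[v1[0], -v2[0]], [v1[1], -v2[1]]], dtype=float)
    if abs(np.linalg.det(A)) < 1e-9:
        return None
    s, _ = np.linalg.solve(A, p2 - p1)
    return p1 + s * v1

def angular_deviation(points, flows, foe):
    """Angle in [0, pi] between each flow vector and the ray from the FoE."""
    d = points - foe
    cross = d[:, 0] * flows[:, 1] - d[:, 1] * flows[:, 0]
    dot = np.sum(d * flows, axis=1)
    return np.arctan2(np.abs(cross), dot)

def estimate_foe_ransac(points, flows, n_iters=200,
                        thresh=np.deg2rad(5.0), seed=0):
    """RANSAC over pairs of flow vectors: each pair proposes an FoE candidate
    at the intersection of its two flow lines (hypothetical parameters)."""
    rng = np.random.default_rng(seed)
    best_foe, best_count = None, -1
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        foe = line_intersection(points[i], flows[i], points[j], flows[j])
        if foe is None:
            continue
        count = int(np.sum(angular_deviation(points, flows, foe) < thresh))
        if count > best_count:
            best_foe, best_count = foe, count
    return best_foe

def moving_likelihood(points, flows, foe, sigma=np.deg2rad(10.0)):
    """Assumed mapping: large angular deviation from the FoE ray -> near 1."""
    angle = angular_deviation(points, flows, foe)
    return 1.0 - np.exp(-0.5 * (angle / sigma) ** 2)

def fuse_with_prior(likelihood, prior, eps=1e-6):
    """Assumed Bayesian fusion of the flow-based likelihood with a
    segmentation-based prior (e.g. per-class chance of being a mover)."""
    l = np.clip(likelihood, eps, 1.0 - eps)
    p = np.clip(prior, eps, 1.0 - eps)
    return l * p / (l * p + (1.0 - l) * (1.0 - p))
```

In this sketch, a point whose flow happens to be aligned with its FoE ray receives a low likelihood even if it belongs to a moving object, which is the parallel-motion case that the flow-length term in FoELS is meant to address.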
