Camera Motion Estimation from RGB-D-Inertial Scene Flow
Samuel Cerezo, Javier Civera
TL;DR
This work introduces a tightly coupled RGB-D–inertial scene flow framework for camera motion estimation in rigid environments, leveraging pre-integrated IMU residuals and depth-based velocity constraints within a sliding-window optimization. By jointly minimizing visual and inertial residuals and employing marginalization to retain information from past frames, the method achieves higher accuracy and robustness than RGB-D-only approaches, as demonstrated on synthetic ICL-NUIM and real OpenLORIS-Scene data. The key contributions are the integration of inertial data into a dense RGB-D flow odometry formulation, the use of gravity direction on $S^2$ for stable state representation, and a practical marginalization strategy that preserves past information while keeping the optimization tractable. Overall, the approach provides improved camera motion estimates and IMU state tracking, with potential benefits for indoor robotics and AR applications where multi-sensor fusion enhances reliability.
Abstract
In this paper, we introduce a novel formulation for camera motion estimation that integrates RGB-D images and inertial data through scene flow. Our goal is to accurately estimate the camera motion in a rigid 3D environment, along with the state of the inertial measurement unit (IMU). Our proposed method can operate either as a multi-frame optimization or with older data marginalized, so that past measurements are still used effectively. We evaluate our method on both synthetic data from the ICL-NUIM dataset and real data sequences from the OpenLORIS-Scene dataset. Our results show that fusing the RGB-D and inertial measurements improves the accuracy of camera motion estimation compared to using only visual data.
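To illustrate what marginalizing older data means in a sliding-window optimization, the following minimal sketch applies the standard Schur complement to a toy problem with two scalar states. It is not the authors' implementation: the function names (`marginalize_oldest`, `solve_2x2`), the scalar states, and the weights are all hypothetical and chosen only to keep the arithmetic easy to follow.

```python
# Toy sketch (assumed setup, not the paper's code): marginalize the oldest
# state of a two-state Gauss-Newton system with the Schur complement.

def marginalize_oldest(H, b):
    """Marginalize the first scalar state of the system H @ dx = b.

    H is a 2x2 symmetric information matrix given as nested lists and b is
    a list of two floats. Returns (H_prior, b_prior), a scalar prior on the
    remaining state that keeps the information carried by the removed one.
    """
    (h00, h01), (h10, h11) = H
    b0, b1 = b
    if h00 <= 0.0:
        raise ValueError("oldest state must be constrained (H00 > 0)")
    h_prior = h11 - h10 * h01 / h00   # Schur complement of H00 in H
    b_prior = b1 - h10 * b0 / h00     # matching right-hand side
    return h_prior, b_prior


def solve_2x2(H, b):
    """Solve the full 2x2 system by Cramer's rule (for comparison only)."""
    (h00, h01), (h10, h11) = H
    det = h00 * h11 - h01 * h10
    return ((h11 * b[0] - h01 * b[1]) / det,
            (h00 * b[1] - h10 * b[0]) / det)


# Hypothetical factors, linearized at x0 = x1 = 0:
#   prior:    x0 ~ 0        with weight w_prior
#   odometry: x1 - x0 ~ 1   with weight w_odom
w_prior, w_odom = 4.0, 2.0
H = [[w_prior + w_odom, -w_odom],
     [-w_odom, w_odom]]
b = [-w_odom, w_odom]

h_prior, b_prior = marginalize_oldest(H, b)
x1_marginal = b_prior / h_prior       # estimate using only the new prior
x0_full, x1_full = solve_2x2(H, b)    # estimate from the full window
```

In this toy case the estimate of the newer state obtained from the marginal prior coincides with the one obtained by solving the full two-state system, which is the property that lets a sliding window drop old states without discarding their information. In a real system the states are vector-valued and the same formulas apply blockwise.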
