Exploiting Motion Prior for Accurate Pose Estimation of Dashboard Cameras
Yipeng Lu, Yifan Zhao, Haiping Wang, Zhiwei Ruan, Yuan Liu, Zhen Dong, Bisheng Yang
TL;DR
Dashcam pose estimation is hindered by low-quality imagery and missing positioning sensors. We propose a motion-prior-based framework with three components: a motion-prior regression module that outputs a coarse relative pose $\mathbf{R}_c,\mathbf{t}_c$, an epipolar-line encoded matcher, and a RANSAC-style pose estimator with a motion-prior-aware scoring network. Trained on KITTI and evaluated on nuScenes and RealDashCam, the method improves pose accuracy over the baseline by approximately $22\%$ in $AUC_{5^\circ}$ and allows about $19\%$ more images to be reconstructed in SfM, with lower reprojection error. Ablations confirm the benefit of each of motion-prior regression, epipolar encoding, and prior-informed sampling, and the reported runtime suggests the method is practical for map-production pipelines. Overall, the approach yields robust dashcam pose estimates that are accurate enough to support high-definition map updates and other geo-information tasks.
Abstract
Dashboard cameras (dashcams) record millions of driving videos daily, offering a potentially valuable data source for various applications, including the production and updating of driving maps. A necessary step in using dashcam data is estimating the camera poses. However, the low-quality images captured by dashcams, characterized by motion blur and dynamic objects, make it difficult for existing image-matching methods to estimate camera poses accurately. In this study, we propose a precise pose estimation method for dashcam images that leverages the inherent camera motion prior. Image sequences captured by dashcams typically exhibit a pronounced motion prior, such as forward movement or lateral turns, which serves as an essential cue for correspondence estimation. Building on this observation, we devise a pose regression module that learns the camera motion prior, and we integrate this prior into both correspondence estimation and pose estimation. Experiments show that, on a real dashcam dataset, our method is 22% better than the baseline for pose estimation in $AUC_{5^\circ}$, and that it can estimate poses for 19% more images with lower reprojection error in Structure from Motion (SfM).
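
To illustrate why a coarse motion prior helps correspondence estimation, the following minimal sketch shows how a relative pose constrains where a match can lie: the pose defines an essential matrix, and each point in the first image maps to an epipolar line in the second. This is standard two-view geometry, not the paper's learned matcher. The function names, the intrinsics `K`, and the convention $\mathbf{X}_2 = \mathbf{R}\mathbf{X}_1 + \mathbf{t}$ are assumptions made for the example.

```python
import numpy as np


def skew(t):
    """Return the 3x3 cross-product matrix [t]_x."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])


def essential_from_pose(R, t):
    """Essential matrix E = [t]_x R, assuming X2 = R @ X1 + t."""
    return skew(t) @ R


def epipolar_distance(E, K, pts1, pts2):
    """Pixel distance from each point in image 2 to the epipolar line
    induced by its candidate match in image 1 (same K for both views)."""
    K_inv = np.linalg.inv(K)
    F = K_inv.T @ E @ K_inv                      # fundamental matrix
    ones = np.ones((len(pts1), 1))
    h1 = np.hstack([pts1, ones])                 # homogeneous pixels, image 1
    h2 = np.hstack([pts2, ones])                 # homogeneous pixels, image 2
    lines = h1 @ F.T                             # row i is the line F @ x1_i
    num = np.abs(np.sum(lines * h2, axis=1))
    den = np.linalg.norm(lines[:, :2], axis=1)
    return num / den


# Hypothetical forward-motion prior: no rotation, camera advances along +z.
R_c = np.eye(3)
t_c = np.array([0.0, 0.0, -1.0])
K = np.array([[700.0, 0.0, 640.0],
              [0.0, 700.0, 360.0],
              [0.0, 0.0, 1.0]])

# One 3D point seen in both views, plus a corrupted copy of its match.
X1 = np.array([2.0, 1.0, 10.0])
X2 = R_c @ X1 + t_c
x1 = (K @ X1)[:2] / X1[2]
x2 = (K @ X2)[:2] / X2[2]
pts1 = np.array([x1, x1])
pts2 = np.array([x2, x2 + np.array([0.0, 20.0])])

d = epipolar_distance(essential_from_pose(R_c, t_c), K, pts1, pts2)
print(d)  # first entry is near zero, second is clearly larger
```

In this toy setting the consistent match lies on its epipolar line while the shifted one does not, so even a rough pose estimate could be used to down-weight implausible correspondences before robust estimation.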
