Exploiting Motion Prior for Accurate Pose Estimation of Dashboard Cameras

Yipeng Lu, Yifan Zhao, Haiping Wang, Zhiwei Ruan, Yuan Liu, Zhen Dong, Bisheng Yang

TL;DR

Dashcam pose estimation is hindered by low-quality imagery and missing positioning sensors. We propose a motion-prior-based framework with three components: a motion-prior regression module that outputs a coarse relative pose $(\mathbf{R}_c,\mathbf{t}_c)$, an epipolar-line-encoded matcher, and a RANSAC-style pose estimator with a motion-prior-aware scoring network. Trained on KITTI and evaluated on nuScenes and RealDashCam, the method improves pose accuracy by approximately $22\%$ at $\mathrm{AUC}@5^\circ$ and allows about $19\%$ more images to be reconstructed, with lower reprojection error, in SfM. Ablations confirm the benefits of motion-prior regression, epipolar encoding, and prior-informed sampling, and the reported runtime indicates that the method is practical for map-production pipelines. Overall, the approach offers robust dashcam pose estimates that are accurate enough to support high-definition map updates and geo-information tasks.
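To make the role of the coarse pose concrete, the sketch below shows the standard epipolar constraint that any relative pose induces on correspondences. It is not the paper's Epipolar Line Encoding, which operates on learned features. The helper names (`skew`, `essential_from_pose`, `epipolar_distance`) are hypothetical, and the code assumes normalized image coordinates (intrinsics already removed) and the convention $\mathbf{X}_2 = \mathbf{R}\mathbf{X}_1 + \mathbf{t}$.

```python
import math

def skew(t):
    """Cross-product matrix [t]_x as a 3x3 nested list."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(a, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def essential_from_pose(R, t):
    """Essential matrix E = [t]_x R for the convention X2 = R X1 + t."""
    return matmul(skew(t), R)

def epipolar_distance(E, x1, x2):
    """Distance from x2 (image 2) to the epipolar line of x1 (image 1).

    Both points are (x, y) pairs in normalized image coordinates.
    The line is undefined when x1 lies on the epipole; this sketch
    does not handle that degenerate case.
    """
    a, b, c = matvec(E, [x1[0], x1[1], 1.0])
    return abs(a * x2[0] + b * x2[1] + c) / math.hypot(a, b)

# Illustrative coarse pose: no rotation, translation along the optical axis.
R_c = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_c = [0.0, 0.0, 1.0]
E_c = essential_from_pose(R_c, t_c)

# A 3D point seen in both views, projected to normalized coordinates.
X1 = [0.5, 1.0, 5.0]
X2 = [X1[0] + t_c[0], X1[1] + t_c[1], X1[2] + t_c[2]]
x1 = (X1[0] / X1[2], X1[1] / X1[2])
x2_true = (X2[0] / X2[2], X2[1] / X2[2])
x2_wrong = (0.2, 0.1)

d_true = epipolar_distance(E_c, x1, x2_true)
d_wrong = epipolar_distance(E_c, x1, x2_wrong)
```

A true correspondence lies on its epipolar line, so `d_true` is numerically zero, while the mismatched point gives a clearly larger `d_wrong`. When the pose is only coarse, the distance is better treated as a soft cue than as a hard filter.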

Abstract

Dashboard cameras (dashcams) record millions of driving videos daily, offering a valuable potential data source for applications such as driving-map production and updates. A necessary step in using dashcam data is estimating the camera poses. However, the low-quality images captured by dashcams, characterized by motion blur and dynamic objects, make it difficult for existing image-matching methods to estimate camera poses accurately. In this study, we propose a precise pose estimation method for dashcam images that leverages the inherent camera motion prior. Image sequences captured by dashcams typically exhibit a pronounced motion prior, such as forward movement or lateral turns, which serves as an essential cue for correspondence estimation. Building on this observation, we devise a pose regression module that learns the camera motion prior, and we then integrate this prior into both correspondence estimation and pose estimation. Experiments show that, on a real dashcam dataset, our method is 22% better than the baseline for pose estimation in AUC@5°, and it can estimate poses for 19% more images with lower reprojection error in Structure from Motion (SfM).
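For readers unfamiliar with the AUC@5° metric, the sketch below follows the convention commonly used in the image-matching literature: the area under the cumulative pose-error curve up to a threshold, normalized by that threshold. The paper's exact evaluation protocol is not given here, so this is an assumption. The function name `pose_auc` is illustrative, and the per-pair errors are assumed to be precomputed in degrees (often the larger of the rotation and translation angular errors).

```python
from bisect import bisect_left

def pose_auc(errors_deg, threshold_deg):
    """Normalized area under the cumulative error curve up to threshold_deg.

    errors_deg: per-image-pair pose errors in degrees.
    Returns a value in [0, 1]; 1.0 means every pair has zero error.
    """
    if not errors_deg:
        raise ValueError("need at least one pose error")
    errs = sorted(errors_deg)
    n = len(errs)
    # Recall curve: fraction of pairs with error <= e, starting at (0, 0).
    xs = [0.0] + errs
    ys = [0.0] + [(i + 1) / n for i in range(n)]
    # Keep the points strictly below the threshold, then close the curve
    # at the threshold with the last recall value reached.
    last = bisect_left(xs, threshold_deg)
    xs = xs[:last] + [threshold_deg]
    ys = ys[:last] + [ys[last - 1]]
    # Trapezoidal integration, normalized by the threshold.
    area = sum((xs[i + 1] - xs[i]) * (ys[i + 1] + ys[i]) / 2.0
               for i in range(len(xs) - 1))
    return area / threshold_deg

# Three image pairs: two accurate poses and one failure.
auc5 = pose_auc([1.0, 2.0, 10.0], 5.0)
```

The metric rewards small errors more than errors near the threshold, and pairs with error above the threshold contribute nothing, which is why it is sensitive to outright matching failures on blurry frames.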

Paper Structure (21 sections, 6 equations, 9 figures, 4 tables)

Figures (9)

  • Figure 1: Dashcam images are often low-resolution, with motion blur and dynamic objects. (a) This makes existing image-matching methods struggle to estimate camera poses correctly. (b) In this paper, we propose to exploit the camera motion prior to restrict the correspondences so that they approximately conform to the coarsely estimated camera motion. (c) With the help of the coarse camera motion, our method estimates correspondences accurately, which in turn yields accurate pose estimation.
  • Figure 2: Overview of the proposed method. The motion prior regression module (1) first estimates a coarse relative pose of the input image pair by leveraging the motion prior. The estimated coarse relative pose is then incorporated into correspondence estimation (2) and pose estimation (3) to obtain a more accurate camera pose.
  • Figure 3: The motion prior regression module estimates a coarse camera pose for the input image pair.
  • Figure 4: Correspondence estimation with motion prior. The proposed Epipolar Line Encoding encodes the coarse camera pose into the features extracted from the image pair. The features are then fed to several interleaved cross- and self-attention layers for feature updating and correspondence estimation.
  • Figure 5: Pose estimation with motion prior. The estimated motion prior and the inlier distribution are used when scoring the camera pose hypotheses in RANSAC.
  • ...and 4 more figures
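
The idea in the Figure 5 caption, scoring RANSAC hypotheses with the motion prior as well as the inliers, can be illustrated with a hand-crafted stand-in. The paper uses a learned scoring network, so the sketch below is not the authors' method. The function names, the Gaussian weighting, and the `sigma_deg` parameter are all assumptions made for illustration.

```python
import math

def rotation_angle_deg(Ra, Rb):
    """Geodesic angle in degrees between two 3x3 rotation matrices."""
    # trace(Ra^T Rb) equals the sum of elementwise products.
    tr = sum(Ra[i][j] * Rb[i][j] for i in range(3) for j in range(3))
    cos = max(-1.0, min(1.0, (tr - 1.0) / 2.0))
    return math.degrees(math.acos(cos))

def direction_angle_deg(ta, tb):
    """Angle in degrees between two translation directions (scale-free)."""
    dot = sum(a * b for a, b in zip(ta, tb))
    na = math.sqrt(sum(a * a for a in ta))
    nb = math.sqrt(sum(b * b for b in tb))
    cos = max(-1.0, min(1.0, dot / (na * nb)))
    return math.degrees(math.acos(cos))

def prior_aware_score(num_inliers, R_hyp, t_hyp, R_prior, t_prior,
                      sigma_deg=10.0):
    """Hypothetical score: inlier count down-weighted by prior deviation.

    sigma_deg is an assumed tolerance on how far a hypothesis may drift
    from the coarse pose before its score is strongly penalized.
    """
    dev = max(rotation_angle_deg(R_hyp, R_prior),
              direction_angle_deg(t_hyp, t_prior))
    return num_inliers * math.exp(-(dev / sigma_deg) ** 2)

# Coarse prior: straight forward motion without rotation.
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
forward = [0.0, 0.0, 1.0]

# Hypothesis B rotates 30 degrees about the y axis, far from the prior.
c, s = math.cos(math.radians(30.0)), math.sin(math.radians(30.0))
Ry30 = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

score_a = prior_aware_score(80, I3, forward, I3, forward)
score_b = prior_aware_score(100, Ry30, forward, I3, forward)
```

Here hypothesis B has more inliers but disagrees strongly with the prior, so it scores lower than hypothesis A. This is the kind of behavior that helps when dynamic objects such as moving cars produce a large but misleading inlier set.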