Generalized Maximum Likelihood Estimation for Perspective-n-Point Problem
Tian Zhan, Chunfeng Xu, Cheng Zhang, Ke Zhu
TL;DR
This work tackles the PnP problem under realistic anisotropic observation uncertainty by introducing GMLPnP, a generalized maximum likelihood solver that jointly estimates pose and observation covariance via an iterated generalized least squares (GLS) procedure and a determinant-criterion objective. The method is decoupled from the camera model, enabling application to omnidirectional cameras, and includes a theoretical discussion of consistency and convergence. Empirical results on synthetic data and real datasets (TUM-RGBD, KITTI-360) show clear improvements in rotation and translation accuracy, especially under high noise, and demonstrate practical viability in UAV vision-based localization. The approach offers a principled way to incorporate anisotropic uncertainty into PnP, with demonstrated gains in accuracy and robustness for real-world, cross-domain visual localization tasks.
Abstract
The Perspective-n-Point (PnP) problem has been widely studied in the literature and applied in various vision-based pose estimation scenarios. However, existing methods ignore the anisotropic uncertainty of observations, which this paper demonstrates on several real-world datasets. This oversight may lead to suboptimal and inaccurate estimation, particularly in the presence of noisy observations. To this end, we propose a generalized maximum likelihood PnP solver, named GMLPnP, that minimizes the determinant criterion by iterating the generalized least squares (GLS) procedure to estimate the pose and uncertainty simultaneously. Further, the proposed method is decoupled from the camera model. Results of synthetic and real experiments show that our method achieves better accuracy in common pose estimation scenarios: GMLPnP improves rotation/translation accuracy by 4.7%/2.0% on the TUM-RGBD dataset and 18.6%/18.4% on the KITTI-360 dataset compared to the best baseline. It is also more accurate under very noisy observations in a vision-based UAV localization task, outperforming the best baseline by 34.4% in translation estimation accuracy.
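To make the idea of iterating GLS to estimate parameters and noise covariance together more concrete, the following is a minimal sketch on a toy linear measurement model, not the PnP residual and not the authors' implementation. It assumes observations y_i = A_i θ + e_i with e_i drawn from a zero-mean Gaussian with unknown anisotropic covariance Σ. The function name `iterated_gls`, the array shapes, and the synthetic data are all illustrative assumptions. Each pass solves a GLS problem with the current Σ, then re-estimates Σ from the residuals; this alternation does not increase the determinant of the residual covariance, which is the sense in which it targets a determinant criterion in this toy setting.

```python
import numpy as np

def iterated_gls(A, y, n_iter=50, tol=1e-10):
    """Toy iterated GLS (hypothetical helper, linear model only).

    A: (n, d, p) stacked design blocks, y: (n, d) observations.
    Returns the parameter estimate theta (p,) and covariance Sigma (d, d).
    """
    n, d, p = A.shape
    Sigma = np.eye(d)        # isotropic start: first pass is ordinary least squares
    theta = np.zeros(p)
    for _ in range(n_iter):
        W = np.linalg.inv(Sigma)
        # GLS normal equations: (sum_i A_i^T W A_i) theta = sum_i A_i^T W y_i
        H = np.einsum('nij,ik,nkl->jl', A, W, A)
        g = np.einsum('nij,ik,nk->j', A, W, y)
        theta_new = np.linalg.solve(H, g)
        # Re-estimate the covariance from the residuals at the new parameters
        r = y - np.einsum('nij,j->ni', A, theta_new)
        Sigma = r.T @ r / n
        converged = np.linalg.norm(theta_new - theta) < tol
        theta = theta_new
        if converged:
            break
    return theta, Sigma

# Synthetic example with anisotropic, correlated 2D noise (assumed values).
rng = np.random.default_rng(0)
n, d, p = 2000, 2, 3
theta_true = np.array([1.0, -2.0, 0.5])
Sigma_true = np.array([[4.0, 1.5],
                       [1.5, 1.0]])
A = rng.normal(size=(n, d, p))
noise = rng.multivariate_normal(np.zeros(d), Sigma_true, size=n)
y = np.einsum('nij,j->ni', A, theta_true) + noise

theta_hat, Sigma_hat = iterated_gls(A, y)
```

In an actual PnP setting the model is nonlinear in the pose, so each GLS step would itself be an iterative nonlinear least squares solve over rotation and translation; the sketch only illustrates the alternation between parameter and covariance estimates.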
