
Generalized Maximum Likelihood Estimation for Perspective-n-Point Problem

Tian Zhan, Chunfeng Xu, Cheng Zhang, Ke Zhu

TL;DR

This work tackles the PnP problem under realistic anisotropic observation uncertainty by introducing GMLPnP, a generalized maximum likelihood solver that jointly estimates pose and observation covariance via an iterated generalized least squares (GLS) procedure and a determinant-criterion objective. The method is decoupled from the camera model, enabling application to omnidirectional cameras, and includes a theoretical discussion of consistency and convergence. Empirical results on synthetic data and real datasets (TUM-RGBD, KITTI-360) show improvements in rotation and translation accuracy, especially under high noise, and demonstrate practical viability in vision-based UAV localization. The approach offers a principled way to incorporate anisotropic uncertainty into PnP.

Abstract

The Perspective-n-Point (PnP) problem has been widely studied in the literature and applied in various vision-based pose estimation scenarios. However, existing methods ignore the anisotropic uncertainty of observations, as demonstrated on several real-world datasets in this paper. This oversight may lead to suboptimal and inaccurate estimation, particularly in the presence of noisy observations. To this end, we propose a generalized maximum likelihood PnP solver, named GMLPnP, that minimizes the determinant criterion by iterating the generalized least squares (GLS) procedure to estimate the pose and uncertainty simultaneously. Further, the proposed method is decoupled from the camera model. Results of synthetic and real experiments show that our method achieves better accuracy in common pose estimation scenarios: GMLPnP improves rotation/translation accuracy by 4.7%/2.0% on the TUM-RGBD dataset and 18.6%/18.4% on the KITTI-360 dataset compared to the best baseline. It is also more accurate under very noisy observations in a vision-based UAV localization task, outperforming the best baseline by 34.4% in translation estimation accuracy.
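The alternation described in the abstract can be illustrated on a toy problem. The sketch below is not the authors' implementation: it swaps the pose-dependent PnP residual for a hypothetical linear model $\mathbf{y}_i = A_i\theta + \epsilon_i$ with unknown anisotropic noise covariance, so that the GLS step has a closed form. The names `iterated_gls`, `A`, `y`, and `theta` are assumptions made for illustration. Each pass solves a GLS problem under the current covariance, re-estimates the covariance from the residuals, and records $\log\det\hat{\Sigma}$, the determinant criterion.

```python
import numpy as np


def iterated_gls(A, y, n_iter=20, tol=1e-10):
    """Alternate GLS parameter updates with residual-covariance updates.

    Toy stand-in for a pose solver (assumption): a linear model y_i = A_i @ theta + noise.
    A: (n, d, k) design matrices, y: (n, d) observations.
    Returns (theta, Sigma, history), where history holds log det(Sigma) per iteration.
    """
    n, d, k = A.shape
    Sigma = np.eye(d)  # isotropic start, so the first pass is ordinary least squares
    theta = np.zeros(k)
    history = []
    for _ in range(n_iter):
        W = np.linalg.inv(Sigma)
        # GLS normal equations: (sum_i A_i^T W A_i) theta = sum_i A_i^T W y_i
        H = np.einsum("nda,de,neb->ab", A, W, A)
        g = np.einsum("nda,de,ne->a", A, W, y)
        theta = np.linalg.solve(H, g)
        # Maximum likelihood covariance estimate for the current residuals
        r = y - np.einsum("ndk,k->nd", A, theta)
        Sigma = r.T @ r / n
        # Determinant criterion, evaluated as log det for numerical stability
        history.append(np.linalg.slogdet(Sigma)[1])
        if len(history) > 1 and abs(history[-2] - history[-1]) < tol:
            break
    return theta, Sigma, history


# Synthetic data with strongly anisotropic noise (all values are illustrative)
rng = np.random.default_rng(0)
n, d, k = 500, 3, 4
theta_true = np.array([1.0, -2.0, 0.5, 3.0])
A = rng.normal(size=(n, d, k))
Sigma_true = np.diag([0.01, 0.04, 1.0])
noise = rng.multivariate_normal(np.zeros(d), Sigma_true, size=n)
y = np.einsum("ndk,k->nd", A, theta_true) + noise

theta, Sigma, history = iterated_gls(A, y)
print("estimated theta:", theta)
print("estimated noise variances:", np.diag(Sigma))
```

Because each half-step minimizes the same Gaussian negative log-likelihood over one block of variables, the recorded criterion cannot increase from one pass to the next. In the actual pose problem the GLS step would presumably be a nonlinear solve over rotation and translation rather than a linear one; only the alternating structure carries over from this toy.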

Paper Structure (21 sections, 1 theorem, 12 equations, 10 figures, 3 tables, 1 algorithm)

Key Result

Proposition 1

The maximum likelihood estimate of the transformation $\mathbf{R}, \mathbf{t}$ in object space is given by minimizing an error function expressed in the Mahalanobis norm $\lVert\cdot\rVert_{\Sigma}$, where $\Sigma$ is the covariance matrix of the known noise distribution.
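
As a reminder of the notation, the squared Mahalanobis norm weights a residual by the inverse covariance. The residual symbol $\mathbf{e}$ below is a placeholder rather than the paper's own notation:

$$
\lVert \mathbf{e} \rVert_{\Sigma}^{2} = \mathbf{e}^{\top} \Sigma^{-1} \mathbf{e}
$$

For isotropic noise, $\Sigma = \sigma^{2} I$, this reduces to $\lVert \mathbf{e} \rVert^{2} / \sigma^{2}$, so an ordinary least-squares cost is the special case in which every direction is trusted equally.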

Figures (10)

  • Figure 1: We formulate the model in object space with projection rays, which can cope with perspective and omnidirectional camera models. The blue ellipse cloud visualizes the uncertainty.
  • Figure 2: Estimation error vs. number of points. The number of points ranges from $20$ to $200$, with object point noise of $0.1$ meters and an image point noise standard deviation of $1$ pixel.
  • Figure 3: Estimation error vs. noise standard deviation. The number of points is fixed at $50$; the object point noise increases from $0.02$ to $0.5$ meters, and the image point noise correspondingly varies from $0.2$ to $5$ pixels.
  • Figure 4: Execution time comparison. Methods implemented in C++ are included.
  • Figure 5: GMLPnP initialized with the ground truth plus a random offset (rotation and translation offsets are added simultaneously) under different levels of observation noise. The number of points is $200$.
  • ...and 5 more figures

Theorems & Definitions (2)

  • Proposition 1
  • proof