Practical solutions to the relative pose of three calibrated cameras

Charalambos Tzamos, Viktor Kocur, Yaqing Ding, Daniel Barath, Zuzana Berger Haladova, Torsten Sattler, Zuzana Kukelova

TL;DR

Estimating the relative pose of three calibrated cameras from four point correspondences is highly challenging due to complex minimal configurations. The authors introduce two practical approximate solvers, 4p3v(A) and 4p3v(M), which first infer an approximate two-view geometry from four correspondences and then register the third view with a P3P solver, enabling efficient RANSAC-based estimation. They further enhance robustness with ENM refitting, mean-point delta sampling, filtering, and Levenberg–Marquardt refinement, achieving state-of-the-art accuracy on real data. The approach is simple to implement using existing solvers and demonstrates strong robustness across diverse scenes and RANSAC configurations, making it well-suited for real-world three-view camera geometry estimation.

Abstract

We study the challenging problem of estimating the relative pose of three calibrated cameras from four point correspondences. We propose novel efficient solutions to this problem that are based on the simple idea of using four correspondences to estimate an approximate geometry of the first two views. We model this geometry either as an affine or a fully perspective geometry estimated using one additional approximate correspondence. We generate such an approximate correspondence using a very simple and efficient strategy, where the new point is the mean point of three corresponding input points. The new solvers are efficient and easy to implement, since they are based on existing efficient minimal solvers, i.e., the 4-point affine fundamental matrix, the well-known 5-point relative pose solver, and the P3P solver. Extensive experiments on real data show that the proposed solvers, when properly coupled with local optimization, achieve state-of-the-art results, with the novel solver based on approximate mean-point correspondences being more robust and accurate than the affine-based solver.
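The mean-point idea in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names `mean_point` and `add_mean_point_correspondence` are hypothetical, and the choice of which three of the four correspondences to average (the `triple` argument) is an assumption, since the abstract does not specify it. The resulting five point pairs would then be passed to an existing 5-point relative pose solver, and the third view registered with a P3P solver; neither solver is shown here.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def mean_point(points: Sequence[Point]) -> Point:
    """Return the arithmetic mean of a set of 2D image points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def add_mean_point_correspondence(
    pts1: Sequence[Point],
    pts2: Sequence[Point],
    triple: Tuple[int, int, int] = (0, 1, 2),  # assumption: which three points to average
) -> Tuple[List[Point], List[Point]]:
    """Append an approximate fifth correspondence to four given ones.

    The new point in each view is the mean of the same three input points,
    so the pair (m1, m2) is only an approximate correspondence.
    """
    m1 = mean_point([pts1[i] for i in triple])
    m2 = mean_point([pts2[i] for i in triple])
    return list(pts1) + [m1], list(pts2) + [m2]


# Four correspondences between view 1 and view 2 (made-up coordinates).
view1 = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0), (5.0, 5.0)]
view2 = [(1.0, 1.0), (4.0, 1.0), (1.0, 4.0), (6.0, 6.0)]
five1, five2 = add_mean_point_correspondence(view1, view2)
```

The fifth pair is exact only in special configurations, which is why the abstract describes the resulting two-view geometry as approximate and couples the solvers with local optimization.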

Paper Structure (22 sections, 1 theorem, 4 equations, 19 figures, 7 tables)

This paper contains 22 sections, 1 theorem, 4 equations, 19 figures, 7 tables.

Key Result

Lemma 1

Consider two cameras with camera centers $\mathbf{C}^1$ and $\mathbf{C}^2$ that observe the 3D points $X_i$, $X_j$, and $X_k$ (see the accompanying illustration). Let $\left\{\mathbf{x}_i^1,\mathbf{x}_j^1,\mathbf{x}_k^1\right\}$ and $\left\{\mathbf{x}_i^2,\mathbf{x}_j^2,\mathbf{x}_k^2\right\}$ ...

Figures (19)

  • Figure 1: Visualization of the four-points-in-three-views (4p3v) problem and our solution, which uses four correspondences to efficiently estimate an approximate geometry of the first two views and then registers the third view using a P3P solver [lambda-twist].
  • Figure 2: Results of a synthetic experiment measuring the accuracy of two-view variants of our solvers depending on the angle between the principal axes of the cameras.
  • Figure 3: Distribution of (left) the rotation error, with mean (0.3373, 0.3349), and (right) the percentage of inliers gathered, with mean (0.3266, 0.3434), as a function of the barycentric coordinates of the triangle in the second image w.r.t. the mean point of the corresponding triangle in the first image, computed on 465k four-tuples of correspondences from the St. Peter's Square scene of the PhotoTourism dataset [IMC2020]. We fit a 2D Gaussian distribution to the results and report its mean in brackets.
  • Figure 4: Speed-accuracy trade-off for (a) all scenes from PhotoTourism [IMC2020] except St. Peter's Square, (b) 5 scenes from Cambridge Landmarks [kendall2015cambridge], (c) Aachen Day-Night v1.1 [zhang2021aachen], and (d) the St. Mary's Church scene from the Cambridge Landmarks dataset [kendall2015cambridge]. We report the AUC@10$^\circ$ of the pose error and vary the number of PoseLib RANSAC iterations (100, 200, 500, 1000, 2000, 5000, 10,000) with a 5 px epipolar threshold. Runtimes are averaged over all image triplets.
  • Figure 5: Illustration of the geometric configuration considered in the proof of Lemma 1.
  • ...and 14 more figures
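The Figure 4 caption reports AUC@10$^\circ$ of the pose error. As a rough sketch of how such a score is commonly computed, the function below integrates the recall curve over the error range from 0 up to the threshold and normalizes by the threshold. This follows a widely used convention and is an assumption here; the page does not state the exact definition used in the paper, and the name `pose_auc` and the sample error values are hypothetical.

```python
from typing import Sequence


def pose_auc(errors_deg: Sequence[float], threshold_deg: float = 10.0) -> float:
    """Area under the recall-vs-error curve up to threshold_deg, scaled to [0, 1].

    errors_deg holds one pose error (in degrees) per evaluated image triplet.
    """
    errs = sorted(errors_deg)
    n = len(errs)
    xs = [0.0]  # error values along the curve
    ys = [0.0]  # fraction of samples with error <= x
    for i, e in enumerate(errs):
        if e >= threshold_deg:
            break
        xs.append(e)
        ys.append((i + 1) / n)
    # Extend the curve horizontally to the threshold.
    xs.append(threshold_deg)
    ys.append(ys[-1])
    # Trapezoidal integration of recall over error.
    area = sum(
        (xs[k + 1] - xs[k]) * (ys[k + 1] + ys[k]) / 2.0
        for k in range(len(xs) - 1)
    )
    return area / threshold_deg


# Made-up errors for four triplets; one exceeds the 10 degree threshold.
example_errors = [1.0, 2.5, 7.0, 25.0]
score = pose_auc(example_errors, threshold_deg=10.0)
```

A score of 1 means every pose error is zero, and a score of 0 means no error falls below the threshold, so higher curves in a speed-accuracy plot indicate better estimates for the same runtime.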

Theorems & Definitions (2)

  • Lemma 1
  • proof