Towards Rotation-only Imaging Geometry: Rotation Estimation
Xinrui Li, Qi Cai, Yuanxin Wu
TL;DR
This work proposes a rotation-only imaging geometry for SfM that represents translation on the rotation manifold and optimizes reprojection error solely over camera rotations. It introduces PPO constraints, analyzes the translation solution spaces under different scene structures, and derives TRRM for two-view rotation optimization and GRRM for global multi-view rotation estimation. A scene-structure detector identifies RotationSingular configurations to avoid translation degeneration (PR/B/I, Holoplane, RankRegular-line), which improves robustness. The approach yields substantial accuracy gains over state-of-the-art rotation estimation methods, with performance approaching that of four rounds of bundle adjustment (BA) in OpenMVG pipelines, and shows strong two-view and multi-view results on simulations and the Strecha dataset. Overall, the rotation-only framework offers improved efficiency, robustness to noise, and competitive accuracy for 3D visual computing pipelines.
Abstract
Structure from Motion (SfM) is a critical task in computer vision that aims to recover 3D scene structure and camera motion from a sequence of 2D images. The recent pose-only imaging geometry decouples 3D coordinates from camera poses and demonstrates significantly better SfM performance through pose adjustment. Continuing the pose-only perspective, this paper explores the relationship between scene structure, rotation, and translation. Notably, translation can be expressed in terms of rotation, which allows the imaging geometry representation to be condensed onto the rotation manifold. A rotation-only optimization framework based on reprojection error is proposed for both two-view and multi-view scenarios. Experimental results demonstrate superior accuracy and robustness over current state-of-the-art rotation estimation methods, with results comparable even to those of multiple bundle adjustment iterations. We hope this work contributes to more accurate, efficient, and reliable 3D visual computing.
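
As a toy illustration of why translation need not be an independent unknown, the sketch below uses the classical two-view epipolar constraint rather than the paper's pose-only derivation. Assuming the convention X2 = R X1 + t, each correspondence gives t · ((R x1) × x2) = 0, so once R is fixed, two correspondences in general position determine the translation direction up to sign. All function names and the synthetic scene are hypothetical and chosen only for this example.

```python
import math

def matvec(R, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def rot_y(angle):
    """Rotation about the y-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def translation_direction(R, bearings1, bearings2):
    """Translation direction (up to sign) implied by a given rotation.

    Each correspondence yields an epipolar-plane normal n = (R x1) x x2
    that is orthogonal to t; two non-parallel normals fix t up to sign.
    Uses only the first two correspondences and assumes noise-free data.
    """
    n1 = cross(matvec(R, bearings1[0]), bearings2[0])
    n2 = cross(matvec(R, bearings1[1]), bearings2[1])
    return normalize(cross(n1, n2))

# Synthetic two-view setup (hypothetical values).
R_true = rot_y(0.2)
t_true = [1.0, 0.2, -0.3]
points = [[0.5, -0.4, 4.0], [-1.0, 0.8, 6.0]]  # 3D points in the first camera frame

bearings1 = [normalize(X) for X in points]
bearings2 = [normalize([a + b for a, b in zip(matvec(R_true, X), t_true)])
             for X in points]

t_dir = translation_direction(R_true, bearings1, bearings2)
# |cos| of the angle between the recovered and true directions;
# the absolute value absorbs the sign ambiguity.
alignment = abs(dot(t_dir, normalize(t_true)))
print(alignment)
```

With noisy observations and more than two correspondences, the same orthogonality conditions would typically be stacked and solved in a least-squares sense; the construction also breaks down when the two epipolar planes coincide, a simple example of how scene structure can make translation recovery degenerate.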
