Global Structure-from-Motion Revisited
Linfei Pan, Dániel Baráth, Marc Pollefeys, Johannes L. Schönberger
TL;DR
The paper tackles scalable, accurate 3D reconstruction from images by revisiting global Structure-from-Motion. It introduces GLOMAP, a global SfM system that jointly estimates camera and 3D point positions directly from image rays instead of relying on a separate translation averaging step. Key contributions include a feature-track construction strategy, a unified global positioning objective with a robust, initialization-free formulation, an accompanying global bundle adjustment, and camera clustering to handle large internet-scale collections. Results on calibrated and uncalibrated datasets show GLOMAP reaching accuracy comparable to state-of-the-art incremental SfM (e.g., COLMAP) while running orders of magnitude faster, and remaining robust to unknown intrinsics and sequential data. The work also provides extensive ablations, qualitative reconstructions, and a public code release, which makes it practically relevant for scalable 3D mapping and novel-view synthesis.
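To make the idea of positioning "directly from image rays" concrete, here is a minimal sketch of one way such a cost can be written. It is not taken from the summary above: the residual form v - d * (X - c), the Huber loss, and every function and variable name (`huber`, `ray_residual_norm`, `positioning_cost`, `unit_ray`) are assumptions chosen for illustration. The sketch only evaluates the cost on a toy scene; it does not perform the optimization.

```python
import math

def huber(r, delta=0.1):
    """Huber loss on a non-negative residual norm r (assumed robustifier)."""
    if r <= delta:
        return 0.5 * r * r
    return delta * (r - 0.5 * delta)

def ray_residual_norm(v, d, X, c):
    """Norm of v - d * (X - c): unit ray v, scale d, point X, camera center c."""
    return math.sqrt(sum((vi - d * (xi - ci)) ** 2 for vi, xi, ci in zip(v, X, c)))

def positioning_cost(observations, cameras, points, scales):
    """Sum of robust ray residuals over all (camera, point) observations.

    observations: dict (camera_id, point_id) -> unit ray in the world frame
    scales:       dict (camera_id, point_id) -> non-negative scale d
    """
    total = 0.0
    for (i, k), v in observations.items():
        total += huber(ray_residual_norm(v, scales[(i, k)], points[k], cameras[i]))
    return total

def unit_ray(c, X):
    """Return the unit ray from c to X and the scale 1 / ||X - c||."""
    diff = [xi - ci for xi, ci in zip(X, c)]
    n = math.sqrt(sum(t * t for t in diff))
    return [t / n for t in diff], 1.0 / n

# Hypothetical toy scene: two cameras, two points, noise-free rays.
cameras = {0: [0.0, 0.0, 0.0], 1: [1.0, 0.0, 0.0]}
points = {0: [0.5, 0.2, 2.0], 1: [-0.3, 0.4, 3.0]}
observations, scales = {}, {}
for i, c in cameras.items():
    for k, X in points.items():
        observations[(i, k)], scales[(i, k)] = unit_ray(c, X)

# Cost at the true configuration, and after displacing one point.
cost_true = positioning_cost(observations, cameras, points, scales)
moved = dict(points)
moved[0] = [0.9, 0.2, 2.0]
cost_moved = positioning_cost(observations, cameras, moved, scales)
```

In this toy setup the cost vanishes (up to floating-point error) at the true cameras and points and grows when a point is displaced. Because camera centers and points appear in the same residual, a solver minimizing such a cost would update both together, which is the contrast with a pipeline that first averages translations and only then triangulates.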
Abstract
Recovering 3D structure and camera motion from images has been a long-standing focus of computer vision research and is known as Structure-from-Motion (SfM). Solutions to this problem are categorized into incremental and global approaches. Until now, the most popular systems follow the incremental paradigm due to its superior accuracy and robustness, while global approaches are drastically more scalable and efficient. With this work, we revisit the problem of global SfM and propose GLOMAP as a new general-purpose system that outperforms the state of the art in global SfM. In terms of accuracy and robustness, we achieve results on par with or superior to COLMAP, the most widely used incremental SfM system, while being orders of magnitude faster. We share our system as an open-source implementation at https://github.com/colmap/glomap.
