Large Scale Photometric Bundle Adjustment

Oliver J. Woodford, Edward Rosten

TL;DR

This work tackles large-scale photometric bundle adjustment by jointly refining dense scene geometry and camera parameters under a lighting-robust normalized cross-correlation (NCC) cost. It introduces a memory-efficient optimization based on variable projection (VarPro) that operates on hundreds of cameras and millions of landmarks using a ray-based, plane landmark parameterization, enabling accurate reconstruction on internet photo collections. Evaluations on Tanks & Temples show gains in metric reconstruction accuracy over COLMAP, with ablations identifying the contributions of joint optimization, intrinsic refinement, and the robust cost. The approach is complementary to MVS: the refined poses and landmarks serve as improved inputs for subsequent dense reconstruction.

Abstract

Direct methods have shown promise on visual odometry and SLAM, leading to greater accuracy and robustness over feature-based methods. However, offline 3-d reconstruction from internet images has not yet benefited from a joint, photometric optimization over dense geometry and camera parameters. Issues such as the lack of brightness constancy, and the sheer volume of data, make this a more challenging task. This work presents a framework for jointly optimizing millions of scene points and hundreds of camera poses and intrinsics, using a photometric cost that is invariant to local lighting changes. The improvement in metric reconstruction accuracy that it confers over feature-based bundle adjustment is demonstrated on the large-scale Tanks & Temples benchmark. We further demonstrate qualitative reconstruction improvements on an internet photo collection, with challenging diversity in lighting and camera intrinsics.
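To make the idea of a cost that is "invariant to local lighting changes" concrete, here is a minimal sketch of a normalized cross-correlation (NCC) comparison between two image patches. It is an illustration only: the function names (`normalize_patch`, `ncc`, `ncc_cost`), the patch size, and the `1 - NCC` form of the cost are assumptions made for this example, not the exact formulation used in the paper.

```python
import numpy as np


def normalize_patch(patch):
    """Subtract the mean and scale to unit norm (hypothetical helper)."""
    v = np.asarray(patch, dtype=np.float64).ravel()
    v = v - v.mean()
    norm = np.linalg.norm(v)
    if norm < 1e-12:
        # A constant patch carries no texture, so NCC is undefined for it.
        raise ValueError("patch has no texture")
    return v / norm


def ncc(a, b):
    """Normalized cross-correlation of two equally sized patches, in [-1, 1]."""
    return float(np.dot(normalize_patch(a), normalize_patch(b)))


def ncc_cost(a, b):
    """Assumed cost form: 0 for perfectly correlated patches, up to 2."""
    return 1.0 - ncc(a, b)


rng = np.random.default_rng(0)
patch = rng.uniform(0.0, 255.0, size=(4, 4))

# Simulate a local lighting change: positive gain and constant offset.
relit = 1.7 * patch + 20.0

# An unrelated patch, for comparison.
other = rng.uniform(0.0, 255.0, size=(4, 4))

print("cost(patch, relit):", ncc_cost(patch, relit))
print("cost(patch, other):", ncc_cost(patch, other))
```

Because mean subtraction removes the offset and unit-norm scaling removes the positive gain, the relit patch incurs essentially zero cost, whereas a plain sum of squared intensity differences would penalize it heavily.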

Paper Structure

This paper contains 14 sections, 8 equations, 3 figures, 2 tables, 1 algorithm.

Figures (3)

  • Figure 1: Given 700+ photos of Notre Dame (a), captured with different cameras and lighting conditions, our method refines the camera poses (b, red), intrinsics, and dense geometry (c) produced by a standard SfM+MVS framework [schonberger2016colmapsfm, schonberger2016colmapdense], using a joint, photometric optimization. Both the new poses (b, black) and 3-d landmarks (b) can be used to generate higher fidelity dense reconstructions of the scene, via Poisson meshing [kazhdan2013poisson] (d).
  • Figure 2: Low memory VarPro optimization
  • Figure 3: Precision error visualization for methods on the Tanks & Temples (TT) intermediate sets.