A Constrained Optimization Approach for Gaussian Splatting from Coarsely-posed Images and Noisy Lidar Point Clouds

Jizong Peng, Tze Ho Elden Tse, Kai Xu, Wenchao Gao, Angela Yao

TL;DR

The paper tackles robust 3D reconstruction with Gaussian Splatting when camera poses and point clouds are noisy and only coarsely estimated. It introduces a constrained optimization framework that decouples camera pose into device-centered and world-centered components, and jointly refines intrinsics, extrinsics, and the 3DGS representation using sensitivity-based pre-conditioning and a log-barrier to stay within feasible regions. Two geometric priors—a soft epipolar constraint and a reprojection regularizer—regularize the optimization, aided by line-intersection depth estimation, improving rendering fidelity. A new multimodal SLAM dataset is collected to evaluate the method, and results demonstrate substantial improvements over COLMAP-based approaches and competitive performance with significantly reduced preprocessing time, highlighting practical applicability for large-scale, noisy multi-camera scenes.
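The soft epipolar constraint mentioned above can be illustrated with the textbook two-view relation: for a correct relative pose and correct intrinsics, matched pixels satisfy x2^T F x1 = 0. The sketch below is a minimal NumPy illustration of that residual, not the paper's loss. The function names, the pinhole intrinsics, and the convention that a point in camera 2 is R @ X1 + t are assumptions made for the example.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x such that skew(t) @ v == np.cross(t, v)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])

def fundamental_matrix(K1, K2, R, t):
    """F for the assumed convention X2 = R @ X1 + t (camera-1 to camera-2)."""
    E = skew(t) @ R  # essential matrix
    return np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)

def epipolar_residual(F, p1, p2):
    """Absolute algebraic residual |x2^T F x1| for (N, 2) pixel matches."""
    ones = np.ones((len(p1), 1))
    h1 = np.hstack([p1, ones])  # homogeneous pixels in image 1
    h2 = np.hstack([p2, ones])  # homogeneous pixels in image 2
    return np.abs(np.einsum("ni,ij,nj->n", h2, F, h1))

def project(K, X):
    """Pinhole projection of a 3D point given in the camera frame."""
    x = K @ X
    return x[:2] / x[2]

# Hypothetical setup: identical intrinsics, pure sideways translation.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.2, 0.0, 0.0])
X1 = np.array([0.5, -0.3, 4.0])          # point in camera-1 frame
p1 = project(K, X1)[None, :]
p2 = project(K, R @ X1 + t)[None, :]

F = fundamental_matrix(K, K, R, t)
clean = epipolar_residual(F, p1, p2)
noisy = epipolar_residual(F, p1, p2 + np.array([[0.0, 5.0]]))
```

A residual of this kind is near zero for consistent matches and grows when a match or the pose is off, which is what makes it usable as a soft penalty rather than a hard constraint.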

Abstract

3D Gaussian Splatting (3DGS) is a powerful reconstruction technique, but it needs to be initialized from accurate camera poses and high-fidelity point clouds. Typically, the initialization is taken from Structure-from-Motion (SfM) algorithms; however, SfM is time-consuming and restricts the application of 3DGS in real-world scenarios and large-scale scene reconstruction. We introduce a constrained optimization method for simultaneous camera pose estimation and 3D reconstruction that does not require SfM support. Core to our approach is decomposing a camera pose into a sequence of camera-to-(device-)center and (device-)center-to-world optimizations. To facilitate this, we propose two optimization constraints: one conditioned on the sensitivity of each parameter group, and one that restricts each parameter's search space. In addition, as we learn the scene geometry directly from the noisy point clouds, we propose geometric constraints to improve the reconstruction quality. Experiments demonstrate that the proposed method significantly outperforms the existing (multi-modal) 3DGS baseline and methods supplemented by COLMAP on both our collected dataset and two public benchmarks.
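The pose decomposition described in the abstract can be sketched as a product of two rigid transforms: a camera-to-device transform per camera on the rig, and a device-to-world transform per capture. The snippet below is a minimal illustration with 4x4 homogeneous matrices. The helper names and the restriction to rotations about the z-axis are assumptions made to keep the example short, and it does not reproduce the paper's parameterization.

```python
import numpy as np

def make_pose(rotation_z_deg, translation):
    """Build a 4x4 rigid transform from a z-axis rotation and a translation."""
    theta = np.deg2rad(rotation_z_deg)
    c, s = np.cos(theta), np.sin(theta)
    pose = np.eye(4)
    pose[:3, :3] = np.array([[c, -s, 0.0],
                             [s, c, 0.0],
                             [0.0, 0.0, 1.0]])
    pose[:3, 3] = translation
    return pose

def compose_camera_to_world(device_to_world, camera_to_device):
    """Chain the two factors: camera frame -> device frame -> world frame."""
    return device_to_world @ camera_to_device

# Hypothetical rig: the device is rotated 90 degrees and shifted 1 m in x,
# and the camera sits 10 cm along the device's own x-axis.
device_to_world = make_pose(90.0, [1.0, 0.0, 0.0])
camera_to_device = make_pose(0.0, [0.1, 0.0, 0.0])
camera_to_world = compose_camera_to_world(device_to_world, camera_to_device)
camera_origin_in_world = camera_to_world[:3, 3]
```

One reason such a factorization is convenient is that all cameras on the same rig share a single device-to-world factor per capture, so a correction to the device trajectory moves every camera consistently while each camera-to-device factor stays fixed across captures.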

Paper Structure

This paper contains 21 sections, 17 equations, 17 figures, 9 tables.

Figures (17)

  • Figure 1: Given noisy point clouds and inaccurate camera poses, our constrained optimization approach reconstructs the 3D scene in Gaussian Splatting with high visual quality, which enables various downstream applications.
  • Figure 2: Qualitative example of camera poses and colored point clouds obtained from our multi-camera SLAM system.
  • Figure 3: Illustration of camera intrinsic optimization. (a) In a monocular setting, inaccurate intrinsic parameters can be compensated for by adjusting the camera pose, e.g., shifting the camera origin right by $T$. (b) This approach is not feasible for multi-camera systems under extrinsic constraints, such as autonomous cars or SLAM devices.
  • Figure 4: Illustration of our camera decomposition scheme. (a) Initial noisy point cloud from SLAM setup. (b) and (d) Optimization procedures of device-to-world and camera-to-device transformations. (c) Refined point cloud from our constrained optimization approach, showing improved visual quality.
  • Figure 5: Illustration of the log-barrier method. Lower and upper bounds are predefined based on initial SLAM estimation. At the start of the optimization, the barrier imposes a strong penalty for significant deviations from the initial estimate. As temperature increases, it transforms into a well-function, allowing the parameter to fully explore the feasible region.
  • ...and 12 more figures
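
The behavior described in the Figure 5 caption can be reproduced with the classic interior-point log barrier. The sketch below assumes that form, since this page does not give the paper's exact barrier function; the function name, the bounds, and the temperature values are illustrative only.

```python
import numpy as np

def log_barrier(x, lower, upper, t):
    """Classic two-sided log barrier on the open interval (lower, upper).

    Returns +inf outside the interval. Inside, the penalty is scaled by 1/t,
    so a larger temperature t gives a flatter, more well-like profile.
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    inside = (x > lower) & (x < upper)
    penalty = np.full_like(x, np.inf)
    penalty[inside] = -(np.log(x[inside] - lower) + np.log(upper - x[inside])) / t
    return penalty

# Hypothetical parameter offset bounded to [-1, 1] around its initial estimate.
offsets = np.array([0.0, 0.9, 1.5])
early = log_barrier(offsets, -1.0, 1.0, t=1.0)    # strong pull toward the center
late = log_barrier(offsets, -1.0, 1.0, t=100.0)   # nearly flat inside the bounds
```

With symmetric bounds the penalty is smallest at the initial estimate and rises toward the bounds. Raising t shrinks the interior penalty toward zero while the outside remains infeasible, which matches the caption's description of the barrier turning into a well function.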