CSS: Overcoming Pose and Scene Challenges in Crowd-Sourced 3D Gaussian Splatting

Runze Chen, Mingyu Xiao, Haiyong Luo, Fang Zhao, Fan Wu, Hao Xiong, Qi Liu, Meng Song

TL;DR

Crowd-Sourced Splatting (CSS) is a 3D Gaussian Splatting (3DGS) pipeline for pose-free scene reconstruction from crowd-sourced imagery. It enables high-quality novel view synthesis under complex, real-world conditions.

Abstract

We introduce Crowd-Sourced Splatting (CSS), a novel 3D Gaussian Splatting (3DGS) pipeline designed to overcome the challenges of pose-free scene reconstruction using crowd-sourced imagery. The dream of reconstructing historically significant but inaccessible scenes from collections of photographs has long captivated researchers. However, traditional 3D techniques struggle with missing camera poses, limited viewpoints, and inconsistent lighting. CSS addresses these challenges through robust geometric priors and advanced illumination modeling, enabling high-quality novel view synthesis under complex, real-world conditions. Our method demonstrates clear improvements over existing approaches, paving the way for more accurate and flexible applications in AR, VR, and large-scale 3D reconstruction.

Paper Structure (7 sections, 7 equations, 2 figures, 1 table)

Figures (2)

  • Figure 1: CSS computational pipeline. We employ multiview stereo estimation to determine the orientation of each crowd-sourced viewpoint $\tilde{\mathbf{P}}^{(i)}$, alongside a confidence map $\hat{\mathbf{C}}^{(i)}$ and a corresponding point cloud $\tilde{\mathbf{X}}^{(i)}$. The covariance $\mathrm{cov}(\langle \tilde{\mathbf{X}}^{(1)},\mathbf{u}_1\rangle )$ is computed from adjacent points in the point cloud and used to initialize each Gaussian. Throughout the 3D Gaussian refinement process, we model the illumination of each crowd-sourced viewpoint $i$ with higher-order spherical harmonics, which lets us render the scene under view-specific lighting and produce stable, coherent novel view synthesis.
  • Figure 2: Comparison of object appearance rendering across different methods. Despite the varying lighting conditions across crowd-sourced views, our method preserves structure and texture more accurately than the other methods.
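
The Figure 1 caption mentions initializing each Gaussian from a covariance computed over adjacent points in the point cloud. The following is a minimal sketch of that general idea, not the authors' implementation. The function name `init_gaussian_covariances`, the neighbor count `k`, the brute-force neighbor search, and the regularizer `eps` are all assumptions made for illustration; this excerpt does not specify how CSS selects neighbors or converts the covariance into Gaussian scale and rotation.

```python
import numpy as np


def init_gaussian_covariances(points, k=8, eps=1e-6):
    """Estimate one 3x3 covariance per point from its k nearest neighbors.

    Hypothetical helper: k, eps, and the brute-force search are
    illustrative choices, not values taken from the paper.
    """
    points = np.asarray(points, dtype=np.float64)
    n = points.shape[0]
    if n < 2:
        raise ValueError("need at least two points to form a neighborhood")
    k = min(k, n - 1)

    # Pairwise squared distances (O(n^2) memory; fine for small clouds only).
    diff = points[:, None, :] - points[None, :, :]
    d2 = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(d2, np.inf)  # exclude each point from its own neighborhood

    # Indices of the k closest other points for every point.
    idx = np.argsort(d2, axis=1)[:, :k]
    neighbors = points[idx]  # shape (n, k, 3)

    # Covariance of each neighborhood around its own mean.
    centered = neighbors - neighbors.mean(axis=1, keepdims=True)
    cov = np.einsum("nki,nkj->nij", centered, centered) / k

    # Small isotropic term keeps degenerate (e.g. planar) neighborhoods invertible.
    cov += eps * np.eye(3)
    return cov
```

For a locally planar neighborhood, the resulting covariance is nearly flat along the surface normal, so a Gaussian initialized this way starts out as a thin disc aligned with the surface. A real pipeline would likely replace the brute-force search with a spatial index such as a k-d tree.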