Structure Consistent Gaussian Splatting with Matching Prior for Few-shot Novel View Synthesis

Rui Peng, Wangze Xu, Luyang Tang, Liwei Liao, Jianbo Jiao, Ronggang Wang

TL;DR

SCGaussian is a Structure Consistent Gaussian Splatting method that uses matching priors to learn a 3D-consistent scene structure, achieving state-of-the-art performance and high efficiency.

Abstract

Despite the substantial progress of novel view synthesis, existing methods, whether based on Neural Radiance Fields (NeRF) or, more recently, 3D Gaussian Splatting (3DGS), suffer significant degradation when the input becomes sparse. Numerous efforts have been made to alleviate this problem, but they still struggle to synthesize satisfactory results efficiently, especially in large scenes. In this paper, we propose SCGaussian, a Structure Consistent Gaussian Splatting method using matching priors to learn 3D consistent scene structure. Considering the high interdependence of Gaussian attributes, we optimize the scene structure in two respects: the rendering geometry and, more importantly, the position of Gaussian primitives, which is hard to constrain directly in vanilla 3DGS because its primitives are unstructured. To achieve this, we present a hybrid Gaussian representation. Besides the ordinary non-structure Gaussian primitives, our model also contains ray-based Gaussian primitives that are bound to matching rays and whose positions are optimized only along the ray. Thus, we can use the matching correspondence to directly force the positions of these Gaussian primitives to converge to the surface points where the rays intersect. Extensive experiments on forward-facing, surrounding, and complex large scenes show the effectiveness of our approach with state-of-the-art performance and high efficiency. Code is available at https://github.com/prstrive/SCGaussian.
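The idea of a ray-bound primitive can be illustrated with a small geometric sketch. It is not the authors' implementation: the function names `ray_point` and `closest_depths` are hypothetical, and the closed-form solve below stands in for the matching-based optimization described in the abstract. It only shows that a primitive tied to a ray has a single positional degree of freedom (its depth), and that two matching rays determine the depths at which they meet on the surface.

```python
import numpy as np


def ray_point(origin, direction, depth):
    """Position of a ray-bound primitive: it can only slide along its ray."""
    unit = direction / np.linalg.norm(direction)
    return origin + depth * unit


def closest_depths(o1, d1, o2, d2):
    """Depths along two rays at their points of closest approach.

    Minimizes |(o1 + t1*d1) - (o2 + t2*d2)|^2 over t1, t2 for unit
    directions. Hypothetical helper, used here only for illustration.
    """
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    w = o1 - o2
    b = d1 @ d2
    d = d1 @ w
    e = d2 @ w
    denom = 1.0 - b * b
    if denom < 1e-12:
        raise ValueError("rays are (nearly) parallel; depths are not unique")
    t1 = (b * e - d) / denom
    t2 = (e - b * d) / denom
    return t1, t2


# Two camera centers observing the same surface point along matching rays.
surface = np.array([0.0, 0.0, 5.0])
o1 = np.array([-1.0, 0.0, 0.0])
o2 = np.array([1.0, 0.0, 0.0])
d1 = surface - o1
d2 = surface - o2

t1, t2 = closest_depths(o1, d1, o2, d2)
p1 = ray_point(o1, d1, t1)
p2 = ray_point(o2, d2, t2)
```

In this toy setup both ray-bound points coincide with the shared surface point. In a learned setting the depth would instead be a trainable parameter driven toward that configuration by a loss on the matched pixels.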

Paper Structure

This paper contains 20 sections, 14 equations, 13 figures, 11 tables.

Figures (13)

  • Figure 1: Comparisons in view synthesis and geometry rendering. 3DGS (Kerbl et al., 2023) can synthesize high-quality novel views and plausible geometry with abundant input views, but suffers from significant degradation in the sparse scenario. Even with a monocular depth prior, DNGaussian (Li et al., 2024) still struggles to generate accurate geometry and novel views. In contrast, our method learns a more consistent scene structure and renders more realistic images.
  • Figure 2: Framework of SCGaussian. We first extract the matching prior from the sparse input, and randomly initialize the hybrid Gaussian representation. The ray-based Gaussian primitives are bound to matching rays, and are explicitly optimized using the matching correspondence. The rendering geometry optimization is further conducted to optimize the shape of all types of Gaussian primitives. Combined with the ordinary photometric loss, SCGaussian can learn the consistent scene structure.
  • Figure 3: Visualization of some challenges faced by few-shot 3DGS. (a) The expected Gaussians in the surface region cannot be learned, and the model tends to learn inconsistent Gaussians and overfit the training views. Although the training loss is small, the testing error remains large. (b) The attributes of Gaussian primitives are interdependent, and the model tends to increase a primitive's size to cover the pixels rather than correct its position.
  • Figure 4: Qualitative comparisons on LLFF (first two rows) and IBRNet (last two rows) datasets with 3 training views. The reconstruction of our method is more accurate and exhibits finer details.
  • Figure 5: Qualitative comparisons on the Tanks and Temples dataset with 3 training views.
  • ...and 8 more figures