MVGS: Multi-view Regulated Gaussian Splatting for Novel View Synthesis

Xiaobiao Du, Yida Wang, Xin Yu

TL;DR

The paper addresses overfitting in single-view training for Gaussian-based novel view synthesis (NVS) by introducing MVGS, a multi-view regulated training framework that unifies four mechanisms: multi-view supervision, cross-intrinsic guidance, cross-ray densification, and multi-view augmented densification. This approach constrains Gaussian kernels to capture multi-view structure, guides training from coarse to fine resolutions, and densifies Gaussians in ray-intersect regions to improve cross-view consistency. Empirical results across general, reflective, 4D, and large-scale scenes show significant improvements in rendering quality (often around 1 dB PSNR) when MVGS is integrated with existing Gaussian-based methods, indicating that it applies broadly and supports high-fidelity, efficient NVS. The method improves robustness to challenging factors such as reflections, transparency, and dynamic changes, suggesting potential for real-world applications in graphics, VR, and autonomous systems.

Abstract

Recent works in volume rendering, e.g., NeRF and 3D Gaussian Splatting (3DGS), significantly advance rendering quality and efficiency with the help of learned implicit neural radiance fields or 3D Gaussians. Rendering on top of an explicit representation, vanilla 3DGS and its variants deliver real-time efficiency by optimizing the parametric model with single-view supervision per iteration during training, a scheme adopted from NeRF. Consequently, certain views are overfitted, leading to unsatisfying appearance in novel-view synthesis and imprecise 3D geometry. To solve the aforementioned problems, we propose a new 3DGS optimization method embodying four key novel contributions: 1) We transform the conventional single-view training paradigm into a multi-view training strategy. With our proposed multi-view regulation, 3D Gaussian attributes are further optimized without overfitting certain training views. As a general solution, we improve the overall accuracy in a variety of scenarios and for different Gaussian variants. 2) Inspired by the benefit introduced by additional views, we further propose a cross-intrinsic guidance scheme, leading to a coarse-to-fine training procedure across different resolutions. 3) Built on top of our multi-view regulated training, we further propose a cross-ray densification strategy, densifying more Gaussian kernels in the ray-intersect regions from a selection of views. 4) By further investigating the densification strategy, we found that the effect of densification should be enhanced when certain views differ dramatically. As a solution, we propose a novel multi-view augmented densification strategy, where 3D Gaussians are encouraged to be densified to a sufficient number accordingly, resulting in improved reconstruction accuracy.
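
The structural difference between single-view and multi-view supervision described in contribution 1) can be illustrated with a toy sketch. This is not the authors' implementation: `render`, `view_loss_and_grad`, and `train` are hypothetical names, the "renderer" is a two-parameter linear function standing in for a differentiable rasterizer, and the squared-error loss is an assumption chosen for simplicity. The only point is that the gradient is averaged over several sampled views before each parameter update instead of being taken from one view.

```python
import random


def render(params, view):
    # Hypothetical stand-in for a differentiable rasterizer: a view-dependent
    # linear combination of two scene parameters.
    a, b = view
    return a * params[0] + b * params[1]


def view_loss_and_grad(params, view, target):
    # Squared error for one view and its gradient w.r.t. the parameters.
    a, b = view
    err = render(params, view) - target
    return err * err, [2.0 * err * a, 2.0 * err * b]


def train(views, targets, num_views_per_iter, iters=2000, lr=0.05, seed=0):
    # num_views_per_iter = 1 mimics single-view supervision per iteration;
    # larger values average the gradient over several views per update.
    rng = random.Random(seed)
    params = [0.0, 0.0]
    for _ in range(iters):
        batch = rng.sample(range(len(views)), num_views_per_iter)
        grad = [0.0, 0.0]
        for i in batch:
            _, g = view_loss_and_grad(params, views[i], targets[i])
            grad[0] += g[0] / len(batch)
            grad[1] += g[1] / len(batch)
        # One update per iteration, using the view-averaged gradient.
        params = [p - lr * g for p, g in zip(params, grad)]
    return params


if __name__ == "__main__":
    # Four toy "views" of a scene whose true parameters are [1.0, -2.0].
    views = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -1.0)]
    targets = [1.0, -2.0, -1.0, 3.0]
    print(train(views, targets, num_views_per_iter=1))
    print(train(views, targets, num_views_per_iter=3))
```

In this noise-free toy problem both settings recover the same parameters, so the sketch says nothing about the overfitting behavior reported in the paper; it only shows where the multi-view accumulation sits in the training loop.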

Paper Structure (19 sections, 4 equations, 12 figures, 9 tables)

Figures (12)

  • Figure 1: MVGS brings general improvements to novel view synthesis on top of Gaussian Splatting [kerbl2023gaussiansplatting] representations, as shown in (b) and (c). Extensive experiments show that, as in (d), our proposed method delivers consistent advantages over baseline methods in extremely challenging scenes with strong reflection, transparency, and fine-scale details.
  • Figure 2: Illustration of the previous single-view training paradigm and our proposed MVGS. (a) NeRF cannot be optimized in a multi-view training manner. (b) The original 3DGS follows the single-view training strategy of NeRF. (c) The proposed MVGS transforms the original training protocol followed by 3DGS and its variants. (d) The proposed cross-intrinsic guidance strategy enables multi-view training in a coarse-to-fine way. The bottom of this figure illustrates the pipeline of our proposed MVGS.
  • Figure 3: Qualitative comparisons of 3DGS [kerbl2023gaussiansplatting], Scaffold-GS [lu2024scaffold], and their improved versions integrating our method across various datasets. Red close-up patches highlight the visual differences. Our proposed method improves the original 3DGS and Scaffold-GS in extremely challenging scenes with strongly varying lighting, strong reflection, and fine details.
  • Figure 4: Qualitative results of 3DGS-DR [3dgsdr], 4DGS [4dgs], and their improved versions integrating our method across various challenging datasets. 3DGS-DR and 4DGS integrated with our method achieve better results in extremely challenging scenes with strong reflection and dynamic changes.
  • Figure 5: Analysis of the multi-view training settings. We improve four representative state-of-the-art Gaussian-based methods with the proposed multi-view regulated training. We report results on three representative datasets.
  • ...and 7 more figures