MVG-Splatting: Multi-View Guided Gaussian Splatting with Adaptive Quantile-Based Geometric Consistency Densification

Zhuoxiao Li, Shanliang Yao, Yijie Chu, Angel F. Garcia-Fernandez, Yong Yue, Eng Gee Lim, Xiaohui Zhu

TL;DR

MVG-Splatting tackles depth and geometry inconsistencies in Gaussian Splatting by introducing depth-normal refinement and an adaptive, multi-view guided densification strategy. It refines depth maps using normals and image gradients, then uses a multi-view geometric consistency framework and KDE-FFT-based depth quantile segmentation to densify near and far regions, which enables direct mesh extraction via Marching Cubes from a dense Gaussian point cloud. A joint loss combining RGB rendering, edge-aware depth, perceptual feature, and normal consistency terms supervises both depth and rendering quality. On the Mip-NeRF 360, UrbanScene3D, and Tanks and Temples datasets, MVG-Splatting is reported to achieve higher rendering fidelity and mesh detail, with competitive or better novel view synthesis (NVS) metrics and more efficient training than several baselines, suggesting that it scales well for high-fidelity 3D reconstruction.
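
The edge-aware depth term of the joint loss is not spelled out in this summary. As a minimal sketch, assuming the common formulation in which depth gradients are down-weighted by `exp(-|image gradient|)`, it could look like the following. The function name `edge_aware_depth_loss` and the unit weighting are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def edge_aware_depth_loss(depth, image):
    """Edge-aware smoothness penalty (assumed formulation, not from the paper).

    depth: (H, W) rendered depth map.
    image: (H, W, 3) RGB image with values in [0, 1].
    """
    # Absolute forward differences of depth along x and y.
    d_dx = np.abs(depth[:, 1:] - depth[:, :-1])
    d_dy = np.abs(depth[1:, :] - depth[:-1, :])
    # Image gradient magnitude, averaged over the colour channels.
    i_dx = np.mean(np.abs(image[:, 1:] - image[:, :-1]), axis=-1)
    i_dy = np.mean(np.abs(image[1:, :] - image[:-1, :]), axis=-1)
    # Depth changes are penalised less where the image itself has an edge.
    return float(np.mean(d_dx * np.exp(-i_dx)) + np.mean(d_dy * np.exp(-i_dy)))
```

With this weighting, a depth discontinuity that coincides with an image edge costs less than the same discontinuity inside a textureless region, so the term smooths depth without blurring object boundaries.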

Abstract

In the rapidly evolving field of 3D reconstruction, 3D Gaussian Splatting (3DGS) and 2D Gaussian Splatting (2DGS) represent significant advancements. Although 2DGS compresses 3D Gaussian primitives into 2D Gaussian surfels to effectively enhance mesh extraction quality, this compression can potentially lead to a decrease in rendering quality. Additionally, unreliable densification processes and the calculation of depth through the accumulation of opacity can compromise the detail of mesh extraction. To address these issues, we introduce MVG-Splatting, a solution guided by Multi-View considerations. Specifically, we integrate an optimized method for calculating normals, which, combined with image gradients, helps rectify inconsistencies in the original depth computations. Additionally, utilizing projection strategies akin to those in Multi-View Stereo (MVS), we propose an adaptive quantile-based method that dynamically determines the level of additional densification guided by depth maps, from coarse to fine detail. Experimental evidence demonstrates that our method not only resolves the issues of rendering quality degradation caused by depth discrepancies but also facilitates direct mesh extraction from dense Gaussian point clouds using the Marching Cubes algorithm. This approach significantly enhances the overall fidelity and accuracy of the 3D reconstruction process, improving both geometric detail and visual quality.
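
For illustration, the standard way to derive normals from a rendered depth map is to back-project pixels into camera space and take the cross product of finite-difference tangents. The sketch below assumes a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`. It shows the baseline computation only, not the optimized normal calculation that the paper proposes.

```python
import numpy as np

def depth_to_normals(depth, fx, fy, cx, cy):
    """Per-pixel normals from a depth map (baseline sketch, pinhole camera assumed).

    depth: (H, W) depth along the camera z axis.
    Returns an (H, W, 3) array of unit normals in camera coordinates.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Back-project every pixel to a 3D point in camera space.
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    pts = np.stack([x, y, depth], axis=-1)
    # Surface tangents along image columns and rows (central differences).
    t_u = np.gradient(pts, axis=1)
    t_v = np.gradient(pts, axis=0)
    # The normal is perpendicular to both tangents.
    n = np.cross(t_u, t_v)
    n /= np.linalg.norm(n, axis=-1, keepdims=True) + 1e-12
    return n
```

With this tangent ordering a fronto-parallel plane yields normals along +z, pointing away from the camera; negate the result if normals facing the camera are preferred.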

Paper Structure

This paper contains 26 sections, 19 equations, 15 figures, 6 tables, 3 algorithms.

Figures (15)

  • Figure 1: MVG-Splatting represents the scene with denser Gaussian point clouds. We design an adaptive densification method, guided by multi-view inputs, that exploits the ability of GS-based methods to render depth and directs densification toward under-reconstructed areas that require intensive reconstruction. From top to bottom: the rendered scene, the Gaussian point cloud, and the extracted mesh. Our method (d), through more uniform densification, can directly utilize the Marching Cubes (MC) method to extract detailed meshes.
  • Figure 2: Quantitative and Qualitative Visualizations of 2DGS's Depth and Normal. The three images represent the vase's Ground Truth image, surface depth map, and surface normal map, respectively. For a detailed quantitative analysis, depth values were extracted from three specific pixels on the vase, highlighted by boxes colored in white, black, and red. The depth values obtained for these pixels are 11.064, 3.503, and 8.913, respectively.
  • Figure 3: Overview of MVG-Splatting. We propose an adaptive densification method based on multi-view geometric consistency, guiding the optimized depth maps to achieve densification through scene projection. Unlike the original GS-based training pipeline, our method initially generates matched photographs based on multi-view principles and guides the optimization and projection of rendered depths during training.
  • Figure 4: Adaptive Quantile-Based Segmentation. We propose an adaptive method based on Kernel Density Estimation (KDE) and Fast Fourier Transform (FFT). This method dynamically estimates the depth map distribution and determines the quantile threshold for the depth-based densification in the next step. From top to bottom: Mip-NeRF 360 Bicycle scene and Tanks and Temples Barn scene.
  • Figure 5: Densification Strategy. We present a depth map-based densification method to enhance rendering and mesh extraction quality. (a) We first identify under-reconstructed areas and acquire surface normals. (b) Through multi-view geometric consistency, depth maps are projected onto these areas. (c) Using surface normals, the orientation of projected primitives is adjusted to be perpendicular to the normals, enhancing alignment. The scale of all primitives is reinitialized for improved rendering accuracy. (d) An example from a single viewpoint shows that the projected area achieves more uniform and precise densification compared to unprojected areas.
  • ...and 10 more figures
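
The KDE and FFT segmentation described in the Figure 4 caption can be sketched as follows. This is a hedged illustration rather than the paper's algorithm: it assumes a Gaussian kernel, a normal-reference rule-of-thumb bandwidth, and a single quantile `q`, and the name `kde_fft_quantile` and all parameters are hypothetical. Depths are binned on a regular grid, the histogram is smoothed by a Gaussian through an FFT-based convolution, and the threshold is read off the resulting cumulative distribution.

```python
import numpy as np

def kde_fft_quantile(depths, q=0.5, n_bins=1024, bandwidth=None):
    """Depth threshold at quantile q of an FFT-smoothed density (illustrative)."""
    d = np.asarray(depths, dtype=np.float64).ravel()
    d = d[np.isfinite(d) & (d > 0)]  # keep valid depths only
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption, not from the paper).
        bandwidth = 1.06 * d.std() * d.size ** -0.2
    bandwidth = max(bandwidth, 1e-6)  # guard against constant depth maps
    lo, hi = d.min() - 3 * bandwidth, d.max() + 3 * bandwidth
    counts, edges = np.histogram(d, bins=n_bins, range=(lo, hi))
    centers = 0.5 * (edges[:-1] + edges[1:])
    step = centers[1] - centers[0]
    # Gaussian kernel sampled at every possible bin offset.
    offsets = np.arange(-(n_bins - 1), n_bins) * step
    kernel = np.exp(-0.5 * (offsets / bandwidth) ** 2)
    # Linear convolution via zero-padded FFTs, cropped back to the bin grid.
    size = counts.size + kernel.size - 1
    full = np.fft.irfft(np.fft.rfft(counts, size) * np.fft.rfft(kernel, size), size)
    density = np.clip(full[n_bins - 1 : 2 * n_bins - 1], 0.0, None)
    # Normalised cumulative distribution; first bin that reaches q.
    cdf = np.cumsum(density)
    cdf /= cdf[-1]
    idx = min(int(np.searchsorted(cdf, q)), n_bins - 1)
    return centers[idx], centers, density
```

Given a rendered depth map `depth`, one possible use is `threshold, _, _ = kde_fft_quantile(depth, q=0.5)`, followed by `near = depth <= threshold` and `far = depth > threshold` to select the regions that receive separate densification passes. How the paper chooses its quantiles adaptively is not stated in this summary.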