VCR-GauS: View Consistent Depth-Normal Regularizer for Gaussian Surface Reconstruction

Hanlin Chen, Fangyin Wei, Chen Li, Tianxin Huang, Yunsong Wang, Gim Hee Lee

TL;DR

A Depth-Normal regularizer is proposed that directly couples normal with other geometric parameters, leading to full updates of the geometric parameters from normal regularization, and a confidence term is proposed to mitigate inconsistencies of normal predictions across multiple views.

Abstract

Although 3D Gaussian Splatting has been widely studied because of its realistic and efficient novel-view synthesis, it is still challenging to extract a high-quality surface from the point-based representation. Previous works improve the surface by incorporating geometric priors from off-the-shelf normal estimators. However, there are two main limitations: 1) Supervising normals rendered from 3D Gaussians effectively updates the rotation parameter but is less effective for other geometric parameters; 2) The inconsistency of predicted normal maps across multiple views may lead to severe reconstruction artifacts. In this paper, we propose a Depth-Normal regularizer that directly couples normals with the other geometric parameters, so that normal regularization updates all geometric parameters. We further propose a confidence term to mitigate inconsistencies of normal predictions across multiple views. We also introduce a densification and splitting strategy to regularize the size and distribution of 3D Gaussians for more accurate surface modeling. Experiments show that, compared with Gaussian-based baselines, our approach obtains better reconstruction quality and maintains competitive appearance quality, with faster training and rendering at 100+ FPS.
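To make the idea of a normal that depends on depth more concrete, the following is a minimal NumPy sketch of one common way to derive normals from a depth map: back-project neighboring pixels to 3D and take the cross product of their differences. It is an illustration under stated assumptions, not the authors' implementation. The pinhole intrinsics `fx, fy, cx, cy`, the function names, and the sign convention (normals pointing along +z for a fronto-parallel plane) are all assumptions introduced here.

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Lift an (H, W) depth map to (H, W, 3) camera-space points (pinhole model, assumed)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return np.stack([x, y, depth], axis=-1)

def normals_from_depth(depth, fx, fy, cx, cy):
    """Per-pixel unit normals from finite differences of neighboring 3D points.

    Returns an (H-1, W-1, 3) array. The sign convention is an assumption.
    """
    pts = backproject(depth, fx, fy, cx, cy)
    dx = pts[:, 1:, :] - pts[:, :-1, :]   # difference to the right-hand neighbor
    dy = pts[1:, :, :] - pts[:-1, :, :]   # difference to the neighbor below
    n = np.cross(dx[:-1], dy[:, :-1])     # both cropped to (H-1, W-1, 3)
    norm = np.linalg.norm(n, axis=-1, keepdims=True)
    return n / np.maximum(norm, 1e-12)    # guard against zero-length normals

# Usage: a fronto-parallel plane at depth 2 yields normals along the z axis.
flat = np.full((6, 8), 2.0)
print(normals_from_depth(flat, 100.0, 100.0, 4.0, 3.0)[0, 0])
```

Because the normal is computed from 3D points that depend on depth, a loss on this normal sends gradients back to whatever produced the depth, which is the coupling the abstract describes.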

Paper Structure

This paper contains 19 sections, 17 equations, 13 figures, and 6 tables.

Figures (13)

  • Figure 1: View-Consistent D-Normal Regularizer. Pseudo normals predicted from pretrained monocular normal estimators tend to be inconsistent across different views (left). Our method calculates a confidence map indicating the confidence of the pseudo normals (middle). The confidence is used to weigh the loss imposed on our proposed D-Normals. Our method achieves new state-of-the-art surface reconstruction results and rendering quality comparable with prior work.
  • Figure 2: Illustration of rendered normal supervision and the D-Normal regularizer. (a) As a result of the back-propagation through alpha-blending via Eq. \ref{eq:gaussian_dist}, rendered normal supervision $\mathcal{L}_{\text{n}}$ moves Gaussians closer to ($\textbf{P}_1$) or away from ($\textbf{P}_2$) the intersecting ray. When the normal of a Gaussian is closer to the GT surface normal, the supervision pushes this Gaussian ($\textbf{P}_1$) towards the ray to increase its weight in the rendering equation, and vice versa ($\textbf{P}_2$). (b) Such movement of Gaussians stops when the rendered normal loss $\mathcal{L}_{\text{n}}$ is equal to zero. In either case ((a) or (b)), the rendered normal loss cannot move the Gaussians towards the surface. In contrast, (c) the D-Normal regularizer $\mathcal{L}_{\text{dn}}$ can move Gaussians towards or away from the GT surface. $\textbf{P}_1$ and $\textbf{P}_2$ are the 3D positions corresponding to the mean depth of two neighboring pixels (rays) via Eq. \ref{eq:depth}. The D-Normal $\bar{\textbf{N}}_d$ is derived from $\textbf{P}_1$ and $\textbf{P}_2$ in Eq. \ref{eq:normal_depth}. $\mathcal{L}_{\text{dn}}$ encourages $\bar{\textbf{N}}_d$ to align with the ground truth normal $\textbf{N}$, resulting in Gaussians moving towards or away from the surface.
  • Figure 3: Overview of our VCR-GauS. During densification and splitting, our method only keeps the Gaussians at the first intersections and splits large Gaussians into smaller ones along the major principal axis. The rendered normals are supervised with pseudo normals predicted from a pretrained monocular normal estimator in $\mathcal{L}_{\text{n}}$. We further calculate an uncertainty map based on the discrepancies between the rendered and pseudo normals (cf. Eq. \ref{eq:w}) to weigh the loss $\mathcal{L}_{\text{dn}}$ between pseudo normals and D-Normals derived from the rendered depth maps. We compare different approaches for normal calculation (Top Right) and show our intersection depth (Bottom Right).
  • Figure 4: Illustration of the rationale behind the densification and splitting strategies. (a) Comparison between large and small Gaussians of depth errors caused by a small normal error (in side view). (b) Comparison of the original and the proposed splitting strategies (in bird's-eye view).
  • Figure 5: Qualitative comparison on the TNT dataset. From top to bottom, we show the reconstructed meshes from our method, SuGaR, 2DGS, and NeuS, as well as the ground truth colored point cloud. Our method reconstructs more complete surfaces featuring smoother planar regions and finer details.
  • ...and 8 more figures
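
The confidence weighting described in the captions of Figures 1 and 3 can be sketched roughly as follows: a per-pixel weight is computed from the disagreement between rendered and pseudo normals, and that weight scales the loss between D-Normals and pseudo normals. The paper's actual weighting equation is not reproduced on this page, so the exponential form, the `gamma` parameter, the cosine-based loss, and all function names below are hypothetical choices made only for illustration.

```python
import numpy as np

def cosine(a, b):
    """Per-pixel cosine similarity of two (..., 3) unit-normal maps."""
    return np.sum(a * b, axis=-1)

def confidence_map(rendered_n, pseudo_n, gamma=5.0):
    """Hypothetical confidence: 1 where the normals agree, decaying as they diverge."""
    return np.exp(-gamma * (1.0 - cosine(rendered_n, pseudo_n)))

def weighted_dnormal_loss(d_normal, rendered_n, pseudo_n, gamma=5.0):
    """Confidence-weighted cosine loss between D-Normals and pseudo normals (assumed form)."""
    w = confidence_map(rendered_n, pseudo_n, gamma)
    per_pixel = 1.0 - cosine(d_normal, pseudo_n)
    return float(np.mean(w * per_pixel))

# Usage: pixels whose rendered normal contradicts the pseudo normal contribute little.
z = np.zeros((2, 2, 3)); z[..., 2] = 1.0   # normals along +z
x = np.zeros((2, 2, 3)); x[..., 0] = 1.0   # normals along +x
print(weighted_dnormal_loss(x, z, z))      # full weight, D-Normal is 90 degrees off
print(weighted_dnormal_loss(x, -z, z))     # rendered normal disagrees, loss is down-weighted
```

In a differentiable framework, the weight would typically be treated as a constant so that the loss cannot be reduced by simply lowering the confidence; this sketch does not model gradients at all.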