Uncertainty-guided Optimal Transport in Depth Supervised Sparse-View 3D Gaussian

Wei Sun, Qi Zhang, Yanzhao Zhou, Qixiang Ye, Jianbin Jiao, Yuan Li

TL;DR

This work introduces a novel method to supervise the depth distribution of 3D Gaussians, utilizing depth priors with integrated uncertainty estimates, and integrates a patch-wise optimal transport strategy to complement traditional L2 loss in depth supervision.

Abstract

3D Gaussian splatting has demonstrated impressive performance in real-time novel view synthesis. However, achieving successful reconstruction from RGB images generally requires multiple input views captured under static conditions. To address the challenge of sparse input views, previous approaches have incorporated depth supervision into the training of 3D Gaussians to mitigate overfitting, using dense predictions from pretrained depth networks as pseudo-ground truth. Nevertheless, depth predictions from monocular depth estimation models inherently exhibit significant uncertainty in specific areas. Relying solely on pixel-wise L2 loss may inadvertently incorporate detrimental noise from these uncertain areas. In this work, we introduce a novel method to supervise the depth distribution of 3D Gaussians, utilizing depth priors with integrated uncertainty estimates. To address these localized errors in depth predictions, we integrate a patch-wise optimal transport strategy to complement traditional L2 loss in depth supervision. Extensive experiments conducted on the LLFF, DTU, and Blender datasets demonstrate that our approach, UGOT, achieves superior novel view synthesis and consistently outperforms state-of-the-art methods.
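The combination described in the abstract, a pixel-wise L2 term tempered by uncertainty plus a patch-wise optimal transport term, can be illustrated with a small NumPy sketch. This is not the paper's formulation: the weighting `1 / (1 + uncertainty)`, the function names, the patch size, and the use of a closed-form 1D Wasserstein-1 distance (mean absolute difference of sorted values, valid for equal-size, uniformly weighted samples) are all assumptions chosen for illustration.

```python
import numpy as np


def uncertainty_weighted_l2(rendered, prior, uncertainty):
    """Pixel-wise squared error, down-weighted where the prior is uncertain.

    The weighting 1 / (1 + uncertainty) is a hypothetical choice,
    not the scheme used in the paper.
    """
    weights = 1.0 / (1.0 + uncertainty)
    return float(np.mean(weights * (rendered - prior) ** 2))


def patch_ot_loss(rendered, prior, patch=4):
    """Mean 1D Wasserstein-1 distance between matching depth patches.

    Each patch is treated as an unordered set of depth values, so the
    loss compares depth distributions rather than individual pixels.
    """
    h, w = rendered.shape
    # Crop so that both dimensions are divisible by the patch size.
    h, w = h - h % patch, w - w % patch

    def to_patches(x):
        x = x[:h, :w].reshape(h // patch, patch, w // patch, patch)
        # Result shape: (num_patches, patch * patch)
        return x.transpose(0, 2, 1, 3).reshape(-1, patch * patch)

    # For equal-size 1D samples with uniform weights, the optimal
    # transport plan matches sorted values to sorted values.
    r = np.sort(to_patches(rendered), axis=1)
    p = np.sort(to_patches(prior), axis=1)
    return float(np.mean(np.abs(r - p)))


prior = np.arange(16, dtype=float).reshape(4, 4)
flipped = prior[::-1, ::-1]  # same depth values, different pixel layout
l2 = uncertainty_weighted_l2(flipped, prior, np.zeros((4, 4)))
ot = patch_ot_loss(flipped, prior)
```

In this toy case the flipped patch holds the same depth values as the prior, so the patch-wise term is zero while the pixel-wise term is not. That is the sense in which a distribution-level loss can tolerate localized pixel errors that an L2 loss would penalize.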

Paper Structure (26 sections, 16 equations, 4 figures, 5 tables)

Figures (4)

  • Figure 1: The comparison with the previous method.
  • Figure 2: Overview of our method design.
  • Figure 3: Qualitative results on DTU and LLFF with 3 input views. From left to right: the original image, the uncertainty map, the depth prediction, 3DGS, DNGaussian, and ours.
  • Figure 4: Qualitative results on Blender with 8 input views. From left to right: the original image, FreeNeRF, and ours.