
Pseudo-View Enhancement via Confidence Fusion for Unposed Sparse-View Reconstruction

Beizhen Zhao, Sicheng Yu, Guanzhi Ding, Yu Hu, Hao Wang

TL;DR

A bidirectional pseudo-frame restoration method that recovers missing content through diffusion-based synthesis guided by adjacent frames, supported by a lightweight pseudo-view deblur model and a confidence mask inference algorithm, paired with a scene perception Gaussian management strategy that optimizes Gaussians using joint depth-density information.

Abstract

3D scene reconstruction from unposed sparse viewpoints is a highly challenging yet practically important problem, especially in outdoor scenes, where lighting is complex and scale varies widely. With extremely limited input views, directly using a diffusion model to synthesize pseudo frames introduces implausible geometry, which harms the final reconstruction quality. To address these issues, we propose a novel framework for sparse-view outdoor reconstruction that achieves high-quality results through bidirectional pseudo-frame restoration and scene perception Gaussian management. Specifically, we introduce a bidirectional pseudo-frame restoration method that recovers missing content through diffusion-based synthesis guided by adjacent frames, together with a lightweight pseudo-view deblur model and a confidence mask inference algorithm. We then propose a scene perception Gaussian management strategy that optimizes Gaussians based on joint depth-density information. These designs significantly enhance reconstruction completeness, suppress floating artifacts, and improve overall geometric consistency under extreme view sparsity. Experiments on outdoor benchmarks demonstrate substantial gains over existing methods in both fidelity and stability.
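The general idea of confidence-guided fusion can be illustrated with a small sketch. This is not the paper's algorithm: the function names `fuse_with_confidence` and `masked_l1`, the per-pixel score inputs, the thresholding rule, and the plain-average fallback are all assumptions made for illustration. The sketch blends two candidate pseudo frames per pixel in proportion to their scores, derives a binary confidence mask, and uses that mask to restrict a photometric loss to trusted pixels.

```python
from typing import List, Tuple

Image = List[List[float]]  # grayscale image as rows of floats in [0, 1]


def fuse_with_confidence(img_a: Image, img_b: Image,
                         score_a: Image, score_b: Image,
                         threshold: float = 0.5,
                         eps: float = 1e-8) -> Tuple[Image, Image]:
    """Blend two candidate pseudo frames per pixel, weighted by their scores.

    Hypothetical helper: returns (fused_image, confidence_mask). The mask is
    1.0 where the stronger of the two scores reaches `threshold`, else 0.0.
    """
    fused: Image = []
    mask: Image = []
    for row_a, row_b, s_row_a, s_row_b in zip(img_a, img_b, score_a, score_b):
        fused_row, mask_row = [], []
        for a, b, sa, sb in zip(row_a, row_b, s_row_a, s_row_b):
            total = sa + sb
            if total < eps:
                # Neither candidate is supported here: fall back to a plain average.
                value = 0.5 * (a + b)
            else:
                value = (sa * a + sb * b) / total
            fused_row.append(value)
            mask_row.append(1.0 if max(sa, sb) >= threshold else 0.0)
        fused.append(fused_row)
        mask.append(mask_row)
    return fused, mask


def masked_l1(rendered: Image, target: Image, mask: Image) -> float:
    """Mean absolute error over pixels where the mask is on (hypothetical loss)."""
    num = 0.0
    den = 0.0
    for r_row, t_row, m_row in zip(rendered, target, mask):
        for r, t, m in zip(r_row, t_row, m_row):
            num += m * abs(r - t)
            den += m
    return num / den if den > 0 else 0.0
```

In a setup like this, low-confidence pixels contribute nothing to the loss, so hallucinated content in a pseudo frame cannot pull the reconstruction toward implausible geometry. How the paper actually computes its scores and mask is described in its method section, not here.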

Paper Structure

This paper contains 32 sections, 20 equations, 13 figures, 6 tables.

Figures (13)

  • Figure 1: Overview of our pipeline for 3D Gaussian Splatting from unposed sparse views in outdoor scenes. We use a cross-view inconsistency UNet and a diffusion model for pseudo-view completion. With a confidence mask and scene perception Gaussian optimization, our method outperforms others in reconstruction quality.
  • Figure 2: Framework of BRPO. Starting from the sparse input views, we use cross-view information to select and restore frames. We design a lightweight pseudo-view deblur UNet to denoise the Gaussian-rendered image and use a diffusion-based model for image completion. We then combine the two restored images by computing a reprojection overlap score and perform feature mapping to generate a confidence mask that guides the joint optimization.
  • Figure 3: Visual quality on the KITTI dataset. Our approach consistently outperforms other models across scenes, demonstrating advantages in challenging scenarios. Best viewed in color.
  • Figure 4: Visual quality on the DL3DV dataset. Our approach consistently outperforms other models across scenes, demonstrating advantages in challenging scenarios. Best viewed in color.
  • Figure 5: Pose estimation visualization on the KITTI and DL3DV datasets. Our method achieves more accurate pose estimation than other methods.
  • ...and 8 more figures