Zero-P-to-3: Zero-Shot Partial-View Images to 3D Object

Yuxuan Lin; Ruihang Chu; Zhenyu Chen; Xiao Tang; Lei Ke; Haoling Li; Yingji Zhong; Zhihao Li; Shiyong Liu; Xiaofei Wu; Jianzhuang Liu; Yujiu Yang

TL;DR

Zero-P-to-3 tackles 3D reconstruction from partial observations: it starts from a coarse 3D Gaussian Splatting (3DGS) model and refines it with a training-free framework that combines multi-view diffusion priors with a geometric prior and an image-restoration prior. It introduces an iterative rotated-view refinement strategy and supervises the Gaussian parameters with a composite loss $L_{\text{total}} = L_{\text{rec}} + \lambda L_{\text{LPIPS}}$. A diffusion-based multi-prior fusion produces the noise estimate $\varepsilon_t = \varepsilon_{\text{MVD}} + w_{\text{HF}}(t) \varepsilon_{\text{HF}}^t + w_{\text{LF}}(t) \varepsilon_{\text{LF}}^t$, where $w_{\text{HF}}(t)$ and $w_{\text{LF}}(t)$ are time-dependent weights. Sampling follows DDIM with the update $x_{t-1} = x_t + (\sqrt{\alpha_{t-1}} - \sqrt{\alpha_t}) \frac{\varepsilon_t}{\sqrt{1-\alpha_t}}$, which enables coherent novel-view synthesis even in unseen regions. Experiments on the synthetic Objaverse dataset and the real-world RefNeRF dataset show higher reconstruction fidelity and multi-view consistency than prior methods, particularly in invisible regions.
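The fusion rule and the sampling update quoted above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, the linear weight schedule is an assumption (the summary states only that the weights are time-dependent), and the update is transcribed literally from the formula as written in the summary. The arrays `eps_hf` and `eps_lf` stand for the two additional prior terms $\varepsilon_{\text{HF}}^t$ and $\varepsilon_{\text{LF}}^t$; how they are computed is not specified here.

```python
import numpy as np


def linear_weight(t, num_steps, w_max):
    """ASSUMED schedule (not from the source): weight rises linearly
    from 0 at t = 0 to w_max at t = num_steps."""
    return w_max * t / num_steps


def fuse_noise(eps_mvd, eps_hf, eps_lf, w_hf, w_lf):
    """eps_t = eps_MVD + w_HF(t) * eps_HF^t + w_LF(t) * eps_LF^t."""
    return eps_mvd + w_hf * eps_hf + w_lf * eps_lf


def update_step(x_t, eps_t, alpha_t, alpha_prev):
    """Literal transcription of the update quoted in the summary:
    x_{t-1} = x_t + (sqrt(alpha_{t-1}) - sqrt(alpha_t)) * eps_t / sqrt(1 - alpha_t).
    """
    scale = (np.sqrt(alpha_prev) - np.sqrt(alpha_t)) / np.sqrt(1.0 - alpha_t)
    return x_t + scale * eps_t


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shape = (4, 8, 8)                      # hypothetical latent shape
    x_t = rng.standard_normal(shape)
    eps_mvd = rng.standard_normal(shape)   # stand-in for the multi-view diffusion output
    eps_hf = rng.standard_normal(shape)    # stand-in for the HF prior term
    eps_lf = rng.standard_normal(shape)    # stand-in for the LF prior term

    t, num_steps = 30, 50
    w_hf = linear_weight(t, num_steps, w_max=0.5)   # hypothetical magnitudes
    w_lf = linear_weight(t, num_steps, w_max=0.3)

    eps_t = fuse_noise(eps_mvd, eps_hf, eps_lf, w_hf, w_lf)
    x_prev = update_step(x_t, eps_t, alpha_t=0.36, alpha_prev=0.81)
    print(x_prev.shape)
```

Setting both weights to zero reduces the fused estimate to the multi-view diffusion term alone, which makes it straightforward to compare sampling with and without the extra priors.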

Abstract

Generative 3D reconstruction shows strong potential under incomplete observations. While sparse-view and single-image reconstruction are well researched, reconstruction from partial observations remains underexplored. In this setting, dense views are available only within a specific angular range, and all other perspectives are inaccessible. The task presents two main challenges: (i) limited view range: observations confined to a narrow angular scope defeat traditional interpolation techniques, which require evenly distributed perspectives; (ii) inconsistent generation: views generated for invisible regions often lack coherence with the visible regions and with each other, which compromises reconstruction consistency. To address these challenges, we propose Zero-P-to-3, a novel training-free approach that integrates the local dense observations and multi-source priors for reconstruction. Our method introduces a fusion-based strategy that aligns these priors during DDIM sampling, thereby generating multi-view consistent images to supervise the invisible views. We further design an iterative refinement strategy that uses the geometric structure of the object to enhance reconstruction quality. Extensive experiments on multiple datasets show that our method outperforms state-of-the-art methods, especially in invisible regions.
