ViFu: Multiple 360$^\circ$ Objects Reconstruction with Clean Background via Visible Part Fusion

Tianhan Xu, Takuya Ikeda, Koichi Nishiwaki

TL;DR

This paper proposes a method to segment and recover a static, clean background and multiple 360$^\circ$ objects from observations of scenes at different timestamps. It decomposes the multi-scene fusion task into two main components: (1) object/background segmentation and alignment, and (2) radiance field fusion.

Abstract

In this paper, we propose a method to segment and recover a static, clean background and multiple 360$^\circ$ objects from observations of scenes at different timestamps. Recent works have used neural radiance fields to model 3D scenes and improved the quality of novel view synthesis, but few studies have focused on modeling the parts that are invisible or occluded in the training images. These under-reconstructed parts constrain both scene editing and rendering view selection, thereby limiting their utility for synthetic data generation for downstream tasks. Our basic idea is that, when the same set of objects is observed in various arrangements, parts that are invisible in one scene may become visible in others. By fusing the visible parts from each scene, occlusion-free rendering of both background and foreground objects can be achieved. We decompose the multi-scene fusion task into two main components: (1) object/background segmentation and alignment, where we leverage point cloud-based methods tailored to our novel problem formulation; (2) radiance field fusion, where we introduce a visibility field to quantify the visible information of radiance fields, and propose visibility-aware rendering for the fusion of a series of scenes, ultimately obtaining a clean background and 360$^\circ$ object rendering. Comprehensive experiments were conducted on synthetic and real datasets, and the results demonstrate the effectiveness of our method.
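The fusion idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that the scenes are already aligned, that each scene supplies a density, a color, and a scalar visibility score at every sample point along a ray, and that fusion simply keeps the sample from the scene with the highest visibility before applying standard volume-rendering quadrature. The function names `fuse_by_visibility` and `render_ray` and the input layout are hypothetical.

```python
import math


def fuse_by_visibility(sigmas, colors, visibilities):
    """Per sample point, keep density and color from the most visible scene.

    sigmas[k][i]       density of scene k at ray sample i
    colors[k][i]       RGB tuple of scene k at ray sample i
    visibilities[k][i] assumed visibility score of scene k at sample i
    """
    num_scenes = len(sigmas)
    num_samples = len(sigmas[0])
    fused_sigma, fused_color = [], []
    for i in range(num_samples):
        # Index of the scene that observed this point best (assumed rule).
        best = max(range(num_scenes), key=lambda k: visibilities[k][i])
        fused_sigma.append(sigmas[best][i])
        fused_color.append(colors[best][i])
    return fused_sigma, fused_color


def render_ray(sigma, color, deltas):
    """NeRF-style volume rendering quadrature along a single ray."""
    pixel = [0.0, 0.0, 0.0]
    transmittance = 1.0
    for s, c, d in zip(sigma, color, deltas):
        alpha = 1.0 - math.exp(-s * d)      # opacity of this segment
        weight = transmittance * alpha      # contribution to the pixel
        pixel = [p + weight * ch for p, ch in zip(pixel, c)]
        transmittance *= 1.0 - alpha        # light remaining after segment
    return pixel
```

With two scenes, a sample that is occluded (low visibility) in one scene is taken from the other, so the rendered ray reflects only the better-observed parts. A hard per-point maximum is only one possible choice; a soft, visibility-weighted blend would be another.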

Paper Structure (32 sections, 8 equations, 10 figures, 1 algorithm)


Figures (10)

  • Figure 1: An overview of our approach. (a,b) By capturing multi-view images of the scenes at different timestamps, ViFu recovers the appearance and 3D geometry of (d) clean static backgrounds and (e) multiple 360$^{\circ}$ foreground objects. (c) The NeRF representation supports free-view rendering of clean backgrounds and multiple foreground objects, including their rearrangement, thereby (f) facilitating dataset creation for downstream tasks.
  • Figure 2: The basic idea of ViFu. With pre-computed scene/object alignment, we compare the visibility of the corresponding parts using the proposed visibility field, and fuse the higher-visibility parts of each scene to form the clean background and multiple 360$^{\circ}$ objects. The details of visibility-aware rendering are shown in Fig. 3.
  • Figure 3: Illustration of visibility-aware rendering in 2D. The colors correspond to higher/lower visibility as shown in Fig. 2.
  • Figure 4: Results on Blender synthetic datasets. For pairwise comparisons of foreground objects, the top-left image shows the rendering result of the proposed method, while the bottom-right image shows the reference image (ground truth).
  • Figure 5: Results on real capture datasets. (c) and (d) are obtained using the proposed method.
  • ...and 5 more figures