Pano2Room: Novel View Synthesis from a Single Indoor Panorama

Guo Pu, Yiming Zhao, Zhouhui Lian

TL;DR

Pano2Room tackles single-panorama indoor scene reconstruction by converting a 360° image into an initial mesh and iteratively refining it with a panoramic RGBD inpainter to fill occluded regions. The method culminates in a 3D Gaussian Splatting field trained on pseudo novel views, enabling photorealistic novel-view synthesis with detailed geometry. Key contributions include a Pano2Mesh module for accurate panorama-to-mesh conversion, a panorama-specific RGBD inpainter built on Stable Diffusion fine-tuning, an efficient camera-search and geometry-conflict strategy, and a Mesh2GS pipeline that yields 3D-consistent, high-fidelity renderings. Extensive experiments on Replica and real-world panoramas demonstrate state-of-the-art performance in single-panorama indoor novel view synthesis, with notable improvements in texture, geometry, and occlusion handling.

Abstract

Recent single-view 3D generative methods have made significant advancements by leveraging knowledge distilled from extensive 3D object datasets. However, challenges persist in the synthesis of 3D scenes from a single view, primarily due to the complexity of real-world environments and the limited availability of high-quality prior resources. In this paper, we introduce a novel approach called Pano2Room, designed to automatically reconstruct high-quality 3D indoor scenes from a single panoramic image. These panoramic images can be easily generated using a panoramic RGBD inpainter from captures at a single location with any camera. The key idea is to initially construct a preliminary mesh from the input panorama, and iteratively refine this mesh using a panoramic RGBD inpainter while collecting photo-realistic 3D-consistent pseudo novel views. Finally, the refined mesh is converted into a 3D Gaussian Splatting field and trained with the collected pseudo novel views. This pipeline enables the reconstruction of real-world 3D scenes, even in the presence of large occlusions, and facilitates the synthesis of photo-realistic novel views with detailed geometry. Extensive qualitative and quantitative experiments have been conducted to validate the superiority of our method in single-panorama indoor novel view synthesis compared to the state-of-the-art. Our code and data are available at https://github.com/TrickyGo/Pano2Room.
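
As a rough illustration of the first step, building a preliminary mesh from the input panorama, the sketch below lifts an equirectangular depth map to 3D points that could serve as mesh vertices. It is not taken from the Pano2Room code: the function name `unproject_panorama`, the axis convention (y up, longitude across columns), and the reading of depth as Euclidean distance from the camera center are all assumptions made for this example.

```python
import numpy as np


def unproject_panorama(depth):
    """Lift an equirectangular depth map (H, W) to 3D points (H, W, 3).

    Assumed convention (hypothetical, not from the paper): longitude spans
    [-pi, pi) across columns, latitude spans +pi/2 to -pi/2 down the rows,
    y is up, and depth is the distance from the camera center.
    """
    h, w = depth.shape
    # Angles are sampled at pixel centers.
    lon = ((np.arange(w) + 0.5) / w - 0.5) * 2.0 * np.pi
    lat = (0.5 - (np.arange(h) + 0.5) / h) * np.pi
    lon, lat = np.meshgrid(lon, lat)
    # Unit viewing direction for every pixel.
    dirs = np.stack(
        [np.cos(lat) * np.sin(lon), np.sin(lat), np.cos(lat) * np.cos(lon)],
        axis=-1,
    )
    # Scale each direction by its depth to obtain the 3D point.
    return dirs * depth[..., None]


if __name__ == "__main__":
    points = unproject_panorama(np.full((4, 8), 2.0))
    print(points.shape)
```

With a constant depth, every point lies on a sphere of that radius around the camera, which is a quick sanity check for whichever convention is used.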

Paper Structure (24 sections, 11 equations, 8 figures, 1 table)

Figures (8)

  • Figure 1: An overview of Pano2Room. With a panorama as input, we first predict the geometry of the panorama using the panoramic RGBD inpainter. Then we synthesize the initial mesh using a Pano2Mesh module. Next, we iteratively search for cameras with the least view completeness, and under the searched viewpoint, we render the existing mesh to obtain panoramic RGBDs with missing areas. To complete each rendered RGBD, we use the panoramic RGBD inpainter to generate new textures and predict new geometries. The new textures/geometries are iteratively fused into the existing mesh if no geometry conflict is introduced. Finally, the inpainted mesh is converted to a 3DGS and trained with collected pseudo novel views.
  • Figure 2: Demonstration of how to convert a panorama to a mesh. Initial mesh vertices and colors are derived from the input panorama’s pixels (depicted as black dots). Triangulation connects neighboring vertices to form faces (depicted by black lines). The edge map of the depth is utilized to disconnect faces representing different objects, ensuring accurate mesh generation.
  • Figure 3: Panoramic RGBD Inpainter. We first inpaint the rendered panoramic image using SDFT. Then, the depth of the inpainted content is estimated by a pre-trained monocular depth predictor and seamlessly fused into the rendered panoramic depth, creating an inpainted panoramic depth with detailed new geometry that is aligned with existing geometry and has enforced surface normals.
  • Figure 4: SDFT: Fine-tuning Stable Diffusion on the input panorama to learn the styles and features.
  • Figure 5: The effectiveness of the proposed camera search strategy. The strategy with sequential camera trajectory leads to multiple inpainting steps performed in an occluded space and produces blurry inpainting results, while our method with searched viewpoints facilitates the generation of plausible new textures and geometry.
  • ...and 3 more figures
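
The panorama-to-mesh conversion described in the Figure 2 caption can be sketched as follows: every pixel becomes a vertex, neighboring vertices are connected into triangles, and faces that cross a depth discontinuity are dropped. This is a minimal sketch, not the authors' Pano2Mesh module. The function name `grid_faces`, the `rel_threshold` parameter, the horizontal wrap-around, and the relative depth spread used as a stand-in for the depth edge map are all assumptions made for illustration.

```python
import numpy as np


def grid_faces(depth, rel_threshold=0.1):
    """Triangulate a panorama pixel grid, dropping faces across depth edges.

    depth: (H, W) array of strictly positive depths, one per pixel/vertex.
    rel_threshold: hypothetical cutoff on the relative depth spread inside a
        face; it stands in for the depth edge map mentioned in the caption.
    Returns an (F, 3) integer array of vertex indices into the flattened grid.
    """
    h, w = depth.shape
    idx = np.arange(h * w).reshape(h, w)
    # Column j+1 with wrap-around, since a 360-degree panorama closes on itself
    # (an assumption of this sketch).
    right = np.roll(idx, -1, axis=1)
    tl, tr = idx[:-1, :], right[:-1, :]
    bl, br = idx[1:, :], right[1:, :]
    # Two triangles per pixel quad.
    faces = np.concatenate(
        [
            np.stack([tl, bl, tr], axis=-1).reshape(-1, 3),
            np.stack([tr, bl, br], axis=-1).reshape(-1, 3),
        ]
    )
    # Depth at each face corner; a large spread suggests the face would bridge
    # two different objects, so it is removed.
    d = depth.reshape(-1)[faces]
    spread = (d.max(axis=1) - d.min(axis=1)) / d.min(axis=1)
    return faces[spread < rel_threshold]


if __name__ == "__main__":
    depth = np.ones((3, 4))
    depth[:, 2:] = 5.0  # a depth step between columns 1 and 2
    print(len(grid_faces(np.ones((3, 4)))), len(grid_faces(depth)))
```

On a flat depth map every quad survives, while a depth step removes the triangles that would otherwise stretch between the near and far surfaces, leaving a hole for a later inpainting step to fill.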