LayerPano3D: Layered 3D Panorama for Hyper-Immersive Scene Generation

Shuai Yang, Jing Tan, Mengchen Zhang, Tong Wu, Yixuan Li, Gordon Wetzstein, Ziwei Liu, Dahua Lin

TL;DR

LayerPano3D tackles the challenge of generating full-view, explorable 360° panoramas with consistent geometry by introducing a Layered 3D Panorama representation and lifting it into 3D Gaussians for large-scale navigation. It builds a high-quality upright panorama dataset, Upright360, and trains a panorama-tuned Flux model with LoRA to prevent semantic drift. The method decomposes reference panoramas into depth-based layers, completes unseen regions per layer, and aligns depths before converting to 3D Gaussians for interactive exploration, guided by a Gaussian selector to manage occlusion. Experimental results show state-of-the-art performance in both 2D panorama fidelity and 3D panoramic reconstruction, along with robust off-center rendering and efficient optimization, highlighting practical potential for immersive 3D scene creation from text prompts.
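
To make the idea of depth-based layers concrete, the following is a minimal sketch, not the authors' implementation. It assumes the reference panorama already comes with a per-pixel depth map, and it assumes layers are chosen as equal-population depth bins (quantiles); the summary above does not say how layer boundaries are actually selected. The function name `split_into_depth_layers` and its parameters are hypothetical.

```python
import numpy as np


def split_into_depth_layers(depth, num_layers=3):
    """Assign each panorama pixel to a depth layer (0 = nearest).

    Assumption: layers are equal-population depth bins (quantiles),
    chosen here purely for illustration.
    """
    # Interior quantile levels, e.g. 1/3 and 2/3 for three layers.
    levels = np.linspace(0.0, 1.0, num_layers + 1)[1:-1]
    edges = np.quantile(depth, levels)
    # np.digitize returns, per pixel, the index of the bin its depth falls in.
    layer_ids = np.digitize(depth, edges)
    # For layer k, every pixel owned by a nearer layer hides content that
    # would have to be completed (e.g. by inpainting) in that layer.
    completion_masks = [layer_ids < k for k in range(num_layers)]
    return layer_ids, completion_masks


if __name__ == "__main__":
    toy_depth = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
    ids, masks = split_into_depth_layers(toy_depth, num_layers=3)
    print(ids)
    print(masks[2])
```

In this sketch the nearest layer needs no completion, while the farthest layer must be completed wherever any nearer layer occludes it.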

Abstract

3D immersive scene generation is a challenging yet critical task in computer vision and graphics. A desired virtual 3D scene should 1) exhibit omnidirectional view consistency, and 2) allow for free exploration in complex scene hierarchies. Existing methods either rely on successive scene expansion via inpainting or employ a panorama representation for large-FOV scene environments. However, the generated scenes suffer from semantic drift during expansion and cannot handle occlusion among scene hierarchies. To tackle these challenges, we introduce LayerPano3D, a novel framework for full-view, explorable panoramic 3D scene generation from a single text prompt. Our key insight is to decompose a reference 2D panorama into multiple layers at different depth levels, where each layer reveals the unseen space from the reference views via a diffusion prior. LayerPano3D comprises multiple dedicated designs: 1) We introduce a new panorama dataset, Upright360, comprising 9k high-quality, upright panorama images, and finetune the advanced Flux model on Upright360 for high-quality, upright and consistent panorama generation. 2) We pioneer the Layered 3D Panorama as the underlying representation to manage complex scene hierarchies and lift it into 3D Gaussians to splat detailed 360-degree omnidirectional scenes with unconstrained viewing paths. Extensive experiments demonstrate that our framework generates state-of-the-art 3D panoramic scenes in both full-view consistency and immersive exploratory experience. We believe that LayerPano3D holds promise for advancing 3D panoramic scene creation with numerous applications.
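
As a rough illustration of what lifting a panorama into 3D involves, the sketch below unprojects an equirectangular depth map into a point cloud, which could serve as initial centers for 3D Gaussians. It is a sketch under stated assumptions rather than the paper's procedure: the axis convention (y up, longitude increasing left to right, latitude decreasing top to bottom), the reading of depth as distance along each viewing ray, and the function name `equirect_to_points` are all assumptions made here.

```python
import numpy as np


def equirect_to_points(depth):
    """Unproject an (H, W) equirectangular depth map to an (H*W, 3) array.

    Assumed convention: depth is the distance along each viewing ray,
    longitude spans [-pi, pi) left to right, latitude spans
    (pi/2, -pi/2) top to bottom, and the y axis points up.
    """
    h, w = depth.shape
    # Angles at pixel centers.
    lon = ((np.arange(w) + 0.5) / w - 0.5) * 2.0 * np.pi
    lat = (0.5 - (np.arange(h) + 0.5) / h) * np.pi
    lon, lat = np.meshgrid(lon, lat)  # both have shape (h, w)
    # Unit ray direction for every pixel on the sphere.
    dirs = np.stack(
        [np.cos(lat) * np.sin(lon), np.sin(lat), np.cos(lat) * np.cos(lon)],
        axis=-1,
    )
    # Scale each ray by its depth and flatten to a point list.
    return (dirs * depth[..., None]).reshape(-1, 3)


if __name__ == "__main__":
    points = equirect_to_points(np.full((4, 8), 2.0))
    print(points.shape)
```

Applying this per layer would give one point set per depth level; how those sets are aligned and optimized into Gaussians is specific to the method and is not reproduced here.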

Paper Structure (32 sections, 2 equations, 10 figures, 4 tables)

This paper contains 32 sections, 2 equations, 10 figures, 4 tables.

Figures (10)

  • Figure 1: Pipeline Overview of LayerPano3D. Our framework consists of two stages, namely multi-layer panorama construction and panoramic 3D scene optimization. LayerPano3D provides an automatic generation pipeline that requires no manual effort to design scene-specific navigation paths for expansion or completion.
  • Figure 2: Illustration of the Gaussian Selector. Given the new asset point cloud, the Gaussian Selector identifies the active Gaussians for the next layer's optimization.
  • Figure 3: Qualitative comparisons in full 360°×180° Scene. We compare the panorama and multiple views of the scene generated by four methods. LayerPano3D exhibits consistent and rich details across full $360^\circ \times 180^\circ$ coverage, while other methods show obvious inconsistencies and disorganized patterns in regions that deviate from the input view.
  • Figure 4: Qualitative comparisons in Large-range Scene Exploration. We show the novel view renderings along a zigzag trajectory to compare the capability of large-range scene exploration. Our method is able to maintain high-quality content rendering and does not show distortion or gaps in unseen space, which shows the ability of LayerPano3D to create hyper-immersive panoramic scenes.
  • Figure 5: Qualitative comparisons in Panorama Generation. LayerPano3D demonstrates superior capability in generating high-quality outputs with precise alignment to the text prompt, outperforming other methods in fidelity and input adherence.
  • ...and 5 more figures