MetaDecorator: Generating Immersive Virtual Tours through Multimodality

Shuang Xie, Yang Liu, Jeannie S. A. Lee, Haiwei Dong

TL;DR

MetaDecorator tackles the limitations of fixed 3D virtual tours by enabling personalized, text-guided decorating of 360° panoramas and subsequent geometry-aware 3D reconstruction. The method combines diffusion-based 2D decoration with ControlNet-guided geometry cues and a DP-NeRF pipeline to produce high-quality, render-efficient polygonal meshes suitable for VR and metaverse applications. A green AI enhancement via DP-NeRF accelerates training by leveraging a depth-informed occupancy grid and depth/RGB constraints, achieving ~10× faster training while maintaining competitive quality. The work also points to future directions involving LLM-driven user interaction and haptic textures to further enrich immersive experiences.

Abstract

MetaDecorator is a framework that empowers users to personalize virtual spaces. By leveraging text-driven prompts and image synthesis techniques, MetaDecorator adorns static panoramas captured by 360° imaging devices, transforming them into uniquely styled and visually appealing environments. This significantly enhances the realism and engagement of virtual tours compared to traditional offerings. Beyond the core framework, we also discuss the integration of Large Language Models (LLMs) and haptics in the VR application to provide a more immersive experience.

Paper Structure

This paper contains 9 sections, 3 figures, and 1 table.

Figures (3)

  • Figure 1: The MetaDecorator Framework operates in two primary stages: 1) Image Decoration, where panoramic decorated images are generated based on guiding prompts; and 2) 3D Reconstruction, which produces realistic 3D representations from the decorated images. The framework outputs polygonal meshes, optimizing render speed for edge computing and facilitating integration with VR applications.
  • Figure 2: Examples of Panoramic Image Decoration: a) Original Image, b-e) Decorated Images. Panels b) and d) showcase Japanese-style decorations, while panels c) and e) feature Disney-style decorations. For panels d) and e), additional reference style images are provided at the bottom right of each respective image.
  • Figure 3: The training pipeline of DP-NeRF, optimized for efficiency through the following steps: a) Establishing occupancy based on depth prior; b) Tracing and training the NeRF model exclusively for occupied grids; c) Constraining appearance and geometry by integrating RGB and depth loss.
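
The three DP-NeRF steps named in Figure 3 can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the cubic grid bounds, the L1 form of the depth term, and the `depth_weight` value are all assumptions chosen for illustration, and the summary above does not specify the actual loss or grid construction.

```python
import numpy as np


def voxel_index(points, grid_min, grid_max, resolution):
    """Map 3D points to integer voxel indices of a cubic grid (hypothetical helper)."""
    scale = resolution / (grid_max - grid_min)
    return np.floor((points - grid_min) * scale).astype(int)


def occupancy_from_depth(origins, directions, depths, grid_min, grid_max, resolution):
    """Step a: mark the voxels hit by surface points implied by a depth prior.

    origins, directions: (N, 3) ray origins and unit directions.
    depths: (N,) distance along each ray to the surface.
    """
    points = origins + directions * depths[:, None]
    idx = voxel_index(points, grid_min, grid_max, resolution)
    inside = np.all((idx >= 0) & (idx < resolution), axis=1)
    grid = np.zeros((resolution,) * 3, dtype=bool)
    kept = idx[inside]
    grid[kept[:, 0], kept[:, 1], kept[:, 2]] = True
    return grid


def occupied_mask(points, grid, grid_min, grid_max):
    """Step b: flag sample points that fall in occupied voxels, so that only
    those samples would be passed on to the NeRF model."""
    resolution = grid.shape[0]
    idx = voxel_index(points, grid_min, grid_max, resolution)
    inside = np.all((idx >= 0) & (idx < resolution), axis=1)
    mask = np.zeros(len(points), dtype=bool)
    kept = idx[inside]
    mask[inside] = grid[kept[:, 0], kept[:, 1], kept[:, 2]]
    return mask


def rgb_depth_loss(pred_rgb, gt_rgb, pred_depth, prior_depth, depth_weight=0.1):
    """Step c: combine an RGB term and a depth term.

    The MSE/L1 choice and depth_weight are assumptions, not values from the paper.
    """
    rgb_term = np.mean((pred_rgb - gt_rgb) ** 2)
    depth_term = np.mean(np.abs(pred_depth - prior_depth))
    return rgb_term + depth_weight * depth_term
```

In this sketch, skipping samples outside the occupied voxels is what would reduce the number of network evaluations per ray; the reported training speed-up depends on the authors' actual pipeline and is not reproduced here.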