LIVE-GS: LLM Powers Interactive VR by Enhancing Gaussian Splatting
Haotian Mao, Zhuoxiong Xu, Siyue Wei, Yule Quan, Nianchen Deng, Xubo Yang
TL;DR
This work tackles real-time, physically plausible interaction in radiance-field VR by combining LLM-based scene understanding with 3D Gaussian Splatting (3DGS). It treats a scene as a collection of Gaussian kernels $G_k$ with center $\mathbf{p}_k$ and covariance $\Sigma_k$, and prompts GPT-4o to extract object- and particle-level properties that drive a unified interpolation based on position-based dynamics (PBD) for rigid, soft, and granular dynamics. A GPT-assisted GS inpainting module fills unseen regions, and a feature-mask segmentation strategy precisely segments the kernels of interaction targets. Real-time VR demos show complex behaviors (e.g., a doll-like wolf, a breaking mug, bouncing balls) without manual annotation, coupling scene understanding, rendering, and physics in one system. The approach points toward scalable, language-guided VR asset creation with realistic interactivity.
Abstract
Recently, radiance field rendering methods such as 3D Gaussian Splatting (3DGS) have shown immense potential in VR content creation due to their high-quality rendering and efficient production process. However, existing physics-based interaction systems for 3DGS either perform only simple, unrealistic simulations or demand extensive user input for complex scenes, primarily because they lack scene understanding. In this paper, we propose LIVE-GS, a highly realistic interactive VR system powered by an LLM. After object-aware GS reconstruction, we prompt GPT-4o to analyze the physical properties of objects in the scene, which then guide physical simulations consistent with real phenomena. We also design a GPT-assisted GS inpainting module to fill the unseen areas covered by manipulable objects. To segment Gaussian kernels precisely, we propose a feature-mask segmentation strategy. To enable rich interaction, we further propose a computationally efficient physical simulation framework built on a unified PBD-based interpolation method, supporting various physical forms such as rigid bodies, soft bodies, and granular materials. Our experimental results show that, with the help of the LLM's understanding and enhancement of scenes, our VR system can support complex and realistic interactions without additional manual design and annotation.
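For readers unfamiliar with position-based dynamics (PBD), the simulation family named above, the following is a minimal sketch of one generic PBD step: predict positions, project constraints, then recover velocities. It is not the LIVE-GS implementation and does not reproduce the paper's unified interpolation. The function name `pbd_step`, its parameters, the distance constraint, and the ground plane are all illustrative assumptions. Per-particle values such as `inv_mass` are the kind of quantity that LLM-estimated physical properties could plausibly feed, which is likewise an assumption here.

```python
import numpy as np


def pbd_step(x, v, inv_mass, edges, rest_len, dt=1.0 / 60.0,
             gravity=(0.0, -9.8, 0.0), iterations=10, ground_y=0.0):
    """One generic PBD step (hypothetical helper, not the paper's solver).

    x, v      : (N, 3) particle positions and velocities
    inv_mass  : (N,) inverse masses; 0 marks a pinned particle
    edges     : list of (i, j) index pairs joined by distance constraints
    rest_len  : rest length of each edge
    """
    g = np.asarray(gravity, dtype=float)
    movable = inv_mass > 0.0

    # 1. Predict positions from external forces; pinned particles stay put.
    v = (v + dt * g) * movable[:, None]
    p = x + dt * v

    # 2. Iteratively project constraints onto the predicted positions.
    for _ in range(iterations):
        for (i, j), d0 in zip(edges, rest_len):
            delta = p[i] - p[j]
            dist = np.linalg.norm(delta)
            w = inv_mass[i] + inv_mass[j]
            if dist < 1e-9 or w == 0.0:
                continue
            # Move both endpoints along the edge, weighted by inverse mass.
            corr = (dist - d0) / (dist * w) * delta
            p[i] -= inv_mass[i] * corr
            p[j] += inv_mass[j] * corr
        # Simple ground-plane collision for movable particles (assumed y-up).
        p[movable, 1] = np.maximum(p[movable, 1], ground_y)

    # 3. Recover velocities from the corrected positions.
    return p, (p - x) / dt


# Usage: one pinned particle and one free particle joined by a unit-length edge.
x = np.array([[0.0, 2.0, 0.0], [1.0, 2.0, 0.0]])
v = np.zeros_like(x)
inv_mass = np.array([0.0, 1.0])
for _ in range(10):
    x, v = pbd_step(x, v, inv_mass, edges=[(0, 1)], rest_len=[1.0])
```

In this toy setup the free particle swings like a pendulum around the pinned one while the edge keeps its rest length. Because PBD corrects positions directly rather than integrating forces, it stays stable at interactive frame rates, which is the usual reason for choosing it in real-time settings.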
