Capture and Interact: Rapid 3D Object Acquisition and Rendering with Gaussian Splatting in Unity

Islomjon Shukhratov, Sergey Gorinsky

TL;DR

This work presents a cloud-based 3D Gaussian Splatting (3D GS) pipeline that converts smartphone video into a photorealistic, interactive 3D representation rendered in Unity. By combining camera poses derived from structure-from-motion (SfM) with a 3D GS model and a GPU-accelerated splatting renderer, the pipeline achieves rapid end-to-end processing (~10 minutes) and real-time client-side visualization (~150 fps) on commodity hardware. The method demonstrates photorealistic reconstruction of both simple and complex objects and supports telepresence-style interaction for XR and collaboration. Offloading heavy computation to the cloud while keeping rendering responsive on the client lowers the barrier to high-quality 3D content creation, with potential applications in education, museums, and industry prototyping.

Abstract

Capturing and rendering three-dimensional (3D) objects in real time remain a significant challenge, yet hold substantial potential for applications in augmented reality, digital twin systems, remote collaboration and prototyping. We present an end-to-end pipeline that leverages 3D Gaussian Splatting (3D GS) to enable rapid acquisition and interactive rendering of real-world objects using a mobile device, cloud processing and a local computer. Users record a smartphone video of an object, upload it for automated 3D reconstruction, and visualize the result interactively in Unity at an average of 150 frames per second (fps) on a laptop. The system integrates mobile capture, cloud-based 3D GS and Unity rendering to support real-time telepresence. Our experiments show that the pipeline processes scans in approximately 10 minutes on a graphics processing unit (GPU) and achieves real-time rendering on the laptop.

Paper Structure

This paper contains 5 sections and 2 figures.

Figures (2)

  • Figure 1: Web interface.
  • Figure 2: Examples of rendered objects.