HoloGS: Instant Depth-based 3D Gaussian Splatting with Microsoft HoloLens 2
Miriam Jäger, Theodor Kapler, Michael Feßenbecker, Felix Birkelbach, Markus Hillemann, Boris Jutzi
TL;DR
The paper introduces HoloGS, a workflow that enables instant 3D Gaussian Splatting using Microsoft HoloLens 2 data (RGB images, camera poses, and depth-derived point clouds) without traditional Structure from Motion (SfM) preprocessing. It compares two initialization paths, external SfM data and internal HoloLens data, by training 3D Gaussians and densifying a point cloud, and evaluates both via PSNR and Chamfer Distance on two self-captured scenes. Results show that SfM initialization yields higher PSNR (e.g., Denker ~27.5, Ficus ~26.2) and better geometric accuracy (Chamfer ~0.02–0.05) than the on-device depth-based input (PSNR ~20.2–20.6, Chamfer ~0.30–0.60), with HoloLens reconstructions exhibiting artifacts and noise. The work demonstrates the feasibility of on-device, instant 3D reconstruction with potential for real-time mobile mapping, and outlines concrete avenues for improvement such as pose optimization during training and enhanced point-cloud extraction.
Abstract
In the fields of photogrammetry, computer vision and computer graphics, the task of neural 3D scene reconstruction has led to the exploration of various techniques. Among these, 3D Gaussian Splatting stands out for its explicit representation of scenes using 3D Gaussians, making it appealing for tasks like 3D point cloud extraction and surface reconstruction. Motivated by its potential, we address the domain of 3D scene reconstruction, aiming to leverage the capabilities of the Microsoft HoloLens 2 for instant 3D Gaussian Splatting. We present HoloGS, a novel workflow utilizing HoloLens sensor data, which bypasses the need for pre-processing steps like Structure from Motion by instantly accessing the required input data, i.e., the images, camera poses and the point cloud from depth sensing. We provide comprehensive investigations of the training process, of the rendering quality, assessed through the Peak Signal-to-Noise Ratio (PSNR), and of the geometric 3D accuracy of the densified point cloud from Gaussian centers, measured by Chamfer Distance. We evaluate our approach on two self-captured scenes: an outdoor scene of a cultural heritage statue and an indoor scene of a fine-structured plant. Our results show that the HoloLens data, including RGB images, corresponding camera poses, and depth-sensing-based point clouds to initialize the Gaussians, are suitable as input for 3D Gaussian Splatting.
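For readers unfamiliar with the two evaluation metrics named above, the following is a minimal sketch in plain Python. It is not the authors' evaluation code. The function names `psnr` and `chamfer_distance` are hypothetical, pixel intensities are assumed to lie in [0, 1], and the Chamfer Distance is assumed to be the symmetric sum of mean nearest-neighbor Euclidean distances; other conventions (for example squared distances, or averaging the two directions) exist, and the abstract does not say which one is used.

```python
import math


def psnr(reference, rendered, max_value=1.0):
    """Peak Signal-to-Noise Ratio in dB between two flat sequences of
    pixel intensities (assumption: values lie in [0, max_value])."""
    if len(reference) != len(rendered) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def _mean_nearest_distance(source, target):
    """Mean Euclidean distance from each source point to its nearest
    target point (brute force, O(len(source) * len(target)))."""
    return sum(min(math.dist(p, q) for q in target) for p in source) / len(source)


def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer Distance between two point sets.
    Assumption: sum of both directed mean nearest-neighbor distances."""
    if not points_a or not points_b:
        raise ValueError("both point sets must be non-empty")
    return (_mean_nearest_distance(points_a, points_b)
            + _mean_nearest_distance(points_b, points_a))


if __name__ == "__main__":
    # Toy inputs for illustration only; not data from the paper.
    print(psnr([0.0, 0.5, 1.0], [0.1, 0.5, 0.9]))
    print(chamfer_distance([(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)]))
```

Higher PSNR indicates renderings closer to the reference images, while lower Chamfer Distance indicates a point cloud closer to the reference geometry. The brute-force nearest-neighbor search is only practical for small point sets; a spatial index such as a k-d tree would be the usual choice for dense clouds.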
