HoloGS: Instant Depth-based 3D Gaussian Splatting with Microsoft HoloLens 2

Miriam Jäger, Theodor Kapler, Michael Feßenbecker, Felix Birkelbach, Markus Hillemann, Boris Jutzi

TL;DR

The paper introduces HoloGS, a workflow that enables instant 3D Gaussian Splatting using Microsoft HoloLens 2 data (RGB images, camera poses, and depth-derived point clouds) without traditional Structure from Motion preprocessing. It compares two initialization paths—external SfM data and internal HoloLens data—by training 3D Gaussians and densifying a point cloud, evaluated via PSNR and Chamfer Distance across two self-captured scenes. Results show that SfM initialization yields higher PSNR (e.g., Denker ~27.5, Ficus ~26.2) and better geometric accuracy (Chamfer ~0.02–0.05) than the on-device depth-based input (PSNR ~20.2–20.6, Chamfer ~0.30–0.60), with HoloLens reconstructions exhibiting artifacts and noise. The work demonstrates the feasibility of on-device, instant 3D reconstruction with potential for real-time mobile mapping, while outlining concrete avenues for improvement such as pose optimization during training and enhanced point-cloud extraction.

Abstract

In the fields of photogrammetry, computer vision and computer graphics, the task of neural 3D scene reconstruction has led to the exploration of various techniques. Among these, 3D Gaussian Splatting stands out for its explicit representation of scenes using 3D Gaussians, making it appealing for tasks like 3D point cloud extraction and surface reconstruction. Motivated by its potential, we address the domain of 3D scene reconstruction, aiming to leverage the capabilities of the Microsoft HoloLens 2 for instant 3D Gaussian Splatting. We present HoloGS, a novel workflow utilizing HoloLens sensor data, which bypasses the need for pre-processing steps like Structure from Motion by instantly accessing the required input data, i.e., the images, camera poses and the point cloud from depth sensing. We provide comprehensive investigations, including the training process and the rendering quality, assessed through the Peak Signal-to-Noise Ratio, and the geometric 3D accuracy of the densified point cloud from Gaussian centers, measured by Chamfer Distance. We evaluate our approach on two self-captured scenes: an outdoor scene of a cultural heritage statue and an indoor scene of a fine-structured plant. Our results show that the HoloLens data, including RGB images, corresponding camera poses, and depth-sensing-based point clouds to initialize the Gaussians, are suitable as input for 3D Gaussian Splatting.

Paper Structure

This paper contains 15 sections, 1 equation, 9 figures, 2 tables.

Figures (9)

  • Figure 1: Flowchart of HoloGS: Via the HoloLens 2 sensor streaming, the required data is directly extracted and processed during data capturing. From the depth data, a point cloud is instantly created which, together with the RGB images and corresponding camera poses, is then fed into 3D Gaussian Splatting.
  • Figure 2: (a) Data capturing with Microsoft HoloLens 2 and its streaming application HoloLens2_Streaming. (b) Point cloud based on depth data of the scene 'Denker' and camera poses visualized by colored coordinate frames.
  • Figure 3: Initialization input 3D point clouds. (a) Sparse point cloud from SfM and (b) point cloud calculated based on the depth images from HoloLens.
  • Figure 4: Comparison of the Peak Signal-to-Noise Ratio (PSNR) $\uparrow$ and loss $\downarrow$ during the training processes with 30 000 iterations of 3D Gaussian Splatting with different types of input data. Top (a): external SfM and internal HoloLens data on scene 'Denker'. Bottom (b): external SfM and internal HoloLens data on scene 'Ficus'. The red curves show the PSNR, the blue curves the training loss.
  • Figure 5: Rendered images. From left to right: (a) external SfM data and (b) internal HoloLens data on scene 'Denker', as well as (c) external SfM data and (d) internal HoloLens data on scene 'Ficus'.
  • ...and 4 more figures
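
The step described in the Figure 1 caption, creating a point cloud from depth data and camera poses, can be sketched as a back-projection. This is an illustration under stated assumptions, not the HoloGS implementation: it assumes an ideal pinhole model with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, depth values measured along the optical axis, and a 4x4 camera-to-world pose matrix. The actual HoloLens 2 depth sensor model may differ (for example, it may report range along each pixel ray rather than z-depth).

```python
import numpy as np


def backproject_depth(depth, fx, fy, cx, cy, cam_to_world=None):
    """Back-project a depth image to 3D points (pinhole model, assumed).

    depth        : (H, W) array of z-depth values; pixels <= 0 are treated as invalid.
    fx, fy       : focal lengths in pixels (hypothetical intrinsics).
    cx, cy       : principal point in pixels (hypothetical intrinsics).
    cam_to_world : optional 4x4 pose matrix mapping camera to world coordinates.
    Returns an (N, 3) array with one point per valid pixel, in row-major pixel order.
    """
    depth = np.asarray(depth, dtype=np.float64)
    h, w = depth.shape
    # u indexes columns, v indexes rows
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth > 0
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    points = np.stack([x, y, z], axis=1)
    if cam_to_world is not None:
        pose = np.asarray(cam_to_world, dtype=np.float64)
        # Rotate, then translate into the world frame
        points = points @ pose[:3, :3].T + pose[:3, 3]
    return points
```

Points from several frames, each transformed with its own pose, could then be concatenated into a single cloud of the kind used to initialize the Gaussians.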