ToF-Splatting: Dense SLAM using Sparse Time-of-Flight Depth and Multi-Frame Integration

Andrea Conti, Matteo Poggi, Valerio Cambareri, Martin R. Oswald, Stefano Mattoccia

TL;DR

ToF-Splatting tackles dense SLAM with sparse ToF sensors by integrating sparse depth, multi-view geometry, and monocular cues within a 3D Gaussian Splatting representation. The approach combines a tracking frontend, a multi-frame depth predictor, and a mapping backend to densify depth maps and build a coherent 3D model, achieving state-of-the-art tracking and mapping on real sparse-ToF datasets. It demonstrates robustness to depth sparsity and noise through multi-frame fusion and learned cues, though runtime remains a challenge for real-time deployment. Overall, the work enables dense SLAM on low-power ToF sensors and paves the way for efficient, high-quality scene reconstruction in mobile and AR/VR contexts.

Abstract

Time-of-Flight (ToF) sensors provide efficient active depth sensing at relatively low power budgets; among such designs, only very sparse measurements from low-resolution sensors are considered to meet the increasingly limited power constraints of mobile and AR/VR devices. However, such extreme sparsity levels limit the seamless usage of ToF depth in SLAM. In this work, we propose ToF-Splatting, the first 3D Gaussian Splatting-based SLAM pipeline tailored to use very sparse ToF input data effectively. Our approach improves upon the state of the art by introducing a multi-frame integration module, which produces dense depth maps by merging cues from extremely sparse ToF depth, monocular color, and multi-view geometry. Extensive experiments on both synthetic and real sparse ToF datasets demonstrate the viability of our approach, as it achieves state-of-the-art tracking and mapping performance on reference datasets.

Paper Structure

This paper contains 16 sections, 8 equations, 10 figures, 5 tables.

Figures (10)

  • Figure 1: Overview of our ToF-Splatting method. Our method combines sparse ToF depth, multi-view geometry from a buffer of keyframes, and monocular cues (left) into a single end-to-end dense SLAM framework enabled by Gaussian Splatting.
  • Figure 2: ToF-Splatting Pipeline. Our method involves three main modules: a Tracking frontend estimating camera poses, a Multi-Frame Integration module that predicts dense depth maps from sparse ToF measurements and multi-view geometry, and a Mapping backend modeling the 3D scene representation via 3D Gaussian Splatting.
  • Figure 3: Qualitative results on the ZJUL5 dataset [tof-slam]. We show meshes obtained by fusing rendered depth maps with TSDF and marching cubes (left), and 3D trajectories (right), on 3 scenes selected from the dataset.
  • Figure 4: Qualitative results on Replica [replica19arxiv], provided to demonstrate the generalization capabilities of our method. Left, top to bottom: meshes obtained on the Office2 and Room2 scenes. Right: details from the Room0 scene. ToF-Splatting recovers accurate details and produces high-quality photometric and depth renderings.
  • Figure 5: Impact of depth sparsity. We test on Replica [replica19arxiv] with different simulated depth sparsity levels to assess the capability to exploit higher input densities. MAE and ATE decrease smoothly as input density increases, whereas rendering metrics appear to be less affected.
  • ...and 5 more figures
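
The sparsity study described for Figure 5 can be illustrated with a minimal sketch. It assumes, for illustration only, that sparse ToF input is simulated by keeping one depth sample per cell of a coarse regular grid over a dense depth map, and that MAE is the mean absolute depth error over pixels with valid ground truth. The helper names `simulate_sparse_tof` and `depth_mae`, the 8x8 default grid, and the sampling pattern are hypothetical and are not taken from the paper.

```python
import numpy as np


def simulate_sparse_tof(dense_depth, grid=(8, 8)):
    """Keep one depth sample at the centre of each grid cell, zero elsewhere.

    Hypothetical helper: the grid layout is an assumption for illustration,
    not the sampling scheme used by the authors.
    """
    h, w = dense_depth.shape
    # Pixel coordinates of the centre of each grid cell.
    rows = ((np.arange(grid[0]) + 0.5) * h / grid[0]).astype(int)
    cols = ((np.arange(grid[1]) + 0.5) * w / grid[1]).astype(int)
    sparse = np.zeros_like(dense_depth)
    idx = np.ix_(rows, cols)
    sparse[idx] = dense_depth[idx]
    return sparse


def depth_mae(pred, gt):
    """Mean absolute error over pixels where ground-truth depth is valid (> 0)."""
    valid = gt > 0
    return float(np.abs(pred[valid] - gt[valid]).mean())


if __name__ == "__main__":
    # Synthetic dense depth map: a flat wall 2 m away.
    dense = np.full((480, 640), 2.0)
    for grid in [(8, 8), (16, 16), (32, 32)]:
        sparse = simulate_sparse_tof(dense, grid)
        density = np.count_nonzero(sparse) / sparse.size
        print(f"grid={grid}, valid fraction={density:.5f}")
```

Increasing the grid size raises the fraction of valid depth pixels, which is the kind of density sweep the caption refers to; a real evaluation would compare the depth rendered by the SLAM system against ground truth with `depth_mae`.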