SLAM&Render: A Benchmark for the Intersection Between Neural Rendering, Gaussian Splatting and SLAM

Samuel Cerezo, Gaetano Meli, Tomás Berriel Martins, Kirill Safronov, Javier Civera

TL;DR

SLAM&Render introduces a dedicated benchmark dataset that merges SLAM with neural rendering and Gaussian Splatting approaches. By recording 40 time-synchronized sequences on a robot manipulator with RGB-D, IMU, robot kinematics, and precise ground-truth poses under varied lighting and object rearrangements, it enables robust evaluation of multimodal fusion and generalization. Baseline experiments with Gaussian Splatting and FeatSplat reveal trajectory overfitting on independent test views and demonstrate potential gains from incorporating kinematic data for camera tracking, while naive seeds may reduce accuracy. The dataset therefore provides a practical platform for developing and assessing learning-based and hybrid SLAM/rendering methods with real robotic motions, fostering progress toward robust, multimodal perception, mapping, and rendering in realistic settings.

Abstract

Models and methods originally developed for Novel View Synthesis and Scene Rendering, such as Neural Radiance Fields (NeRF) and Gaussian Splatting, are increasingly being adopted as representations in Simultaneous Localization and Mapping (SLAM). However, existing datasets fail to include the specific challenges of both fields, such as sequential operations and, in many settings, multi-modality in SLAM or generalization across viewpoints and illumination conditions in neural rendering. Additionally, the data are often collected using sensors which are handheld or mounted on drones or mobile robots, which complicates the accurate reproduction of sensor motions. To bridge these gaps, we introduce SLAM&Render, a novel dataset designed to benchmark methods in the intersection between SLAM, Novel View Rendering and Gaussian Splatting. Recorded with a robot manipulator, it uniquely includes 40 sequences with time-synchronized RGB-D images, IMU readings, robot kinematic data, and ground-truth pose streams. By releasing robot kinematic data, the dataset also enables the assessment of recent integrations of SLAM paradigms within robotic applications. The dataset features five setups with consumer and industrial objects under four controlled lighting conditions, each with separate training and test trajectories. All sequences are static with different levels of object rearrangements and occlusions. Our experimental results, obtained with several baselines from the literature, validate SLAM&Render as a relevant benchmark for this emerging research area.
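The sequence count in the abstract can be sanity-checked with a short enumeration. This is a minimal sketch that assumes each of the five setups is recorded under each of the four lighting conditions with one training and one test trajectory, a breakdown that is consistent with the stated total of 40 but is not spelled out in the abstract. The setup labels and the naming scheme are hypothetical and do not reflect the dataset's actual file layout; the lighting names are taken from the Figure 1 caption.

```python
from itertools import product

# Hypothetical labels: setup indices and the "setup-light-split" naming are
# illustrative only, not the dataset's real directory or file names.
SETUPS = [f"setup{i}" for i in range(1, 6)]      # five object setups
LIGHTING = ["natural", "cold", "warm", "dark"]   # four lighting conditions (Figure 1)
SPLITS = ["train", "test"]                       # separate train and test trajectories

# Assumption: every setup is recorded under every lighting condition, once per split.
sequences = [f"{s}-{l}-{t}" for s, l, t in product(SETUPS, LIGHTING, SPLITS)]

print(len(sequences))  # 5 * 4 * 2 = 40
```

Under that assumption, the product 5 × 4 × 2 matches the 40 sequences reported.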

Paper Structure

This paper contains 19 sections, 1 equation, 8 figures, 7 tables.

Figures (8)

  • Figure 1: Illustration of the capture setup for our SLAM&Render dataset, recorded from an Intel RealSense at the end effector of a robotic arm that moves around a set of objects on a table. See the four different light conditions present in our dataset: a) natural, b) cold, c) warm, and d) dark.
  • Figure 2: Illustration of the objects included in SLAM&Render: a-b) supermarket goods, c) industrial goods, d-h) object arrangements of the five setups.
  • Figure 3: Our SLAM&Render dataset contains train and test camera trajectories. From top left to bottom right: a) 2D representation of train trajectory, b) 3D representation of train trajectory, c) 2D representation of test trajectory, and d) 3D representation of test trajectory. Blue and yellow points represent start and end points of the trajectories, respectively.
  • Figure 4: Illustration of the reference frames involved in the data collection.
  • Figure 5: Validation of time-synchronized data by comparing angular velocity estimates from the MCS and the IMU gyroscope measurements.
  • ...and 3 more figures