SLAM&Render: A Benchmark for the Intersection Between Neural Rendering, Gaussian Splatting and SLAM
Samuel Cerezo, Gaetano Meli, Tomás Berriel Martins, Kirill Safronov, Javier Civera
TL;DR
SLAM&Render is a benchmark dataset for methods that combine SLAM with neural rendering and Gaussian Splatting. Its 40 sequences, recorded with a robot manipulator, provide time-synchronized RGB-D images, IMU readings, robot kinematic data, and ground-truth poses under varied lighting and object rearrangements, which supports the evaluation of multimodal fusion and of generalization across viewpoints and illumination. Baseline experiments with Gaussian Splatting and FeatSplat show overfitting to the training trajectory when models are evaluated on independent test views. They also indicate potential gains from incorporating kinematic data into camera tracking, although naive seeds may reduce accuracy. The dataset therefore offers a practical platform for developing and assessing learning-based and hybrid SLAM/rendering methods on real robotic motions.
Abstract
Models and methods originally developed for Novel View Synthesis and Scene Rendering, such as Neural Radiance Fields (NeRF) and Gaussian Splatting, are increasingly being adopted as representations in Simultaneous Localization and Mapping (SLAM). However, existing datasets do not cover the specific challenges of both fields, such as sequential operation and, in many settings, multi-modality in SLAM, or generalization across viewpoints and illumination conditions in neural rendering. Additionally, the data are often collected with sensors that are handheld or mounted on drones or mobile robots, which complicates the accurate reproduction of sensor motions. To bridge these gaps, we introduce SLAM&Render, a novel dataset designed to benchmark methods at the intersection of SLAM, Novel View Rendering and Gaussian Splatting. Recorded with a robot manipulator, it includes 40 sequences with time-synchronized RGB-D images, IMU readings, robot kinematic data, and ground-truth pose streams. By releasing robot kinematic data, the dataset also enables the assessment of recent integrations of SLAM paradigms within robotic applications. The dataset features five setups with consumer and industrial objects under four controlled lighting conditions, each with separate training and test trajectories. All sequences are static, with different levels of object rearrangement and occlusion. Our experimental results, obtained with several baselines from the literature, validate SLAM&Render as a relevant benchmark for this emerging research area.
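
The sequence count stated above is consistent with a full grid of five setups, four lighting conditions, and two trajectories (training and test): 5 x 4 x 2 = 40. The minimal sketch below enumerates such a grid. It assumes the dataset is a complete cross product, which the abstract implies but does not state outright, and the setup, lighting, and path names are hypothetical placeholders rather than the dataset's actual naming scheme.

```python
from itertools import product

# Hypothetical labels: the abstract gives only the counts (5 setups,
# 4 lighting conditions, train/test trajectories), not the real names.
SETUPS = [f"setup_{i}" for i in range(1, 6)]
LIGHTING = [f"light_{i}" for i in range(1, 5)]
SPLITS = ["train", "test"]


def enumerate_sequences():
    """Return one identifier per (setup, lighting, split) combination.

    Assumes a complete grid; the identifier layout is illustrative only.
    """
    return [
        f"{setup}/{light}/{split}"
        for setup, light, split in product(SETUPS, LIGHTING, SPLITS)
    ]


if __name__ == "__main__":
    sequences = enumerate_sequences()
    print(len(sequences))  # 40
```

Keeping the training and test trajectories as separate entries mirrors the benchmark's purpose: a rendering model fitted on one trajectory of a given setup and lighting condition can be scored on the other, independent one.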
