VBR: A Vision Benchmark in Rome
Leonardo Brizi, Emanuele Giacomini, Luca Di Giammarino, Simone Ferrari, Omar Salem, Lorenzo De Rebotti, Giorgio Grisetti
TL;DR
VBR is a multi-sensor vision benchmark collected in Rome and tailored for SLAM and odometry, providing six synchronized sequences acquired with handheld and car-mounted platforms. Ground truth is generated through a LiDAR Bundle Adjustment approach that fuses RTK-GPS priors with LiDAR odometry, achieving about $\pm 3\ \mathrm{cm}$ accuracy over long trajectories, and is validated with a Total Station. The dataset spans urban, garden, indoor, and highway-like scenes, totaling roughly $40\ \mathrm{km}$ of trajectories and $2\ \mathrm{TB}$ of raw data, with training/testing splits and a public evaluation server. Baseline experiments with KISS-ICP, F-LOAM, and ORB-SLAM3 illustrate the strengths of LiDAR-based methods and highlight how difficult precise global localization remains in diverse environments. The resource is intended to enable robust, fair benchmarking across robotic platforms (quadrupeds, quadrotors, autonomous vehicles) and to support future work in semantics and dense perception alongside odometry and SLAM evaluation.
Abstract
This paper presents a vision and perception research dataset collected in Rome, featuring RGB data, 3D point clouds, IMU, and GPS data. We introduce a new benchmark targeting visual odometry and SLAM, to advance research in autonomous robotics and computer vision. This work complements existing datasets by simultaneously addressing several issues, such as environment diversity, motion patterns, and sensor frequency. It uses up-to-date devices and presents effective procedures to accurately calibrate the intrinsic and extrinsic parameters of the sensors while addressing temporal synchronization. The recordings cover multi-floor buildings, gardens, and urban and highway scenarios. By combining handheld and car-based data collection, our setup can simulate any robot (quadrupeds, quadrotors, autonomous vehicles). The dataset includes an accurate 6-DoF ground truth based on a novel methodology that refines the RTK-GPS estimate with LiDAR point clouds through Bundle Adjustment. All sequences, divided into training and testing sets, are accessible through our website.
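To make the benchmarking idea concrete, the following is a minimal sketch of how an estimated trajectory might be scored against a 6-DoF ground truth. It is not the evaluation protocol of the VBR server, which this section does not specify. The function names (`associate`, `ate_rmse`), the `max_dt` tolerance, and the toy data are all hypothetical. The sketch assumes both trajectories are already expressed in the same reference frame, so no rigid alignment is performed, and it compares positions only.

```python
import bisect
import math


def associate(est_stamps, gt_stamps, max_dt=0.01):
    """Pair each estimate with the nearest ground-truth sample in time.

    gt_stamps must be sorted in ascending order. Estimates with no
    ground-truth sample within max_dt seconds are dropped.
    Returns a list of (estimate_index, ground_truth_index) pairs.
    """
    pairs = []
    for i, t in enumerate(est_stamps):
        j = bisect.bisect_left(gt_stamps, t)
        # The nearest sample is either just before or just after t.
        candidates = [k for k in (j - 1, j) if 0 <= k < len(gt_stamps)]
        if not candidates:
            continue
        best = min(candidates, key=lambda k: abs(gt_stamps[k] - t))
        if abs(gt_stamps[best] - t) <= max_dt:
            pairs.append((i, best))
    return pairs


def ate_rmse(est_xyz, gt_xyz, pairs):
    """Root-mean-square position error over the associated pairs.

    Assumption: est_xyz and gt_xyz are in the same frame (no alignment).
    """
    if not pairs:
        raise ValueError("no associated poses to evaluate")
    squared = [
        sum((e - g) ** 2 for e, g in zip(est_xyz[i], gt_xyz[j]))
        for i, j in pairs
    ]
    return math.sqrt(sum(squared) / len(squared))


# Toy data (hypothetical): a constant 3 cm lateral offset, plus one
# estimate that has no ground-truth sample close enough in time.
gt_stamps = [0.0, 0.1, 0.2]
gt_xyz = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
est_stamps = [0.001, 0.102, 0.35]
est_xyz = [(0.0, 0.03, 0.0), (1.0, 0.03, 0.0), (9.0, 9.0, 9.0)]

pairs = associate(est_stamps, gt_stamps)
rmse = ate_rmse(est_xyz, gt_xyz, pairs)
print(pairs, round(rmse, 4))
```

In practice an evaluation would typically also align the two trajectories with a rigid transform before computing the error and would report rotational and relative errors as well. Those steps are omitted here to keep the sketch short.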
