Systematic Benchmarking of SUMO Against Data-Driven Traffic Simulators
Erdao Liang
TL;DR
This work benchmarks the widely used model-based traffic simulator SUMO against state-of-the-art data-driven simulators on large real-world datasets (WOMD/WOSAC). It introduces Waymo2SUMO, an automated pipeline that converts WOMD scenarios into SUMO road networks, traffic signals, and multi-agent dynamics, enabling scalable closed-loop evaluation. Evaluated at short (8s) and long (60s) horizons, SUMO achieves a realism meta metric of 0.653 on WOSAC with fewer than 100 tunable parameters, far fewer than learning-based models. In long-horizon rollouts it keeps collision and off-road rates low and is more stable than representative data-driven simulators. The study highlights the complementary strengths of model-based and data-driven approaches and advocates hybrid simulation that is both high-fidelity and efficient for autonomous driving validation and development.
Abstract
This paper presents a systematic benchmark of the model-based microscopic traffic simulator SUMO against state-of-the-art data-driven traffic simulators on large-scale real-world datasets. Using the Waymo Open Motion Dataset (WOMD) and the Waymo Open Sim Agents Challenge (WOSAC), we evaluate SUMO in both short-horizon (8s) and long-horizon (60s) closed-loop simulation settings. To enable scalable evaluation, we develop Waymo2SUMO, an automated pipeline that converts WOMD scenarios into SUMO simulations. On the WOSAC benchmark, SUMO achieves a realism meta metric of 0.653 while requiring fewer than 100 tunable parameters. Extended rollouts show that SUMO maintains low collision and off-road rates and exhibits stronger long-horizon stability than representative data-driven simulators. These results highlight the complementary strengths of model-based and data-driven approaches for autonomous driving simulation and benchmarking.
