URB -- Urban Routing Benchmark for RL-equipped Connected Autonomous Vehicles

Ahmet Onur Akman, Anastasia Psarou, Michał Hoffmann, Łukasz Gorczyca, Łukasz Kowalski, Paweł Gora, Grzegorz Jamróz, Rafał Kucharski

TL;DR

URB addresses the lack of standardized benchmarks for multi-agent reinforcement learning (MARL) in large-scale urban routing of mixed traffic, where connected autonomous vehicles (CAVs) share the network with human drivers. It assembles 29 real-world networks, realistic demand patterns, baselines, and a modular MARL toolkit into a single benchmarking framework, and launches the first URB leaderboard. The study finds that state-of-the-art MARL methods often underperform humans in city-scale routing tasks, with high training costs and scalability challenges, underscoring the need for methodological advances. By enabling reproducible, diverse, and realistic experiments, URB aims to catalyze progress toward safe, efficient, and socially aware CAV routing in urban environments.

Abstract

Connected Autonomous Vehicles (CAVs) promise to reduce congestion in future urban networks, potentially by optimizing their routing decisions. Unlike for human drivers, these decisions can be made with collective, data-driven policies, developed using machine learning algorithms. Reinforcement learning (RL) can facilitate the development of such collective routing strategies, yet standardized and realistic benchmarks are missing. To that end, we present URB: Urban Routing Benchmark for RL-equipped Connected Autonomous Vehicles. URB is a comprehensive benchmarking environment that unifies evaluation across 29 real-world traffic networks paired with realistic demand patterns. URB comes with a catalog of predefined tasks, multi-agent RL (MARL) algorithm implementations, three baseline methods, domain-specific performance metrics, and a modular configuration scheme. Our results show that, despite the lengthy and costly training, state-of-the-art MARL algorithms rarely outperformed humans. The experimental results reported in this paper initiate the first leaderboard for MARL in large-scale urban routing optimization. They reveal that current approaches struggle to scale, emphasizing the urgent need for advancements in this domain.

Paper Structure

This paper contains 45 sections, 6 equations, 9 figures, 8 tables.

Figures (9)

  • Figure 1: URB is a comprehensive benchmarking framework for MARL methods in solving CAV routing tasks in a mixed urban traffic environment. It enables end-to-end assessment through a collection of 29 real-world traffic networks, realistic demand patterns, baseline methods, domain-specific performance indicators, and a flexible parameterization scheme.
  • Figure 2: The traffic networks used in our benchmarking study (Section \ref{sec:sc1}), shown in order of increasing demand levels: (a) St. Arnoult (small), (b) Provins (medium) and (c) Ingolstadt (large; from the RESCO traffic light benchmark \cite{resco}).
  • Figure 3: Mean travel times normalized by the pre-CAV mean human travel times ($t/t^{pre}$) across episodes in 3 instances for Scenario 1. Each plot visualizes the averages of five seeded repetitions, along with 95% confidence intervals. Smoothing was applied using a moving average of 150 episodes. Human travel times (orange dashed) report the mean human travel times averaged over all experiments in that instance. Background patches indicate phases: 200, 6 000, and 100 episodes (days simulated) for the human stabilization, CAV training, and policy testing, respectively. We conduct an additional training with QMIX for 20 000 training episodes (blue diamond). Many algorithms hardly beat the random baseline. Only QMIX on the smallest instance (St. Arnoult) managed to outperform humans, though not consistently, as indicated by the large variability across trials.
  • Figure 4: Screenshots from SUMO GUI, from an experiment conducted using the Provins traffic network. Yellow vehicles represent CAVs, while red vehicles indicate human drivers. Junctions are shown with dark gray, and yellow rectangles represent traffic detectors.
  • Figure 5: Routes generated for 4 selected origin-destination pairs in 3 different traffic networks used in our experiments (Ingolstadt (left), St. Arnoult (top right), Provins (bottom right)). Each shading color represents a different route.
  • ...and 4 more figures
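
The processing described in the Figure 3 caption (normalizing by the pre-CAV human mean, averaging over seeded repetitions, attaching 95% confidence intervals, and smoothing with a 150-episode moving average) can be sketched as below. This is only an illustration, not the URB implementation: the function names, the array layout, the normal-approximation interval, and the choice to smooth after averaging are all assumptions, since the caption does not specify them.

```python
import numpy as np


def moving_average(values, window):
    """Plain moving average; returns len(values) - window + 1 points."""
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")


def normalized_curve(travel_times, t_pre, window=150):
    """Hypothetical helper, not part of URB.

    travel_times: array of shape (n_seeds, n_episodes) with the mean
                  travel time of each episode in each seeded repetition.
    t_pre:        mean human travel time before CAVs are introduced.
    Returns the smoothed mean of t / t_pre and the smoothed half-width
    of an approximate 95% confidence interval across seeds.
    """
    ratio = np.asarray(travel_times, dtype=float) / t_pre
    n_seeds = ratio.shape[0]
    mean = ratio.mean(axis=0)
    # Assumption: normal approximation (z = 1.96). With only five seeds,
    # a t-distribution quantile would give a wider interval.
    sem = ratio.std(axis=0, ddof=1) / np.sqrt(n_seeds)
    half_width = 1.96 * sem
    return moving_average(mean, window), moving_average(half_width, window)
```

A value below 1 on the resulting curve means the evaluated travel times are shorter than the pre-CAV human mean; a value above 1 means they are longer.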