Optimizing Efficiency of Mixed Traffic through Reinforcement Learning: A Topology-Independent Approach and Benchmark

Chuyang Xiao, Dawei Wang, Xinzheng Tang, Jia Pan, Yuexin Ma

TL;DR

This work tackles the problem of coordinating mixed traffic across diverse, unsignalized topologies by introducing a topology-agnostic, model-free RL framework that is trained centrally and executed in a decentralized manner. It develops a comprehensive real-world benchmark with 111 topologies and 444 dynamic scenarios across 20 countries, built in SUMO from OpenStreetMap data. The method relies on a SAC-based policy that maps local observations to continuous acceleration commands within $[-10,10]$ m/s$^2$, optimizing a composite reward that balances throughput, safety, and waiting time. Results show substantial improvements over traditional traffic signal baselines and state-of-the-art methods, especially at high robot vehicle (RV) penetration, and demonstrate the benchmark's potential to drive future research in real-world mixed-traffic control.
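The bounded action space described above can be illustrated with a minimal sketch. It assumes, as is common in SAC implementations, that an unbounded policy output is squashed with tanh and then rescaled to the stated range; this summary does not confirm that the authors do it this way, and `scale_action` is a hypothetical name used only for illustration.

```python
import math

# Acceleration bounds in m/s^2, taken from the range stated in the TL;DR.
ACCEL_MIN, ACCEL_MAX = -10.0, 10.0


def scale_action(raw: float) -> float:
    """Map an unbounded policy output to an acceleration in [ACCEL_MIN, ACCEL_MAX].

    Assumption: tanh squashing followed by an affine rescale, a common
    convention for SAC policies, not a detail confirmed by the paper summary.
    """
    squashed = math.tanh(raw)  # lies in [-1, 1]
    half_range = (ACCEL_MAX - ACCEL_MIN) / 2.0
    midpoint = (ACCEL_MAX + ACCEL_MIN) / 2.0
    return midpoint + half_range * squashed
```

Because the squashing is smooth and monotonic, larger raw outputs always map to larger accelerations while never leaving the allowed interval.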

Abstract

This paper presents a mixed traffic control policy designed to optimize traffic efficiency across diverse road topologies, addressing the congestion prevalent in urban environments. A model-free reinforcement learning (RL) approach is developed to manage large-scale traffic flow, using data collected by autonomous vehicles to influence human-driven vehicles. A real-world mixed traffic control benchmark is also released, which includes 444 scenarios from 20 countries, representing a wide geographic distribution and covering a variety of scenarios and road topologies. This benchmark serves as a foundation for future research, providing a realistic simulation environment for the development of effective policies. Comprehensive experiments demonstrate the effectiveness and adaptability of the proposed method, which achieves better performance than existing traffic control methods in both intersection and roundabout scenarios. To the best of our knowledge, this is the first project to introduce a mixed traffic control benchmark built from complex real-world scenarios. Videos and code of our work are available at https://sites.google.com/berkeley.edu/mixedtrafficplus/home

Paper Structure

This paper contains 18 sections, 4 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Benchmark Scenarios Visualization
  • Figure 2: The pipeline of our method. LEFT: The green car within the yellow square represents the ego robot vehicle (RV), which independently collects traffic information via its local perception system and autonomously decides its acceleration. RIGHT: The policy observation. The vehicles surrounding the ego RV may be either robot vehicles (RVs: Green) or human-driven vehicles (HVs: White), and all of them are considered in the ego vehicle's observation. Our method accounts for vehicles positioned both in front of the ego RV, represented by the light green area, and behind it, represented by the light red area. For each observed vehicle, its velocity and position relative to the ego vehicle are encoded into the observation. Here, for example, the central RV includes the HV behind it, located within the red circle, and the RV in front of it, located within the green circle, as part of its observation.
  • Figure 3: Our scenario dataset is categorized by major topology type: intersections and roundabouts. Within each type, the scenarios are classified by the number of road legs, the number of incoming lanes, and the number of outgoing lanes, to illustrate the distribution of the dataset.
  • Figure 4: An evaluation of our policy's performance, with training set sizes varying from 50 to 300 scenarios, was conducted based on two key metrics: average waiting time (s) and throughput rate ($10^{-3}$). The results clearly demonstrate improvements in both metrics with an increase in the size of the training set. The decreasing average waiting time and increasing throughput rate indicate that our policy becomes more effective with the enlargement of the training set.
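
The observation described in the Figure 2 caption (vehicles in front of and behind the ego RV, each encoded by position and velocity relative to the ego) can be sketched roughly as follows. This is an illustrative sketch only: the 1-D longitudinal coordinate, the `Vehicle` type, the `build_observation` function, the per-side vehicle limit, the perception range, and the zero padding are all assumptions and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Vehicle:
    position: float  # longitudinal position in metres (assumed 1-D simplification)
    velocity: float  # speed in m/s


def build_observation(
    ego: Vehicle,
    others: List[Vehicle],
    max_per_side: int = 2,            # hypothetical cap on vehicles per side
    perception_range: float = 50.0,   # hypothetical local perception radius, metres
) -> List[float]:
    """Encode nearby vehicles as (relative position, relative velocity) pairs.

    RVs and HVs are treated identically, as in the Figure 2 caption.
    Front vehicles come first, then vehicles behind; each side is sorted by
    distance to the ego and zero-padded to a fixed length.
    """
    front = sorted(
        (v for v in others if 0.0 < v.position - ego.position <= perception_range),
        key=lambda v: v.position - ego.position,
    )
    behind = sorted(
        (v for v in others if 0.0 < ego.position - v.position <= perception_range),
        key=lambda v: ego.position - v.position,
    )
    obs: List[float] = []
    for group in (front, behind):
        kept = group[:max_per_side]
        for v in kept:
            obs += [v.position - ego.position, v.velocity - ego.velocity]
        # Pad missing slots so the observation length stays constant.
        obs += [0.0, 0.0] * (max_per_side - len(kept))
    return obs


ego = Vehicle(position=0.0, velocity=10.0)
others = [Vehicle(20.0, 8.0), Vehicle(-15.0, 12.0), Vehicle(200.0, 5.0)]
observation = build_observation(ego, others)
```

A fixed-length vector like this is one way to keep the policy input independent of the road layout, since only quantities relative to the ego vehicle are included.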