Learning to Change: Choreographing Mixed Traffic Through Lateral Control and Hierarchical Reinforcement Learning

Dawei Wang; Weizi Li; Lei Zhu; Jia Pan

TL;DR

Coordinating mixed traffic at complex intersections with both robot and human-driven vehicles is challenging due to dynamic interactions and heterogeneous behaviors. The authors propose a hierarchical reinforcement learning framework that pairs a high-level Go/Stop decision module with a low-level PPO-based longitudinal and lateral RV controller, augmented by a safety post-processor. A combined optimization objective $L(\theta) = L^{PPO}(\theta) + L^{VF}(\theta)$ with clipping $\epsilon = 0.3$ guides policy learning, while real-world traffic data and a safety mechanism ensure robust, scalable performance. Experiments show up to a 54% reduction in average waiting time compared with a state-of-the-art baseline and superiority over traditional signal control when RV penetration exceeds 60%, indicating practical potential for large-scale mixed-traffic management at intersections.
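The combined objective mentioned above can be illustrated with a minimal sketch. It assumes the standard PPO clipped surrogate for $L^{PPO}$, written as a loss to be minimized, and a mean squared error for $L^{VF}$, summed without weighting as in the stated formula. The function name `ppo_combined_loss` and its arguments are hypothetical and do not come from the paper; only the clipping value $\epsilon = 0.3$ does.

```python
import math

def ppo_combined_loss(new_logp, old_logp, advantages, values, returns, eps=0.3):
    """Return (total, l_ppo, l_vf) for one batch of samples.

    Assumptions (not taken from the paper): the policy term is the
    negated clipped surrogate, the value term is a mean squared error,
    and the two are summed with unit weights.
    """
    n = len(advantages)
    surrogate = []
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the new and old policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - eps, min(1.0 + eps, ratio))
        # Pessimistic bound: take the smaller of the two surrogates.
        surrogate.append(min(ratio * adv, clipped * adv))
    l_ppo = -sum(surrogate) / n
    l_vf = sum((v - r) ** 2 for v, r in zip(values, returns)) / n
    return l_ppo + l_vf, l_ppo, l_vf
```

With a positive advantage, a probability ratio above $1 + \epsilon$ contributes no more than $(1 + \epsilon)$ times the advantage, which is what keeps a single update from moving the policy too far.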

Abstract

The management of mixed traffic that consists of robot vehicles (RVs) and human-driven vehicles (HVs) at complex intersections presents a multifaceted challenge. Traditional signal controls often struggle to adapt to dynamic traffic conditions and heterogeneous vehicle types. Recent work has turned to strategies based on reinforcement learning (RL), leveraging its model-free nature, real-time operation, and generalizability across different scenarios. We introduce a hierarchical RL framework to manage mixed traffic through precise longitudinal and lateral control of RVs. The framework uses a state-of-the-art mixed traffic control algorithm as its high-level decision maker to improve the performance and robustness of the whole system. Our experiments demonstrate that the framework can reduce the average waiting time by up to 54% compared to the state-of-the-art mixed traffic control method. When the RV penetration rate exceeds 60%, our technique consistently outperforms conventional traffic signal control programs in terms of the average waiting time for all vehicles at the intersection.

Paper Structure (16 sections, 8 equations, 5 figures, 1 table)

Introduction
Related Work
Mixed Traffic Control
Longitudinal and Lateral Planning of RVs
Methodology
Overview
Intersectional Traffic Flow
High-level Control Decisions
Low-level Longitudinal and Lateral Control
Safety Mechanism
Experiments and Results
Mixed Traffic Simulation
Baselines and Evaluation Metric
Intersection Performance
Analysis of Lane-changing Behaviors
...and 1 more sections

Figures (5)

Figure 1: Our framework starts with a perception system gathering both macroscopic and microscopic traffic conditions. Next, high-level decisions, Go/Stop, are made for the RVs. Subsequently, the framework generates low-level longitudinal and lateral control commands for the RVs. Lastly, a safety mechanism is deployed to resolve conflicting traffic streams and prevent vehicle collisions, ensuring safety in mixed traffic control at complex intersections.
Figure 2: Mixed traffic control at four real-world intersections situated in Colorado Springs, CO, USA, using actual traffic data sourced directly from these intersections. RVs are in red and HVs are in white. The RV penetration rate is 50%. Our framework enables efficient traffic flows at these intersections without the presence of traffic lights.
Figure 3: Overall results, measured in average waiting time, at four intersections for our technique and Wang et al. [wang2023learning]. The red line represents the average waiting time of the traffic light control baseline (TL). Our method consistently outperforms the TL baseline when the RV penetration rate reaches 60% or higher. Furthermore, in the majority of scenarios, our approach yields lower average waiting times than Wang under the same RV penetration rates. For intersection 449, the TL baseline has a high value of 45 seconds and is therefore excluded from the plot. This indicates that, at intersection 449, both our technique and Wang outperform TL under all tested RV penetration rates starting at 40%. For both intersections 332 and 334, our method consistently outperforms TL and Wang when the RV penetration rate is $\ge 40\%$. In general, our technique also demonstrates much lower variance than Wang's, showing improved robustness and performance in mixed traffic control as a result of incorporating lateral and longitudinal control for the RVs.
Figure 4: Lane-changing behavior for mixed traffic control. An example shows an RV detecting an unregulated lane and subsequently switching to it in order to regulate it. Initially, with limited RV presence, HVs can exploit unregulated lanes, causing potential risk to the traffic within the intersection. This risk is resolved when an incoming RV detects it and decides to switch to the unregulated lane. As a result, a coordinated RV fleet forms to regulate southbound lanes, effectively mitigating conflicts and enhancing traffic efficiency within the intersection.
Figure 5: Comparison of the ratio of unregulated lanes between our method and Wang et al. [wang2023learning]. The RV penetration rate is 40%. The blue line denotes the percentage of unregulated lanes when our method is employed, while the orange line represents Wang's performance. Our method results in a faster reduction in unregulated lanes due to proactive lane-changing behaviors.
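
The order of stages described in the Figure 1 caption (high-level Go/Stop decision, low-level longitudinal and lateral commands, safety mechanism) can be sketched as a single control step. This is only an illustration of the ordering: the `Command` type, the `control_step` function, and the three callables are hypothetical placeholders, and the rules inside each stage are deliberately left to the caller rather than taken from the paper.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple

@dataclass
class Command:
    go: bool             # high-level Go/Stop decision
    acceleration: float  # longitudinal command (hypothetical units)
    lane_change: int     # lateral command: -1 left, 0 keep, +1 right

def control_step(
    observation: Any,
    high_level: Callable[[Any], bool],
    low_level: Callable[[Any, bool], Tuple[float, int]],
    safety: Callable[[Any, Command], Command],
) -> Command:
    """Run one RV control step in the order given by the Figure 1 caption."""
    # 1. High-level decision from the perceived traffic conditions.
    go = high_level(observation)
    # 2. Low-level longitudinal and lateral commands, conditioned on Go/Stop.
    acceleration, lane_change = low_level(observation, go)
    proposed = Command(go, acceleration, lane_change)
    # 3. The safety stage may override the proposed command.
    return safety(observation, proposed)

# Placeholder stages, for illustration only.
def always_go(obs):
    return True

def cruise(obs, go):
    return (1.0 if go else -1.0, 0)

def stop_if_conflict(obs, cmd):
    if obs.get("conflict"):
        return Command(False, -1.0, 0)
    return cmd
```

Passing the stages in as callables keeps the sketch neutral about how each one is implemented; a learned policy or a rule-based module could fill any of the three slots.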
