Hierarchical Multi-Agent MCTS for Safety-Critical Coordination in Mixed-Autonomy Roundabouts

Zhihao Lin, Shuo Liu, Zhen Tian, Dezong Zhao, Jianglin Lan, Chongfeng Wei

TL;DR

This work tackles safety-critical coordination for mixed-autonomy traffic at unsignalized, dual-lane roundabouts by integrating a multi-agent Monte Carlo Tree Search (MCTS) with a hierarchical risk assessment. It models CAV and HDV interactions jointly as a multi-agent MDP, introduces lane-specific HDV uncertainty, and uses safety-aware pruning and a Shapley-value-based reward to balance individual and collective performance. Compared to benchmark methods, the approach reduces PET violations and trajectory deviations, especially as AV penetration increases: the PET violation rate is 1.0% in the fully autonomous scenario and 3.2% in the mixed-traffic setting, while arrival rates remain high. The framework offers a practical, interpretable planning mechanism for real-world deployment, with potential extensions in scalability and geometric generalization.

Abstract

Navigating unsignalized roundabouts in mixed-autonomy traffic presents significant challenges due to dense vehicle interactions, lane-changing complexities, and behavioral uncertainties of human-driven vehicles (HDVs). This paper proposes a safety-critical decision-making framework for connected and automated vehicles (CAVs) navigating dual-lane roundabouts alongside HDVs. We formulate the problem as a multi-agent Markov Decision Process and develop a hierarchical safety assessment mechanism that evaluates three critical interaction types: CAV-to-CAV (C2C), CAV-to-HDV (C2H), and CAV-to-Boundary (C2B). A key contribution is our lane-specific uncertainty model for HDVs, which captures distinct behavioral patterns between inner and outer lanes, with outer-lane vehicles exhibiting $2.3\times$ higher uncertainty due to less constrained movements. We integrate this safety framework with a multi-agent Monte Carlo Tree Search (MCTS) algorithm that employs safety-aware pruning to eliminate high-risk trajectories while maintaining computational efficiency. The reward function incorporates Shapley value-based credit assignment to balance individual performance with group coordination. Extensive simulation results validate the effectiveness of the proposed approach under both fully autonomous (100% AVs) and mixed traffic (50% AVs + 50% HDVs) conditions. Compared to benchmark methods, our framework consistently reduces trajectory deviations across all AVs and significantly lowers the rate of Post-Encroachment Time (PET) violations, achieving only 1.0% in the fully autonomous scenario and 3.2% in the mixed traffic setting.
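The Shapley value-based credit assignment mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's coalition value function, so `toy_value`, the agent names, and the cooperation bonus below are hypothetical; the code only shows the standard exact Shapley computation, which splits a group reward according to each agent's average marginal contribution.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a small coalition game.

    `value` maps a frozenset of players to that coalition's reward.
    Cost grows exponentially with the number of players, so this is
    only practical for a handful of agents.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Standard Shapley weight for coalitions of size k that exclude p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when joining coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def toy_value(coalition):
    """Hypothetical group reward (an assumption, not the paper's reward)."""
    base = {"cav1": 1.0, "cav2": 2.0, "cav3": 3.0}
    v = sum(base[p] for p in coalition)
    if {"cav1", "cav2"} <= coalition:
        v += 1.0  # illustrative bonus when cav1 and cav2 coordinate
    return v


agents = ["cav1", "cav2", "cav3"]
phi = shapley_values(agents, toy_value)
for name in agents:
    print(name, round(phi[name], 3))
```

In this toy game the cooperation bonus is split evenly between `cav1` and `cav2`, and the credits sum to the full-coalition reward, which is the property that lets a Shapley-based term trade off individual and group performance.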

Paper Structure

This paper contains 13 sections, 31 equations, 10 figures, 3 tables.

Figures (10)

  • Figure 1: Overview of the proposed safety-critical decision-making framework based on MCTS for roundabout navigation.
  • Figure 2: Illustration of the interaction scenario.
  • Figure 3: Safety-critical risk assessment. (a) Distance-based risk. (b) Safety risk visualization.
  • Figure 4: Oscillating temporal uncertainty evolution showing the growth of prediction uncertainty over time for inner and outer lanes in both radial and angular dimensions.
  • Figure 5: Illustration of the safety-critical MCTS framework. Each node stores the visit count $N_n$, value estimate $Q_n$, and UCB score. Nodes identified as unsafe (shown in purple) are pruned during the safety validation stage. The green path indicates the backpropagation process.
  • ...and 5 more figures
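
The node statistics named in the Figure 5 caption (visit count $N_n$, value estimate $Q_n$, UCB score) and the pruning of unsafe nodes can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it uses the generic UCB1 rule, and the `risk` field, `risk_threshold`, exploration constant `c`, and action names are all assumptions introduced here.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    action: str
    visits: int = 0      # N_n: number of times the node was visited
    value: float = 0.0   # Q_n: running mean of backed-up rewards
    risk: float = 0.0    # hypothetical risk score in [0, 1] (assumption)
    children: list = field(default_factory=list)


def ucb_score(child, parent_visits, c=1.41):
    """Generic UCB1 score; unvisited children are explored first."""
    if child.visits == 0:
        return math.inf
    return child.value + c * math.sqrt(math.log(parent_visits) / child.visits)


def select_safe_child(parent, risk_threshold=0.5):
    """Prune children above the risk threshold, then pick the best UCB score."""
    safe = [ch for ch in parent.children if ch.risk <= risk_threshold]
    if not safe:
        return None  # every candidate was pruned as unsafe
    return max(safe, key=lambda ch: ucb_score(ch, parent.visits))


def backpropagate(path, reward):
    """Update N_n and Q_n along the selected path (incremental mean)."""
    for node in path:
        node.visits += 1
        node.value += (reward - node.value) / node.visits


root = Node("root", visits=10)
root.children = [
    Node("keep_lane", visits=5, value=0.6, risk=0.1),
    Node("change_lane", visits=3, value=0.9, risk=0.8),  # pruned: risk too high
    Node("yield", visits=2, value=0.4, risk=0.2),
]
chosen = select_safe_child(root)
print(chosen.action)
backpropagate([root, chosen], reward=1.0)
```

Here `change_lane` would have the highest UCB score but is excluded by the safety check, so the search continues from the best remaining safe action.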