Multilevel Graph Reinforcement Learning for Consistent Cognitive Decision-making in Heterogeneous Mixed Autonomy
Xin Gao, Zhaoyang Ma, Xueyuan Li, Xiaoqiang Meng, Zirui Li
TL;DR
The paper tackles decision-making for heterogeneous mixed autonomy by modeling spatiotemporal vehicle interactions in non-Euclidean space with a multilevel graph representation. It combines an asynchronous parallel hierarchical graph reinforcement learning framework (PAH-GRL) with a Multilevel Multi-head Graph Attention Network (ML-MGAT) to learn coordinated policies across two decision dimensions, lane-changing (L) and following (F). Key contributions include (i) a formal AMG-MDP formulation with dimension-specific state, action, and reward definitions, (ii) dynamic, multidimensional, weighted traffic graphs that capture complex interactions, and (iii) experiments on a high-density highway scenario that show substantial improvements in task success, braking events, travel time, energy, and trajectory robustness. The approach advances scalable, human-like cognitive decision-making in connected autonomous vehicles (CAVs), with potential practical impact for safer, more efficient mixed-traffic systems.
Abstract
In heterogeneous mixed autonomy, vehicles experience dynamic spatial correlations and nonlinear temporal interactions in a complex, non-Euclidean space. These complexities pose significant challenges to traditional decision-making frameworks. To address them, we propose a hierarchical reinforcement learning framework integrated with multilevel graph representations, which models the spatiotemporal interactions among vehicles that navigate uncertain traffic conditions under differing decision-making systems. Rooted in multilevel graph representation theory, our approach captures the spatiotemporal relationships inherent in non-Euclidean spaces. A weighted graph represents spatiotemporal features between nodes and addresses the degree imbalance inherent in dynamic graphs. We integrate asynchronous parallel hierarchical reinforcement learning with a multilevel graph representation and a multi-head attention mechanism, enabling connected autonomous vehicles (CAVs) to exhibit capabilities akin to human cognition and to make consistent decisions across several critical dimensions. The proposed decision-making strategy is validated in challenging highway environments characterized by high density, randomness, and dynamism. We assess the framework through ablation studies, comparative analyses, and spatiotemporal trajectory evaluations. This study presents a quantitative analysis of decision-making mechanisms that mirror human cognitive functions in heterogeneous mixed autonomy, promoting the development of multi-dimensional decision-making strategies and a sophisticated distribution of attentional resources.
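The weighted-graph idea in the abstract can be illustrated with a minimal sketch. This is not the paper's construction: the distance-based edge weight, the interaction `radius`, the `(x, y)` state layout, and the function name `build_weighted_graph` are all assumptions made for illustration. The sketch shows one common way to keep nodes with many neighbors from dominating, namely row-normalizing the weighted adjacency matrix.

```python
import math


def build_weighted_graph(vehicles, radius=50.0):
    """Return a row-normalized weighted adjacency matrix.

    vehicles: list of (x, y) positions in metres (hypothetical state layout).
    radius: interaction range; pairs farther apart get no edge (assumption).
    """
    n = len(vehicles)
    adj = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(vehicles):
        for j, (xj, yj) in enumerate(vehicles):
            dist = math.hypot(xi - xj, yi - yj)
            if dist <= radius:
                # Closer vehicles get larger weights (assumed weighting);
                # i == j yields a self-loop with weight exp(0) = 1.0.
                adj[i][j] = math.exp(-dist / radius)
    # Row-normalize so each node's neighbor weights sum to 1,
    # regardless of how many neighbors (degree) the node has.
    for row in adj:
        total = sum(row)  # at least 1.0 because of the self-loop
        for j in range(n):
            row[j] /= total
    return adj


# Three nearby vehicles and one far-away vehicle that has no neighbors.
vehicles = [(0.0, 0.0), (10.0, 3.5), (20.0, 0.0), (500.0, 0.0)]
graph = build_weighted_graph(vehicles)
```

Because the graph is rebuilt from current positions, calling the function at every time step yields a dynamic graph whose edges appear and disappear as vehicles move in and out of range. A learned attention mechanism, as in the paper's multi-head design, would replace the fixed distance-based weights with learned ones.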
