MAPPO-PIS: A Multi-Agent Proximal Policy Optimization Method with Prior Intent Sharing for CAVs' Cooperative Decision-Making

Yicheng Guo; Jiaqi Liu; Rongjie Yu; Peng Hang; Jian Sun

TL;DR

MAPPO-PIS addresses cooperative decision-making for CAVs in merging areas under human–machine mixed traffic. It extends MAPPO with an Intention Generator Module (IGM), which generates multi-step future trajectories, and a Safety Enhanced Module (SEM), which detects and corrects unsafe intents. Both modules are integrated in a MARL framework with centralized training and distributed execution. Across diverse traffic densities and heterogeneous vehicle settings, MAPPO-PIS improves safety and efficiency over the baselines, with curriculum learning aiding training and an ablation study validating the modules. Key findings are reduced collision rates, higher average speeds, and more stable learning curves; macro-level analyses indicate delayed bottleneck breakdown and faster recovery in merging flows. The work highlights explicit intent sharing combined with safety-aware correction as a practical approach to improving real-world CAV merging performance.

Abstract

Vehicle-to-Vehicle (V2V) technologies have great potential for enhancing traffic flow efficiency and safety. However, cooperative decision-making in multi-agent systems, particularly in complex human-machine mixed merging areas, remains challenging for connected and autonomous vehicles (CAVs). Intent sharing, a key aspect of human coordination, may offer an effective solution to these decision-making problems, but its application in CAVs is under-explored. This paper presents an intent-sharing-based cooperative method, Multi-Agent Proximal Policy Optimization with Prior Intent Sharing (MAPPO-PIS), which models CAV cooperative decision-making as a Multi-Agent Reinforcement Learning (MARL) problem. The agents' policies are trained and updated with two key modules: the Intention Generator Module (IGM) and the Safety Enhanced Module (SEM). The IGM generates and shares each CAV's intended trajectory over multiple future time steps. The SEM assesses the safety of the resulting decisions and corrects them when necessary. A merging area with human-machine mixed traffic flow is used to validate the method. Results show that MAPPO-PIS significantly improves decision-making performance in multi-agent systems, surpassing state-of-the-art baselines in safety, efficiency, and overall traffic system performance. The code and video demo are available at https://github.com/CCCC1dhcgd/A-MAPPO-PIS.
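
To make the two-module idea concrete, the following is a minimal illustrative sketch, not the paper's implementation. It assumes a one-dimensional lane, a constant-acceleration rollout standing in for the IGM, and a fixed minimum-gap rule with a braking fallback standing in for the SEM. All function names, parameters, and thresholds (rollout_intent, first_conflict, safe_accel, min_gap, fallback_accel) are hypothetical and do not come from the paper or its repository.

```python
from typing import List, Optional, Tuple


def rollout_intent(position: float, speed: float, accel: float,
                   horizon: int = 5, dt: float = 0.5) -> List[float]:
    """IGM stand-in: predict longitudinal positions for `horizon` future steps,
    assuming the vehicle holds a constant acceleration (an assumption)."""
    trajectory = []
    for _ in range(horizon):
        speed = max(0.0, speed + accel * dt)  # vehicles do not reverse
        position += speed * dt
        trajectory.append(position)
    return trajectory


def first_conflict(intent_a: List[float], intent_b: List[float],
                   min_gap: float = 5.0) -> Optional[int]:
    """Return the first future step at which two shared intents come closer
    than `min_gap` metres, or None if they never do within the horizon."""
    for step, (a, b) in enumerate(zip(intent_a, intent_b)):
        if abs(a - b) < min_gap:
            return step
    return None


def safe_accel(ego: Tuple[float, float], other_intent: List[float],
               candidate_accel: float, fallback_accel: float = -3.0,
               min_gap: float = 5.0) -> float:
    """SEM stand-in: keep the candidate action if its rollout is conflict-free,
    otherwise substitute a braking action. The fallback is not re-verified
    here; a real safety module would need to check it as well."""
    ego_intent = rollout_intent(ego[0], ego[1], candidate_accel)
    if first_conflict(ego_intent, other_intent, min_gap) is None:
        return candidate_accel
    return fallback_accel


if __name__ == "__main__":
    # A merging vehicle shares its intent; the ego vehicle checks its own plan.
    other = rollout_intent(position=12.0, speed=5.0, accel=0.0)
    print(safe_accel(ego=(0.0, 10.0), other_intent=other, candidate_accel=2.0))
```

In this sketch the shared intent is simply a list of predicted positions, so the safety check reduces to a per-step gap comparison. In the method described above the intents are produced and consumed by learned policies, which this example does not attempt to model.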

Paper Structure (19 sections, 12 equations, 10 figures, 2 tables, 1 algorithm)

Introduction
Related work
CAVs Cooperation
Intent and intent sharing
Multi-Agent Reinforcement Learning (MARL)
Methodology
Problem Formulation
Intention Generator Module (IGM)
Safety Enhanced Module (SEM)
Update and Optimization
Experiment Results and Analysis
Experimental Setups
Curriculum Learning
Ablation Study
Algorithm Comparison
...and 4 more sections

Figures (10)

Figure 1: Illustration of merging area with human-machine mixed traffic flow.
Figure 2: Overview of the MAPPO-PIS architecture.
Figure 3: The framework of the Multi-Agent Proximal Policy Optimization (MAPPO).
Figure 4: Performance of MAPPO-PIS with and without curriculum learning in Hard Mode (seed set to 0).
Figure 5: Training average reward curves under different traffic scenario levels. The shaded region around each curve indicates the standard deviation.
...and 5 more figures
