Communication-Aware Reinforcement Learning for Cooperative Adaptive Cruise Control
Sicong Jiang, Seongjin Choi, Lijun Sun
TL;DR
CA-RL addresses scalability and robustness in multi-agent CACC with a Communication-Aware Reinforcement Learning framework that uses forward and backward V2V information transfer to speed up cyclic information propagation while keeping a single shared policy. The communication-aware (CA) module processes high-dimensional communication data with neural networks and integrates with actor-critic algorithms (e.g., CA-DDPG, CA-TD3) to enable centralized training with decentralized execution. Evaluation on the NGSIM highway dataset shows that CA-RL improves headway, jerk, speed, time-to-collision (TTC) safety, and string stability, and that it generalizes across varying platoon sizes and unseen scenarios, outperforming the IDM, Krauss, MADDPG, and standard DDPG/TD3 baselines. These results highlight CA-RL's potential to enhance safety, efficiency, and scalability in real-world CACC deployments and point to future work on adapting to different road conditions and broader driving tasks.
Abstract
Cooperative Adaptive Cruise Control (CACC) plays a pivotal role in enhancing traffic efficiency and safety in Connected and Autonomous Vehicles (CAVs). Reinforcement Learning (RL) has proven effective in optimizing complex decision-making processes in CACC, leading to improved system performance and adaptability. Among RL approaches, Multi-Agent Reinforcement Learning (MARL) has shown remarkable potential by enabling coordinated actions among multiple CAVs through Centralized Training with Decentralized Execution (CTDE). However, MARL often faces scalability issues, particularly when CACC vehicles suddenly join or leave the platoon, resulting in performance degradation. To address these challenges, we propose Communication-Aware Reinforcement Learning (CA-RL). CA-RL includes a communication-aware module that extracts and compresses vehicle communication information through forward and backward information transmission modules. This enables efficient cyclic information propagation within the CACC traffic flow, ensuring policy consistency and mitigating the scalability problems of MARL in CACC. Experimental results demonstrate that CA-RL significantly outperforms baseline methods in various traffic scenarios, achieving superior scalability, robustness, and overall system performance while maintaining reliable performance despite changes in the number of participating vehicles.
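The forward and backward transmission idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: the function names (`make_params`, `propagate`, `communication_features`), the message size, and the single tanh layer standing in for the neural network are all assumptions made for illustration. It shows only how one set of shared weights can pass a compressed message down the platoon and back up, so that each vehicle's policy input has a fixed size regardless of how many vehicles are present.

```python
import numpy as np


def make_params(obs_dim, msg_dim, seed=0):
    """Create one shared weight matrix per direction (hypothetical layout)."""
    rng = np.random.default_rng(seed)
    shape = (msg_dim, obs_dim + msg_dim)
    return {
        "W_fwd": rng.normal(scale=0.1, size=shape),
        "W_bwd": rng.normal(scale=0.1, size=shape),
    }


def propagate(obs, W):
    """Pass a message along the platoon in the order the rows of obs are given.

    Returns, for each vehicle, the message it receives from its neighbour.
    """
    msg = np.zeros(W.shape[0])  # the first vehicle in the chain receives nothing
    received = []
    for o in obs:
        received.append(msg)
        # Compress own observation plus incoming message into the outgoing one.
        msg = np.tanh(W @ np.concatenate([o, msg]))
    return np.stack(received)


def communication_features(obs, params):
    """Build each vehicle's policy input: own obs + forward msg + backward msg."""
    fwd = propagate(obs, params["W_fwd"])              # leader -> tail
    bwd = propagate(obs[::-1], params["W_bwd"])[::-1]  # tail -> leader
    return np.concatenate([obs, fwd, bwd], axis=1)


if __name__ == "__main__":
    params = make_params(obs_dim=3, msg_dim=4)
    rng = np.random.default_rng(1)
    for n_vehicles in (3, 5):  # the same weights serve both platoon sizes
        obs = rng.normal(size=(n_vehicles, 3))
        print(n_vehicles, communication_features(obs, params).shape)
```

Because the per-vehicle feature width depends only on the observation and message sizes, a vehicle joining or leaving changes the number of rows but not the input shape of the shared policy, which is the property the abstract attributes to the communication-aware module.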
