Communication-Efficient MARL for Platoon Stability and Energy-efficiency Co-optimization in Cooperative Adaptive Cruise Control of CAVs
Min Hua, Dong Chen, Kun Jiang, Fanggang Zhang, Jinhai Wang, Bo Wang, Quan Zhou, Hongming Xu
TL;DR
This work tackles the joint problem of platoon stability and energy efficiency in cooperative adaptive cruise control (CACC) for connected and autonomous vehicles by developing a fully decentralized multi-agent reinforcement learning (MARL) framework. It introduces a communication-efficient design that integrates Quantized Stochastic Gradient Descent (QSGD) with Binary Differential Consensus (BDC) to reduce bandwidth while preserving learning quality in a decentralized setting. Compared with state-of-the-art MARL methods, the proposed BDC-MARL achieves up to 5.8% additional energy savings and demonstrates stable, safe platoon behavior across various scenarios, including OpenACC real-world data, with a 6-vehicle platoon identified as near-optimal for balancing energy efficiency and stability. The work provides detailed analyses of information-sharing strategies and platoon-size effects, showing practical viability for real-world deployment and offering a foundation for further improvements in communication protocols and long-horizon energy optimization. Overall, the approach advances energy-aware, scalable, and robust CACC in networked vehicle systems using decentralized MARL with efficient communications.
Abstract
Cooperative adaptive cruise control (CACC) has been recognized as a fundamental function of autonomous driving, in which platoon stability and energy efficiency are outstanding challenges that are difficult to accommodate in real-world operations. This paper studies the CACC of connected and autonomous vehicles (CAVs) using multi-agent reinforcement learning (MARL) to optimize platoon stability and energy efficiency simultaneously. Because efficient use of communication bandwidth is key to guaranteeing learning performance in real-world driving, this paper proposes a communication-efficient MARL by incorporating quantized stochastic gradient descent (QSGD) and a binary differential consensus (BDC) method into a fully decentralized MARL framework. We benchmarked the proposed BDC-MARL algorithm against several non-communicative and communicative MARL algorithms, e.g., IA2C, FPrint, and DIAL, evaluating platoon stability, fuel economy, and driving comfort. Our results show that BDC-MARL achieved the highest energy savings, improving by up to 5.8%, with an average velocity of 15.26 m/s and an inter-vehicle spacing of 20.76 m. In addition, we conducted information-sharing analyses to assess communication efficacy, along with sensitivity analyses and scalability tests with varying platoon sizes. The practical effectiveness of our approach is further demonstrated using real-world scenarios sourced from the open-source OpenACC dataset.
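To make the bandwidth argument concrete, the following is a minimal sketch of the standard QSGD stochastic quantizer that the abstract refers to. It is an illustration under stated assumptions, not the authors' implementation: the function name `qsgd_quantize`, the level count `s`, and the example gradient are hypothetical, and the BDC step is not shown because the abstract does not specify it.

```python
import numpy as np

def qsgd_quantize(v, s, rng):
    """Stochastically round each |v_i| / ||v||_2 onto a grid of s uniform levels.

    Hypothetical helper illustrating textbook QSGD; the paper's exact
    quantizer and encoding may differ.
    """
    v = np.asarray(v, dtype=np.float64)
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return np.zeros_like(v)
    scaled = np.abs(v) / norm * s      # position of each entry on the grid, in [0, s]
    lower = np.floor(scaled)           # nearest grid level below
    prob_up = scaled - lower           # round up with this probability (keeps E[Q(v)] = v)
    levels = lower + (rng.random(v.shape) < prob_up)
    return norm * np.sign(v) * levels / s

rng = np.random.default_rng(0)
g = np.array([0.3, -1.2, 0.0, 2.5])   # example gradient (made-up values)
q = qsgd_quantize(g, 4, rng)
```

Each agent would then need to transmit only the norm, the signs, and small integer level indices rather than full-precision gradients. Because the rounding is unbiased, averaging many quantized gradients recovers the original in expectation, which is what allows a lower-bandwidth exchange without systematically skewing the update.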
