A Generative Model Enhanced Multi-Agent Reinforcement Learning Method for Electric Vehicle Charging Navigation
Tianyang Qi, Shibo Chen, Jun Zhang
TL;DR
This work addresses EV charging navigation under dynamic traffic, fluctuating prices, and future charging competition (FCC) by enabling decentralized decision-making based on local information. It introduces a generative model-enhanced multi-agent DRL framework in which a CVAE-LSTM-based recommendation platform and a novel FCC-based encoder provide advisory recommendation information (RI), so that EVs do not need access to the global state during execution. MGDA is employed to balance the DQN and CVAE losses, yielding stable training. Empirical results in a realistic Xi'an scenario show that the method achieves performance close to global-information baselines (within ~8%) while significantly outperforming other local-information methods, with improved training efficiency and scalability. The approach offers privacy-preserving, scalable navigation for large fleets of EVs in real-time charging networks.
Abstract
With the widespread adoption of electric vehicles (EVs), guiding EV drivers to a cost-effective charging station has become an important yet challenging problem due to dynamic traffic conditions, fluctuating electricity prices, and potential competition from other EVs. State-of-the-art deep reinforcement learning (DRL) algorithms for this task still require global information about all EVs at the execution stage, which not only increases communication costs but also raises privacy concerns for EV drivers. To overcome these drawbacks, we introduce a novel generative model-enhanced multi-agent DRL algorithm that uses only each EV's local information while achieving performance comparable to these state-of-the-art algorithms. Specifically, the policy network is implemented on the EV side, and a Conditional Variational Autoencoder-Long Short-Term Memory (CVAE-LSTM)-based recommendation model is developed to provide recommendation information. Furthermore, a novel future charging competition encoder is designed to effectively compress global information, enhancing training performance. The multi-gradient descent algorithm (MGDA) is also used to adaptively balance the weights of the two parts of the training objective, resulting in a more stable training process. Simulations are conducted based on a real-world area in Xi'an, China. Experimental results show that our proposed algorithm, which relies on local information, outperforms existing local information-based methods and incurs less than 8% performance loss compared to global information-based methods.
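To make the MGDA-based balancing of the two loss terms concrete, here is a minimal sketch of the standard closed-form weight for two objectives. It is not the authors' implementation. The function names, the use of flat Python lists for gradients, and the reading of `g1` and `g2` as the gradients of the DQN and CVAE losses with respect to shared parameters are all assumptions made for illustration.

```python
def mgda_two_task_weight(g1, g2, eps=1e-12):
    """Return alpha in [0, 1] minimizing ||alpha*g1 + (1-alpha)*g2||^2.

    g1, g2: flat lists of floats (hypothetical stand-ins for the gradients
    of the two loss terms with respect to shared parameters).
    """
    diff = [a - b for a, b in zip(g1, g2)]
    denom = sum(d * d for d in diff)
    if denom < eps:
        # Gradients are (nearly) identical, so any weighting gives the same result.
        return 0.5
    # Unconstrained minimizer: alpha = ((g2 - g1) . g2) / ||g1 - g2||^2
    num = sum((b - a) * b for a, b in zip(g1, g2))
    # Clip to [0, 1] so the result stays a convex combination.
    return min(1.0, max(0.0, num / denom))


def combine_gradients(g1, g2):
    """Return (alpha, combined gradient) for the min-norm convex combination."""
    alpha = mgda_two_task_weight(g1, g2)
    combined = [alpha * a + (1.0 - alpha) * b for a, b in zip(g1, g2)]
    return alpha, combined


if __name__ == "__main__":
    # Orthogonal gradients of equal norm are weighted equally.
    print(combine_gradients([1.0, 0.0], [0.0, 1.0]))
```

When the two gradients point in the same direction but differ in scale, the weight collapses onto the smaller one, which is one intuition for why this kind of balancing can keep either loss from dominating the update.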
