Online Prediction-Assisted Safe Reinforcement Learning for Electric Vehicle Charging Station Recommendation in Dynamically Coupled Transportation-Power Systems

Qionghua Liao, Guilong Li, Jiajie Yu, Ziyuan Gu, Wei Ma

TL;DR

This work formulates en-route EV charging station (CS) recommendation as a constrained Markov decision process to jointly optimize traffic efficiency and power grid safety in dynamically coupled transportation-power systems. It introduces Online Prediction-Assisted Safe Reinforcement Learning (OP-SRL), which combines a Lagrangian-based PPO framework with an online Seq2Seq predictor to handle long-term constraints and the delay between CS guidance and the start of charging. In case studies on the Nguyen-Dupuis network coupled with the IEEE 33-bus system and on a large real-world Kowloon network coupled with the IEEE 69-bus system, OP-SRL outperforms the baselines in Total Travel Time (TTT), Cumulative Voltage Violation (CVV), and Waiting+Charging Time (WCT), and remains robust to changes in EV penetration, controller interval, and predictor design. The results underscore the value of system-level coupling, adaptive constraint handling, and forward-looking state augmentation for practical, scalable CS guidance in coupled urban power and transport infrastructure.

Abstract

With the proliferation of electric vehicles (EVs), the transportation network and power grid become increasingly interdependent and coupled via charging stations. The concomitant growth in charging demand has posed challenges for both networks, highlighting the importance of charging coordination. Existing literature largely overlooks the interactions between power grid security and traffic efficiency. In view of this, we study the en-route charging station (CS) recommendation problem for EVs in dynamically coupled transportation-power systems. The system-level objective is to maximize the overall traffic efficiency while ensuring the safety of the power grid. This problem is for the first time formulated as a constrained Markov decision process (CMDP), and an online prediction-assisted safe reinforcement learning (OP-SRL) method is proposed to learn the optimal and secure policy by extending the PPO method. To be specific, we mainly address two challenges. First, the constrained optimization problem is converted into an equivalent unconstrained optimization problem by applying the Lagrangian method. Second, to account for the uncertain long-time delay between performing CS recommendation and commencing charging, we put forward an online sequence-to-sequence (Seq2Seq) predictor for state augmentation to guide the agent in making forward-thinking decisions. Finally, we conduct comprehensive experimental studies based on the Nguyen-Dupuis network and a large-scale real-world road network, coupled with IEEE 33-bus and IEEE 69-bus distribution systems, respectively. Results demonstrate that the proposed method outperforms baselines in terms of road network efficiency, power grid safety, and EV user satisfaction. The case study on the real-world network also illustrates the applicability in the practical context.
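
The Lagrangian step mentioned in the abstract can be illustrated with a minimal sketch. This is a generic Lagrangian-relaxation pattern for constrained RL, not the paper's exact formulation: the function names, the learning rate `lr`, and the scalar `cost_limit` are all hypothetical, and the cost is assumed to be a single non-negative safety signal (for example, a voltage-violation measure).

```python
def lagrangian_reward(reward, cost, lam):
    """Penalized reward r - lam * c that a PPO-style learner would maximize.

    Hypothetical helper: `cost` stands for a non-negative safety cost.
    """
    return reward - lam * cost


def update_multiplier(lam, episode_cost, cost_limit, lr=0.05):
    """Projected dual ascent on the multiplier (assumed update rule).

    The multiplier grows when the episode cost exceeds the limit,
    shrinks otherwise, and is clipped at zero.
    """
    return max(0.0, lam + lr * (episode_cost - cost_limit))


if __name__ == "__main__":
    lam = 0.0
    # Illustrative episode costs only; these are not results from the paper.
    for episode_cost in [3.0, 2.0, 0.5]:
        lam = update_multiplier(lam, episode_cost, cost_limit=1.0)
        print(f"episode_cost={episode_cost:.1f} -> lambda={lam:.3f}")
```

In this sketch the multiplier acts as an adaptive penalty weight: persistent constraint violations make unsafe actions progressively less attractive, while a safe policy lets the penalty decay back toward zero.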

Paper Structure

This paper contains 31 sections, 32 equations, 42 figures, 10 tables, and 2 algorithms.

Figures (42)

  • Figure 1: Illustration of the synergy in the dynamically coupled transportation-power systems. (1) The left diagram illustrates the coupling relationships between UTN and PDN through EVs, CSs, and buses, as well as the information flow among various components in the coupled systems. (2) The diagram on the right explains the working mechanism of the charge controller and its impact on traffic efficiency.
  • Figure 2: The correspondence diagram of trip activities, timeline, and state of charge (SoC) change for exemplary EV $i$ ($i \in \mathcal{I}^\mathrm{EV}$) with charging request. (1) The activities chain in the middle illustrates the sequence of events of EV $i$ from the starting point to the trip destination, including the driving process from origin to target CS, the queuing process (if present), the charging process, and the driving process to the destination. (2) The timeline chain at the bottom shows the start time (inside the boxes) and duration (between two boxes) of each activity using variable representations. (3) The SoC chain at the top represents the SoC change at each key time point (in the boxes) and the variations in battery energy (between two boxes).
  • Figure 3: A schematic overview of the proposed Online Prediction-Assisted SRL (OP-SRL) method.
  • Figure 4: Schematic diagram of the integration between the Nguyen-Dupuis transportation network and the IEEE 33-bus distribution system.
  • Figure 6: Time-varying vehicle count at each charging station (CS) as a result of the proposed method. The vehicle count here comprises both queued vehicles and charging vehicles. The black dashed line represents the service capacity of each charging station, i.e., the number of charging piles.
  • ...and 37 more figures