Preventive Energy Management for Distribution Systems Under Uncertain Events: A Deep Reinforcement Learning Approach
Md Isfakul Anam, Tuyen Vu, Jianhua Zhang
TL;DR
The paper addresses distribution-system resilience under uncertain events by integrating each component's probability of failure (PoF) into a conditional-value-at-risk (CVaR)-based optimization framework for a preventive energy management system (EMS). It solves the resulting stochastic problem with a deep reinforcement learning (DRL) agent trained by proximal policy optimization (PPO), framing the EMS as a Markov decision process (MDP) and designing a reward structure that enforces operational constraints while prioritizing critical loads. Key contributions include a CVaR reformulation for scalable scenario handling, the application of PPO to a power-system context, and a demonstration of resilience gains on a notional medium-voltage DC (MVDC) ship system and a modified IEEE 30-bus network. The results indicate that the DRL approach offers rapid decision-making and adaptability to future uncertainties, outperforming traditional optimization in flexibility and speed, with potential for extension to multi-agent and graph-based RL architectures.
Abstract
As power systems become more complex with the continuous integration of intelligent distributed energy resources (DERs), new risks and uncertainties arise. Consequently, to enhance system resilience, it is essential to account for various uncertain events when formulating the optimization problem for the energy management system (EMS). This paper presents a preventive EMS that considers the probability of failure (PoF) of each system component across different scenarios. A conditional-value-at-risk (CVaR)-based framework is proposed to integrate the uncertainties of the distribution network. Loads are classified into critical, semi-critical, and non-critical categories so that essential loads are prioritized during generation resource shortages. A proximal policy optimization (PPO)-based reinforcement learning (RL) agent is used to solve the formulated problem and generate the control decisions. The proposed framework is evaluated on a notional medium-voltage DC (MVDC) ship system and a modified IEEE 30-bus system, where the results demonstrate that the PPO agent can successfully optimize the objective function while satisfying the network and operational constraints. For validation, the RL-based method is benchmarked against a traditional optimization approach, further highlighting its effectiveness and robustness. This comparison shows that RL agents can offer greater resilience against future uncertain events than traditional solution methods because of their adaptability and learning capacity.
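To make the CVaR idea concrete, the following minimal sketch computes the CVaR of a discrete set of scenario losses using the standard Rockafellar-Uryasev form, CVaR_alpha = min over eta of { eta + E[max(L - eta, 0)] / (1 - alpha) }. It is an illustration only, not the paper's formulation: the function name `scenario_cvar`, the example losses, and the scenario probabilities (which in the paper's setting would presumably be derived from component PoFs) are all assumptions made here.

```python
from typing import Sequence


def scenario_cvar(losses: Sequence[float],
                  probs: Sequence[float],
                  alpha: float) -> float:
    """CVaR at confidence level alpha for a discrete loss distribution.

    Hypothetical helper for illustration. Uses the Rockafellar-Uryasev
    objective, which is convex and piecewise linear in eta with
    breakpoints at the scenario losses, so the minimum is attained at
    one of those losses.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    if len(losses) != len(probs):
        raise ValueError("losses and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("scenario probabilities must sum to 1")

    def objective(eta: float) -> float:
        # Expected loss in excess of the threshold eta.
        excess = sum(p * max(loss - eta, 0.0)
                     for loss, p in zip(losses, probs))
        return eta + excess / (1.0 - alpha)

    # Evaluate the objective at every breakpoint and keep the minimum.
    return min(objective(eta) for eta in losses)


# Assumed example: unserved load (arbitrary units) in four failure
# scenarios, with assumed scenario probabilities.
example_losses = [0.0, 10.0, 50.0, 100.0]
example_probs = [0.70, 0.20, 0.08, 0.02]

cvar_90 = scenario_cvar(example_losses, example_probs, alpha=0.9)
expected_loss = scenario_cvar(example_losses, example_probs, alpha=0.0)
```

With alpha = 0 the measure reduces to the expected loss, while larger alpha values weight only the worst tail of scenarios, which is why a CVaR objective suits a preventive, risk-averse dispatch.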
