Safe Imitation Learning-based Optimal Energy Storage Systems Dispatch in Distribution Networks

Shengren Hou, Peter Palensky, Pedro P. Vergara

TL;DR

Simulation results demonstrate the efficacy of Safe IRL in balancing operational efficiency and safety, eliminating voltage violations, and maintaining low operation cost errors across various network sizes, while meeting real-time execution requirements.

Abstract

The integration of distributed energy resources (DERs) has escalated the challenge of voltage magnitude regulation in distribution networks. Traditional model-based approaches, which rely on complex sequential mathematical formulations, struggle to meet real-time operational demands. Deep reinforcement learning (DRL) offers a promising alternative by enabling offline training with distribution network simulators, followed by real-time execution. However, DRL algorithms tend to converge to local optima due to limited exploration efficiency. Additionally, DRL algorithms cannot enforce voltage magnitude constraints, which can lead to operational violations when they are deployed in distribution network operation. This study addresses these challenges by proposing a novel safe imitation reinforcement learning (IRL) framework that combines IRL with a purpose-designed safety layer to optimize the operation of Energy Storage Systems (ESSs) in active distribution networks. The proposed safe IRL framework comprises two phases: offline training and online execution. During the offline phase, optimal state-action pairs are collected using a nonlinear programming (NLP) solver and used to guide the IRL policy iteration. In the online phase, the trained IRL policy's decisions are adjusted by the safety layer to maintain safety and constraint compliance. Simulation results demonstrate the efficacy of Safe IRL in balancing operational efficiency and safety, eliminating voltage violations, and maintaining low operation cost errors across various network sizes, while meeting real-time execution requirements.
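To make the online phase more concrete, the following is a minimal sketch of how a safety layer could adjust a policy's ESS dispatch before it is applied. It is not the paper's formulation: it assumes a linearized voltage model (node voltage = base voltage + sensitivity × ESS power) and uses a simple iterative projection onto the most violated voltage limit. The names `safety_layer`, `v_base`, and `sens`, as well as the 0.95 to 1.05 p.u. limits, are hypothetical and chosen for illustration only. ESS power and SOC limits are deliberately left out.

```python
from typing import List, Sequence


def safety_layer(
    action: Sequence[float],
    v_base: Sequence[float],
    sens: Sequence[Sequence[float]],
    v_min: float = 0.95,   # assumed lower voltage limit (p.u.)
    v_max: float = 1.05,   # assumed upper voltage limit (p.u.)
    max_iters: int = 50,
    tol: float = 1e-9,
) -> List[float]:
    """Adjust a proposed ESS dispatch so predicted voltages stay within limits.

    Assumes the linear model v_i = v_base[i] + sum_j sens[i][j] * action[j].
    Each iteration projects the action onto the half-space defined by the
    most violated voltage limit (a Euclidean projection for that constraint).
    """
    a = list(action)
    for _ in range(max_iters):
        # Predicted voltage at every monitored node for the current action.
        v = [
            vb + sum(s * x for s, x in zip(row, a))
            for vb, row in zip(v_base, sens)
        ]
        # Violation per node: positive when outside [v_min, v_max].
        excess = [max(vi - v_max, v_min - vi) for vi in v]
        worst = max(range(len(v)), key=lambda i: excess[i])
        if excess[worst] <= tol:
            break  # all predicted voltages are within limits
        row = sens[worst]
        norm2 = sum(g * g for g in row)
        if norm2 == 0.0:
            break  # the ESSs cannot influence this node in the linear model
        # Move against the gradient for over-voltage, along it for under-voltage.
        sign = 1.0 if v[worst] > v_max else -1.0
        step = excess[worst] / norm2
        a = [x - sign * step * g for x, g in zip(a, row)]
    return a


if __name__ == "__main__":
    # Two ESSs, two monitored nodes; all numbers are made up for illustration.
    proposed = [1.0, 0.5]
    safe = safety_layer(proposed, v_base=[1.04, 1.00],
                        sens=[[0.02, 0.0], [0.0, 0.02]])
    print("proposed:", proposed, "adjusted:", safe)
```

In this sketch, an action that already satisfies the voltage limits passes through unchanged, so the layer only intervenes when the learned policy would cause a predicted violation.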

Paper Structure

This paper contains 16 sections, 19 equations, 3 figures, 3 tables, and 1 algorithm.

Figures (3)

  • Figure 1: Overall workflow of the proposed framework. The framework is composed of offline and online phases. The offline training is performed once, while the online operation is conducted at each time step $t$.
  • Figure 2: (a): Voltage magnitude at the nodes where the ESSs are connected, disregarding their operation. (b): Price in €/MWh. (c), (e), (g): Voltage magnitude at the nodes where the ESSs are connected, and (d), (f), (h): state of charge (SOC) of the ESSs, after executing the dispatch decisions.
  • Figure 3: ESS dispatch patterns between 12:00 and 16:00, as produced by the TD3BC and SafeTD3BC algorithms.