Learning the Optimal Power Flow: Environment Design Matters

Thomas Wolgast, Astrid Nieße

TL;DR

This work collects and implements diverse environment design decisions from the literature regarding training data, observation space, episode definition, and reward function choice, and shows the significant impact of these environment design options on RL-OPF training performance.

Abstract

To solve the optimal power flow (OPF) problem, reinforcement learning (RL) emerges as a promising new approach. However, the RL-OPF literature is strongly divided regarding the exact formulation of the OPF problem as an RL environment. In this work, we collect and implement diverse environment design decisions from the literature regarding training data, observation space, episode definition, and reward function choice. In an experimental analysis, we show the significant impact of these environment design options on RL-OPF training performance. Further, we derive some first recommendations regarding the choice of these design decisions. The created environment framework is fully open-source and can serve as a benchmark for future research in the RL-OPF field.
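
To make the four design dimensions named above concrete, the following is a minimal sketch of how they might appear as parameters of an environment with a Gymnasium-style `reset`/`step` interface. It is not the authors' framework: `EnvDesign`, `ToyOPFEnv`, all option names, and the single-generator toy problem (quadratic cost, power-balance violation) are assumptions made purely for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class EnvDesign:
    """Hypothetical bundle of the four design dimensions (names are assumptions)."""
    load_range: tuple = (0.2, 1.0)   # training data: range for sampled load states
    observe_extra: bool = False      # observation space: add a redundant feature
    steps_per_episode: int = 1       # episode definition: 1 = single-step episodes
    penalty_weight: float = 10.0     # reward function: weight on violations


class ToyOPFEnv:
    """Toy one-generator dispatch problem, not a real power flow model."""

    def __init__(self, design, seed=0):
        self.design = design
        self.rng = random.Random(seed)
        self.load = 0.0
        self.t = 0

    def _obs(self):
        obs = [self.load]
        if self.design.observe_extra:
            obs.append(self.load ** 2)  # redundant feature derived from the load
        return obs

    def reset(self):
        # Sample a new load state; this is where the training data choice enters.
        self.load = self.rng.uniform(*self.design.load_range)
        self.t = 0
        return self._obs(), {}

    def step(self, action):
        cost = action ** 2                   # quadratic generation cost
        violation = abs(action - self.load)  # unmet power balance
        reward = -cost - self.design.penalty_weight * violation
        self.t += 1
        terminated = self.t >= self.design.steps_per_episode
        info = {"cost": cost, "violation": violation}
        return self._obs(), reward, terminated, False, info
```

In this sketch, changing `steps_per_episode` or `penalty_weight` alters the learning problem without touching the underlying toy system, which is the kind of variation the experimental analysis compares.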

Paper Structure (37 sections, 23 equations, 7 figures, 2 tables)

Figures (7)

  • Figure 1: The procedure and API of the developed RL-OPF environment framework, following the Gymnasium API.
  • Figure 2: VoltageControl: Scatter plot of normalized objective values and sum of violations.
  • Figure 3: EcoDispatch: Scatter plot of normalized objective values and sum of violations.
  • Figure 4: Training Data - Comparison of design options regarding optimization MAPE (first row), share of invalid solutions (second row), and variance in both the VoltageControl environment and the EcoDispatch environment (arranged in columns).
  • Figure 5: Observation Space - Comparison of design options regarding optimization MAPE (first row), share of invalid solutions (second row), and variance in both the VoltageControl environment and the EcoDispatch environment (arranged in columns).
  • ...and 2 more figures
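
The captions of Figures 4 and 5 refer to two evaluation metrics: the optimization MAPE and the share of invalid solutions. As a rough sketch of how such metrics can be computed, assuming that MAPE compares the agent's objective values against reference optima and that a solution counts as invalid when its summed constraint violation exceeds a small tolerance (both definitions, the function names, and the `tol` parameter are assumptions, not taken from the paper):

```python
def mape(agent_values, optimal_values):
    """Mean absolute percentage error (in percent) against reference optima."""
    errors = [abs(a - o) / abs(o) for a, o in zip(agent_values, optimal_values)]
    return 100.0 * sum(errors) / len(errors)


def invalid_share(violation_sums, tol=1e-6):
    """Fraction of solutions whose summed violation exceeds tol (assumed rule)."""
    return sum(v > tol for v in violation_sums) / len(violation_sums)
```

For example, objective values of 110 and 90 against reference optima of 100 each give a MAPE of 10 percent, and two violating solutions out of four give an invalid share of 0.5.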