Managing power grids through topology actions: A comparative study between advanced rule-based and reinforcement learning agents

Malte Lehna; Jan Viebahn; Christoph Scholz; Antoine Marot; Sven Tomforde

TL;DR

This article analyses the agent submitted by Binbinchen and proposes novel strategies to improve it, for both the RL and the rule-based approach. It also observes that, with the N-1 strategy, the actions of the agents become more diversified.

Abstract

The operation of electricity grids has become increasingly complex due to the current upheaval and the increase in renewable energy production. As a consequence, active grid management is reaching its limits with conventional approaches. In the context of the Learning to Run a Power Network challenge, it has been shown that Reinforcement Learning (RL) is an efficient and reliable approach with considerable potential for automatic grid operation. In this article, we analyse the agent submitted by Binbinchen and provide novel strategies to improve the agent, for both the RL and the rule-based approach. The main improvement is an N-1 strategy, in which we consider topology actions that keep the grid stable even if one line is disconnected. Moreover, we propose a topology reversion to the original grid, which proved to be beneficial. The improvements are tested against reference approaches on the challenge test sets and increase the performance of the rule-based agent by 27%. In a direct comparison between the rule-based and the RL agent, we find similar performance. However, the RL agent has a clear computational advantage. We also analyse the behaviour in an exemplary case in more detail to provide additional insights. Here, we observe that through the N-1 strategy, the actions of the agents become more diversified.
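The idea behind the N-1 strategy can be illustrated with a minimal sketch. It is not the authors' implementation: the `simulate` callback, the action names, and the toy loading values are all hypothetical, and "stable" is approximated here as a maximum line loading below 1.0. Each candidate topology action is scored by its worst-case line loading over all single-line outages, and the action with the lowest worst case is selected.

```python
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical simulator interface (an assumption, not a Grid2Op API):
# given a candidate action and an optional disconnected line, return the
# maximum line loading in the grid after applying the action.
Simulator = Callable[[str, Optional[int]], float]


def n_minus_1_score(action: str, lines: Iterable[int], simulate: Simulator) -> float:
    """Worst-case maximum line loading of `action` over all single-line outages."""
    return max(simulate(action, outage) for outage in lines)


def select_action(
    actions: Iterable[str],
    lines: Iterable[int],
    simulate: Simulator,
    rho_limit: float = 1.0,
) -> Tuple[Optional[str], float]:
    """Return the action with the lowest worst-case N-1 loading and that loading."""
    lines = list(lines)
    best_action, best_score = None, float("inf")
    for action in actions:
        # Skip actions that already overload the intact grid.
        if simulate(action, None) >= rho_limit:
            continue
        score = n_minus_1_score(action, lines, simulate)
        if score < best_score:
            best_action, best_score = action, score
    return best_action, best_score


# Toy loading values (invented for illustration only).
TOY_RHO = {
    ("do_nothing", None): 0.95, ("do_nothing", 0): 1.30, ("do_nothing", 1): 1.10,
    ("split_sub_A", None): 0.70, ("split_sub_A", 0): 1.05, ("split_sub_A", 1): 0.90,
    ("split_sub_B", None): 0.80, ("split_sub_B", 0): 0.92, ("split_sub_B", 1): 0.88,
}


def toy_simulate(action: str, outage: Optional[int]) -> float:
    return TOY_RHO[(action, outage)]


best, worst_case = select_action(
    ["do_nothing", "split_sub_A", "split_sub_B"], [0, 1], toy_simulate
)
print(best, worst_case)
```

In this toy example, `split_sub_A` gives the lowest loading on the intact grid but overloads a line if line 0 is lost, so the N-1 criterion prefers `split_sub_B`. A caller would still need to check the returned worst-case loading against the limit before treating the action as N-1 secure.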

Paper Structure (27 sections, 8 figures, 5 tables, 1 algorithm)


Introduction
Overview
Research Contribution
Related Work
The Grid2Op Environment
The Teacher-Tutor-Junior-Senior Framework
Solution by Binbinchen
The Teacher
The Tutor
The Junior
The Senior
Methodological Improvements of the existing agent
N-1 Strategy Improvements
Topology Reversion Improvement
Code Improvements
...and 12 more sections

Figures (8)

Figure 1: Visualisation of an exemplary topology action, adapted from the paper of marot2020learning. The original grid shows an overload of the right line (in red) at time step $t$, due to a high demand from both load sinks. By executing a topology action, i.e., splitting the load flow of the substation into two separate nodes, the bottom-right substation can divert the power and the grid returns to a more stable state without an overflow.
Figure 2: The electricity grid of the robustness track, based on a subset of the IEEE118 grid. The grid contains a total of 35 substations that are interconnected by power lines, with generators and load sinks located in different parts of the grid. In the original state, all power lines are connected to bus 1 of their substations. Through topology actions, these connections can be changed to bus 2. The figure was created with the internal plot method of Grid2Op.
Figure 3: Visualisation of the average survival time of the six compared agents (shown in blue, red, green, purple, orange and turquoise). Each bar represents the average survival time of the agent in the respective scenario across all seeds. The overall average is reported in the legend.
Figure 4: Visualisation of the different topological actions of the three compared agents. The actions are sorted according to the frequency of their substations. The most frequently used substation is at the top right, then all the other substations are ordered counterclockwise. The colour coding is the same for all three agents.
Figure 5: Display of the computation time for each action. The left graph shows the computation time, where each point is one action of the agent. The vertical lines correspond to a specific scenario. On the right, we aggregate the computation time across all scenarios in a rug plot. Note that for comparison, we only include the computation times if all three agents survived until the given time step.
...and 3 more figures
