Multi-Objective Reinforcement Learning for Power Grid Topology Control

Thomas Lautenbacher; Ali Rajaei; Davide Barbieri; Jan Viebahn; Jochen L. Cremer

TL;DR

Topology control can relieve transmission-grid congestion, but operators must balance grid security against asset wear and switching. The authors propose a MORL framework that combines Deep Optimistic Linear Support (DOL) with Multi-Objective PPO (MOPPO) to generate a Pareto set of policies under three rewards: $R^L$, $R^D$, and $R^F$. On a 5-bus Grid2Op case study, DOL yields a denser, more representative Pareto front than random sampling, and the multi-objective policies are more robust to N-1 contingencies and train more efficiently than a single-objective policy. The results suggest practical value for operators who need to weigh several trade-offs in topology actions, and point to future work on larger grids and additional objectives.

Abstract

Transmission grid congestion increases as the electrification of various sectors requires transmitting more power. Topology control, through substation reconfiguration, can reduce congestion, but its potential remains under-exploited in operations. A challenge is modeling the topology control problem so that it aligns well with the objectives and constraints of operators. Addressing this challenge, this paper investigates the application of multi-objective reinforcement learning (MORL) to integrate multiple conflicting objectives for power grid topology control. We develop a MORL approach using deep optimistic linear support (DOL) and multi-objective proximal policy optimization (MOPPO) to generate a set of Pareto-optimal policies that balance objectives such as minimizing line loading, topological deviation, and switching frequency. Initial case studies show that the MORL approach can provide valuable insights into objective trade-offs and improve Pareto front approximation compared to a random search baseline. Compared to the common single-objective RL policy, the generated multi-objective RL policies are 30% more successful in preventing grid failure under contingencies and 20% more effective when the training budget is reduced.
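To make the notion of a Pareto-optimal policy set concrete, the sketch below shows linear scalarization of a vector-valued return and a non-dominated filter. It is a minimal illustration, not the authors' DOL or MOPPO implementation. The function names `scalarize` and `pareto_front`, the return values in `policy_returns`, and the convention that every objective is expressed as a reward to maximize are all assumptions made for the example.

```python
import numpy as np


def scalarize(returns, weights):
    """Linear scalarization: weighted sum of a vector-valued return."""
    return float(np.dot(weights, returns))


def pareto_front(points):
    """Indices of non-dominated rows, assuming every objective is maximized."""
    points = np.asarray(points, dtype=float)
    keep = []
    for i, p in enumerate(points):
        # p is dominated if some row is >= p everywhere and > p somewhere.
        at_least_as_good = np.all(points >= p, axis=1)
        strictly_better = np.any(points > p, axis=1)
        if not np.any(at_least_as_good & strictly_better):
            keep.append(i)
    return keep


# Hypothetical returns of four policies on three objectives
# (illustrative numbers only, not results from the paper).
policy_returns = np.array([
    [0.9, 0.2, 0.3],
    [0.6, 0.7, 0.5],
    [0.5, 0.6, 0.4],  # dominated by the row above
    [0.3, 0.9, 0.8],
])

front = pareto_front(policy_returns)

# Different weight vectors select different policies from the same set.
best_for_first = max(front, key=lambda i: scalarize(policy_returns[i], [1.0, 0.0, 0.0]))
best_for_third = max(front, key=lambda i: scalarize(policy_returns[i], [0.0, 0.0, 1.0]))
```

Each weight vector corresponds to one operator preference. A method such as DOL chooses which weight vectors to train for, whereas a random search baseline samples them blindly.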
