Self-Supervised Graph Neural Networks for Optimal Substation Reconfiguration

Antoine Martinez, Balthazar Donon, Louis Wehenkel, Efthymios Karangelos

Abstract

Changing the transmission system topology is an efficient, cost-free lever to reduce congestion or increase exchange capacities. The problem of finding the optimal switch states within substations is called Optimal Substation Reconfiguration (OSR), and may be framed as a Mixed Integer Linear Program (MILP). Current state-of-the-art optimization techniques come with prohibitive computing times, making them impractical for real-time decision-making. Meanwhile, deep learning offers a promising alternative with drastically smaller computing times, at the price of an expensive training phase and the absence of optimality guarantees. In this work, we frame OSR as an Amortized Optimization problem, where a Graph Neural Network (GNN) model (our data being graphs) is trained in a self-supervised way to improve the objective function. We apply our approach to the maximization of the exchange capacity between two areas of a small-scale 12-substation system. Once trained, our GNN model improves the exchange capacity by 10.2% on average compared to the all-connected configuration, while a classical MILP solver reaches an average improvement of 15.2% with computing times that are orders of magnitude larger.

Paper Structure

This paper contains 25 sections, 16 equations, 4 figures, 1 table, 1 algorithm.

Figures (4)

  • Figure 1: Twelve-substation test case. Circles represent substations that are either of Type 1 (b) or of Type 2 (c), interconnected by double lines. The objective is to find which switches (in red) to open in order to maximize the transfer capacity from area $Z_1$ to area $Z_2$.
  • Figure 2: Impact of the decision variable $y$ on the topology of a toy example. For a simple base case (a) with Type 1 substations, $y=[1, 1, 1, 1]$ results in a four-bus system (b), while $y=[1, 0, 1, 1]$ corresponds to a five-bus topology (c).
  • Figure 3: Conversion of a two-dimensional combinatorial optimization problem into a continuous surrogate optimization problem. For a given context $x$, the decision variable $y = (y_1, y_2)$ is defined over a discrete set $\{0, 1\}^2$, as illustrated in (a). First, we introduce a continuous surrogate decision variable $z=(z_1, z_2)$ defined over $\mathbb{R}^2$. Then, we replace the initial objective function $f$ with a continuous surrogate objective function $f_\rho^\beta$ defined in equation \ref{eq:surrogate_objective} and illustrated in (b) and (c).
  • Figure 4: Score and behavior histograms of the different models along with the baselines. Histograms (a), (b), (c), (d) and (e) are computed over the whole test set (one value per context). Histogram (f) is computed over the different switches (one value per switch). The scale of the vertical axis is linear, and is shared between histograms of the same row. In (b), (c), (d) and (f), Dirac peaks are cropped for the sake of readability.
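
The idea behind Figure 3, replacing a discrete decision $y \in \{0, 1\}^2$ with a continuous variable $z \in \mathbb{R}^2$, can be illustrated with a small sketch. The paper's surrogate $f_\rho^\beta$ is defined in an equation not reproduced here, so the code below does not implement it. It swaps in a generic relaxation instead: the expected value of $f$ when each $y_i$ is drawn independently from a Bernoulli distribution with probability $\mathrm{sigmoid}(z_i)$. The toy objective `f` and the helper names (`surrogate`, `ascend`, `decode`) are all hypothetical and chosen only for illustration.

```python
import itertools
import math


def f(y):
    """Hypothetical toy objective over y in {0,1}^2 (not taken from the paper)."""
    y1, y2 = y
    return 1.0 + 0.5 * y1 - 0.8 * y2 + 0.6 * y1 * y2


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def surrogate(z):
    """Generic relaxation (an assumption, not the paper's f_rho^beta):
    expected value of f when y_i ~ Bernoulli(sigmoid(z_i)) independently."""
    p = [sigmoid(t) for t in z]
    total = 0.0
    for y in itertools.product((0, 1), repeat=len(z)):
        prob = 1.0
        for yi, pi in zip(y, p):
            prob *= pi if yi == 1 else 1.0 - pi
        total += prob * f(y)
    return total


def ascend(z, steps=500, lr=0.5, eps=1e-5):
    """Plain gradient ascent on the surrogate, using central finite differences."""
    z = list(z)
    for _ in range(steps):
        grad = []
        for i in range(len(z)):
            up = z[:i] + [z[i] + eps] + z[i + 1:]
            down = z[:i] + [z[i] - eps] + z[i + 1:]
            grad.append((surrogate(up) - surrogate(down)) / (2 * eps))
        z = [zi + lr * gi for zi, gi in zip(z, grad)]
    return z


def decode(z):
    """Map the continuous variable back to a discrete decision by thresholding."""
    return tuple(1 if t > 0 else 0 for t in z)


# Exhaustive search is feasible here only because there are 2^2 candidates.
best_discrete = max(itertools.product((0, 1), repeat=2), key=f)

z_star = ascend([0.0, 0.0])
y_hat = decode(z_star)
print("brute force:", best_discrete, "relaxation:", y_hat)
```

On this toy objective the relaxed search and the exhaustive search agree. In general a relaxation of this kind offers no such guarantee, which is consistent with the abstract's remark that the learned approach comes without optimality guarantees.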