PowerModelsGAT-AI: Physics-Informed Graph Attention for Multi-System Power Flow with Continual Learning

Chidozie Ezeakunne, Jose E. Tabarez, Reeju Pokharel, Anup Pandey

Abstract

Solving the alternating current power flow equations in real time is essential for secure grid operation, yet classical Newton-Raphson solvers can be slow under stressed conditions. Existing graph neural networks for power flow are typically trained on a single system and often degrade on different systems. We present PowerModelsGAT-AI, a physics-informed graph attention network that predicts bus voltages and generator injections. The model uses bus-type-aware masking to handle different bus types and balances multiple loss terms, including a power-mismatch penalty, using learned weights. We evaluate the model on 14 benchmark systems (4 to 6,470 buses) and train a unified model on 13 of these under N-2 (two-branch outage) conditions, achieving an average normalized mean absolute error of 0.89% for voltage magnitudes and R^2 > 0.99 for voltage angles. We also show continual learning: when adapting a base model to a new 1,354-bus system, standard fine-tuning causes severe forgetting with error increases exceeding 1000% on base systems, while our experience replay and elastic weight consolidation strategy keeps error increases below 2% and in some cases improves base-system performance. Interpretability analysis shows that learned attention weights correlate with physical branch parameters (susceptance: r = 0.38; thermal limits: r = 0.22), and feature importance analysis supports that the model captures established power flow relationships.
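The bus-type-aware masking mentioned in the abstract can be illustrated with the standard power flow convention for which quantities are unknown at each bus type. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the MATPOWER-style bus-type codes, the column order `[V_m, delta, P_g, Q_g]`, and the helper names `supervision_mask` and `masked_mae` are all hypothetical.

```python
import numpy as np

# Assumed MATPOWER-style bus-type codes: 1 = PQ (load), 2 = PV (generator), 3 = slack.
PQ, PV, SLACK = 1, 2, 3

# Columns: [V_m, delta, P_g, Q_g]. True marks a quantity that is unknown
# (and therefore supervised) for that bus type in a standard power flow.
UNKNOWNS = {
    PQ:    [True,  True,  False, False],  # voltage magnitude and angle unknown
    PV:    [False, True,  False, True],   # angle and reactive generation unknown
    SLACK: [False, False, True,  True],   # active and reactive generation unknown
}

def supervision_mask(bus_types):
    """Return a (num_buses, 4) boolean mask of unknown quantities per bus."""
    return np.array([UNKNOWNS[int(t)] for t in bus_types], dtype=bool)

def masked_mae(pred, target, mask):
    """Mean absolute error computed over the masked (unknown) entries only."""
    return float(np.abs(pred - target)[mask].mean())

# Example: a three-bus system with one bus of each type.
mask = supervision_mask([SLACK, PV, PQ])
```

With a mask of this kind, known quantities (for example, the voltage setpoint at a PV bus) do not contribute to the supervised error, so one set of output heads can serve all bus types.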

Paper Structure (40 sections, 21 equations, 5 figures, 14 tables)


Figures (5)

  • Figure 1: PowerModelsGAT-AI Architecture. The framework comprises (1) a bus-type-aware input stage with a supervision mask, (2) an encoder stack of 4 Pre-Norm Residual GATv2 blocks, (3) a shared decoding trunk, (4) multi-head outputs for $V_m,\,\delta,\,P_g,\,Q_g$, and (5) a physics-informed loss $\mathcal{L}_{\text{phy}}$. Red dashed lines indicate how the predicted outputs and the supervision mask feed into $\mathcal{L}_{\text{phy}}$.
  • Figure 2: PowerModelsGAT-AI Performance across 13 Power Systems. Parity plots compare model predictions (y-axis) against the Newton-Raphson reference solution (x-axis) for all unknown variables in the unified 13-system training and test sets; color intensity represents point density.
  • Figure 3: Learned Branch Importance Maps. Darker edges indicate higher importance. (a) case14: the model assigns higher importance to key branches. (b) case57: importance scores vary across the system.
  • Figure 4: Analysis of Branch Importance Scores. (a) Correlation between branch importance and physical branch parameters. (b) Distribution of branch importance scores across benchmark systems.
  • Figure 5: Global Stability of Bus Feature Sensitivity. Boxplots show the distribution of normalized bus feature sensitivity scores across 13 systems for (a) $V_m$, (b) $\delta$, (c) $P_g$, and (d) $Q_g$.
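
The physics-informed loss $\mathcal{L}_{\text{phy}}$ in Figure 1 is described in the abstract as including a power-mismatch penalty. The exact form of the loss is not given in this section, so the sketch below only shows the standard AC power balance residual, $S_i = V_i \left(\sum_j Y_{ij} V_j\right)^*$, and an assumed mean-squared penalty on it. The function names `power_mismatch` and `mismatch_penalty` and the two-bus example values are hypothetical.

```python
import numpy as np

def power_mismatch(v_m, delta, y_bus, p_inj, q_inj):
    """Active and reactive power mismatch at each bus (per unit).

    v_m, delta : predicted voltage magnitudes and angles (radians)
    y_bus      : complex bus admittance matrix
    p_inj, q_inj : net injections (generation minus load) at each bus
    """
    v = v_m * np.exp(1j * delta)        # complex bus voltages
    s_calc = v * np.conj(y_bus @ v)     # S_i = V_i * conj(sum_j Y_ij V_j)
    return p_inj - s_calc.real, q_inj - s_calc.imag

def mismatch_penalty(dp, dq):
    """Assumed penalty form: mean squared mismatch over all buses."""
    return float(np.mean(dp**2 + dq**2))

# Hypothetical two-bus example: one line with impedance 0.01 + 0.1j p.u.
y = 1.0 / (0.01 + 0.1j)
y_bus = np.array([[y, -y], [-y, y]])
dp, dq = power_mismatch(
    v_m=np.array([1.0, 1.0]),
    delta=np.zeros(2),
    y_bus=y_bus,
    p_inj=np.array([0.5, -0.5]),
    q_inj=np.zeros(2),
)
penalty = mismatch_penalty(dp, dq)
```

In this example the flat voltage profile carries no flow, so the full 0.5 p.u. injection at each bus shows up as mismatch. A voltage prediction that satisfies the power flow equations would drive the penalty to zero.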