Trustworthiness Layer for Foundation Models in Power Systems: Application for N-k Contingency Assessment

Antonio Alcántara; Spyros Chatzivasileiadis

TL;DR

The paper tackles the challenge of trustworthy, real-time contingency analysis in power systems by turning GridFM outputs into statistically valid uncertainty intervals using a Stratified Conformal Prediction layer. It couples a physics-informed fine-tuning regime with a stratified calibration scheme that adapts prediction intervals to contingency severity and grid element, ensuring safe, AC-faithful state reconstruction. On the IEEE-24 and IEEE-118 systems, the approach yields recall values exceeding $0.9$ for congestion screening, delivers up to $2$–$3\times$ precision gains over DC Power Flow, and achieves up to $18\times$ faster inference than AC Power Flow while providing the full AC state. Importantly, the method generalizes to unseen high-order contingencies up to $N$-$5$ when trained on lower-order outages, highlighting practical impact for real-time grid security with quantified uncertainty.

Abstract

This work introduces for the first time, to our knowledge, a trustworthiness layer for foundation models in power systems. Using stratified conformal prediction, we devise adaptive, statistically valid confidence bounds for each output of a foundation model. For regression, this allows users to obtain an uncertainty estimate for each output; for screening, it supports conservative decisions that minimize false negatives. We demonstrate our method by enhancing GridFM, the first open-source Foundation Model for power systems, with statistically valid prediction intervals instead of heuristic error margins. We apply it for N-k contingency assessment, a combinatorial NP-Hard problem. We show that trustworthy GridFM can offer richer and more accurate information than DC Power Flow, having 2x-3x higher precision, while running up to 18x faster than AC Power Flow for systems up to 118 buses. Moving a step further, we also examine the ability of trustworthy GridFM to generalize to unseen high-order contingencies: through a rigorous analysis, we assess how a model trained on N-1 or N-2 outages extrapolates to unseen contingencies up to N-5.
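To make the idea of stratified conformal bounds concrete, the following is a minimal sketch of split conformal calibration done separately per stratum, followed by a conservative congestion check on the interval's upper bound. It is an illustration under stated assumptions, not the paper's implementation: the absolute-residual score, the function names (`conformal_quantile`, `calibrate_by_stratum`, `screen_congestion`), the use of the contingency order as the only stratum label, and the synthetic data are all hypothetical choices made here.

```python
import numpy as np

def conformal_quantile(scores, alpha):
    """Split-conformal radius: the ceil((n+1)(1-alpha))-th smallest score.

    Returns inf when the calibration set is too small for the requested level.
    """
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        return np.inf
    return np.sort(scores)[k - 1]

def calibrate_by_stratum(y_true, y_pred, strata, alpha=0.1):
    """One interval half-width per stratum, from absolute residuals (assumed score)."""
    scores = np.abs(y_true - y_pred)
    return {s: conformal_quantile(scores[strata == s], alpha)
            for s in np.unique(strata)}

def screen_congestion(flow_pred, strata, radii, limit):
    """Flag a line when the upper conformal bound on |flow| exceeds its limit."""
    half_width = np.array([radii[s] for s in strata])
    return np.abs(flow_pred) + half_width > limit

# Toy usage with synthetic numbers (not data from the paper).
rng = np.random.default_rng(0)
k_level = rng.integers(1, 3, size=400)                    # hypothetical strata: N-1, N-2
flow_true = rng.normal(0.0, 1.0, size=400)                # per-unit line flows
flow_pred = flow_true + rng.normal(0.0, 0.05 * k_level)   # error grows with k

# Calibrate on the first half, screen the second half.
radii = calibrate_by_stratum(flow_true[:200], flow_pred[:200], k_level[:200], alpha=0.1)
flags = screen_congestion(flow_pred[200:], k_level[200:], radii, limit=1.0)
```

Because the half-widths are non-negative, every line flagged by the point prediction alone is also flagged by the conformal check; the interval can only add alarms, which is the sense in which the screening decision is conservative with respect to false negatives.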

Paper Structure (27 sections, 10 equations, 2 figures, 7 tables)

Introduction
Background and Related Work
Foundation Models for Power Systems
Machine Learning for Contingency Analysis
Uncertainty Quantification
Problem Formulation
State Prediction
Physics-Consistent Line Flow Calculation
Congestion Screening
The Standard Approach: DC Power Flow
Methodology: Calibrated and Reliable Foundation Models
The GridFM Architecture
Fine-tuning Strategy
Bounding with Conformal Prediction
General Formulation
...and 12 more sections

Figures (2)

Figure 1: Recall for line congestion screening under different contingency levels on IEEE-118 system. Results for the different GridFM models, with and without conformalization.
Figure 2: Precision–Recall trade-off for GridFM on the IEEE-118 system (overall metrics). Dots indicate point estimates; squares indicate 90%/95% conformal settings; the gray cross is the DCPF baseline. Lines connect each model family across settings.
