LUMINA: Foundation Models for Topology Transferable ACOPF

Yijiang Li; Zeeshan Memon; Hongwei Jin; Stefano Fenu; Keunju Song; Sunash B Sharma; Parfait Gasana; Hongseok Kim; Liang Zhao; Kibaek Kim

TL;DR

This work derives design principles for constrained scientific foundation models through systematic investigation of AC optimal power flow (ACOPF), a representative optimization problem in power grid operations where power balance equations and operational constraints are non-negotiable.

Abstract

Foundation models promise to accelerate scientific computation by learning reusable representations across problem instances. Constrained scientific systems, where predictions must satisfy physical laws and safety limits, pose unique challenges that stress conventional training paradigms. We derive design principles for constrained scientific foundation models through systematic investigation of AC optimal power flow (ACOPF), a representative optimization problem in power grid operations where power balance equations and operational constraints are non-negotiable. Through controlled experiments spanning architectures, training objectives, and system diversity, we extract three empirically grounded principles for scientific foundation model design. Each principle addresses one design trade-off: learning physics-invariant representations while respecting system-specific constraints, optimizing accuracy while ensuring constraint satisfaction, and ensuring reliability in high-impact operating regimes. We present the LUMINA framework, including data processing and training pipelines, to support reproducible research on physics-informed, feasibility-aware foundation models across scientific applications.
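
For readers less familiar with the problem, the power balance equations referred to in the abstract are commonly written in polar form as sketched below. This is the standard textbook formulation, not notation taken from the paper: the symbols (voltage magnitude V_i, angle difference \theta_{ij}, bus admittance entries G_{ij} + jB_{ij}, generation and demand superscripts g and d) are assumptions introduced here for illustration.

```latex
% Active and reactive power balance at every bus i (polar form)
\begin{align}
P_i^{g} - P_i^{d} &= V_i \sum_{j=1}^{N} V_j \left( G_{ij}\cos\theta_{ij} + B_{ij}\sin\theta_{ij} \right), \\
Q_i^{g} - Q_i^{d} &= V_i \sum_{j=1}^{N} V_j \left( G_{ij}\sin\theta_{ij} - B_{ij}\cos\theta_{ij} \right), \\
% Typical operational limits (voltage and generation bounds)
V_i^{\min} \le V_i &\le V_i^{\max}, \qquad
P_i^{g,\min} \le P_i^{g} \le P_i^{g,\max}, \qquad
Q_i^{g,\min} \le Q_i^{g} \le Q_i^{g,\max}.
\end{align}
```

A learned model that predicts voltages and generator set points is feasible only if the two equalities hold at every bus and the bounds are respected, which is why accuracy alone is not a sufficient training target.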

Paper Structure (24 sections, 17 equations, 6 figures, 3 tables)

This paper contains 24 sections, 17 equations, 6 figures, 3 tables.

Introduction
Scientific Questions and Workflow Context
System Generalization
Feasibility and Reliability
Hard-Regime Behavior
Foundation Model Design Principles
Experimental Framework
Model Architectures
Dataset
Loss functions
Evaluation Metrics
Learning Across Systems via Multi-Topology Pretraining
Training Efficiency at Scale
Constraint-Aware Objectives for Actionable Reliability
Reliability Stress Tests: Extremes and Structural Hard Cases
...and 9 more sections

Figures (6)

Figure 1: Architecture comparison on single-topology (top) vs. multi-topology (bottom) pretraining. The cases considered are case30, case57, and case118. Models trained on a single topology are evaluated on that same topology, while models trained jointly on multiple topologies are evaluated on each topology separately. Heterogeneous architectures generally outperform homogeneous models in both solution quality and constraint satisfaction, especially in multi-topology training.
Figure 2: Constraint violation convergence on case500: fine-tuning vs. training from scratch
Figure 3: Training time vs. case size with and without mixed precision training (BF16)
Figure 4: Loss function comparison across two selected architectures (one homogeneous and one heterogeneous) under equal training budgets. Top panels: single-topology pretraining. Bottom panels: multi-topology training jointly on case30, case57, and case118. Each architecture is paired with three loss functions (MSE, AL, VBL). Models trained on a single topology are evaluated on that same topology, while models trained jointly on multiple topologies are evaluated on each topology separately. The constraint-aware loss functions, AL and VBL, yield significantly lower constraint violations than MSE, with AL performing slightly better than VBL.
Figure 5: PCA components of the activations in the top convolutional layer of HGT trained with the AL loss (left) vs. the MSE loss (right). Both losses lead the model to capture physical system load, but AL enforces a much more nonlinear structure on the model's internal representation.
...and 1 more figures
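
The constraint-violation comparisons in Figures 2 and 4 depend on measuring how far a predicted operating point is from satisfying power balance. The following is a minimal sketch of such a measurement on a hypothetical two-bus system, combined with a plain penalty term added to a mean squared error. It is not the AL or VBL formulation from the paper, which this summary does not define. The function names, the line impedance, and the penalty weight of 10.0 are all assumptions made for illustration.

```python
import cmath


def bus_injections(voltages, y_bus):
    """Complex power injected at each bus: S_i = V_i * conj(sum_j Y_ij * V_j)."""
    n = len(voltages)
    return [
        voltages[i] * sum(y_bus[i][j] * voltages[j] for j in range(n)).conjugate()
        for i in range(n)
    ]


def balance_violation(voltages, y_bus, s_specified):
    """Mean absolute power balance mismatch across buses (per unit)."""
    s_calc = bus_injections(voltages, y_bus)
    return sum(abs(s - c) for s, c in zip(s_specified, s_calc)) / len(voltages)


def penalized_loss(pred, target, violation, weight=10.0):
    """MSE on complex voltages plus a weighted violation penalty (illustrative only)."""
    mse = sum(abs(p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    return mse + weight * violation


# Hypothetical two-bus system joined by one line with impedance 0.01 + 0.1j p.u.
y = 1 / complex(0.01, 0.1)
y_bus = [[y, -y], [-y, y]]

# Reference operating point and the injections it implies.
v_true = [cmath.rect(1.0, 0.0), cmath.rect(0.98, -0.05)]
s_spec = bus_injections(v_true, y_bus)

# A slightly wrong prediction, standing in for a model output.
v_pred = [cmath.rect(1.0, 0.0), cmath.rect(0.97, -0.06)]

violation_true = balance_violation(v_true, y_bus, s_spec)
violation_pred = balance_violation(v_pred, y_bus, s_spec)
loss = penalized_loss(v_pred, v_true, violation_pred)
```

The reference point has a mismatch of essentially zero, while the small voltage error in `v_pred` yields a clearly nonzero mismatch. This illustrates why an objective that only tracks prediction error can look accurate yet remain infeasible.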
