Table of Contents
Fetching ...

Grid-Mind: An LLM-Orchestrated Multi-Fidelity Agent for Automated Connection Impact Assessment

Mohamed Shamseldein

TL;DR

Grid-Mind is presented, a domain-specific LLM agent that interprets natural-language interconnection requests and autonomously orchestrates multi-fidelity power system simulations, establishing a reproducible baseline for continued refinement while maintaining auditable, simulation-grounded decision support.

Abstract

Large language models (LLMs) have demonstrated remarkable tool-use capabilities, yet their application to power system operations remains largely unexplored. This paper presents Grid-Mind, a domain-specific LLM agent that interprets natural-language interconnection requests and autonomously orchestrates multi-fidelity power system simulations. The LLM-first architecture positions the language model as the central decision-making entity, employing an eleven-tool registry to execute Connection Impact Assessment (CIA) studies spanning steadystate power flow, N-1 contingency analysis, transient stability, and electromagnetic transient screening. A violation inspector grounds every decision in quantitative simulation outputs, while a three-layer anti-hallucination defence mitigates numerical fabrication risk through forced capacity-tool routing and post-response grounding validation. A prompt-level self-correction mechanism extracts distilled lessons from agent failures, yielding progressive accuracy improvements without model retraining. End-to-end evaluation on 50 IEEE 118-bus scenarios (DeepSeek-V3, 2026-02-23) achieved 84.0% tool-selection accuracy and 100% parsing accuracy. A separate 56-scenario self-correction suite passed 49 of 56 cases (87.5%) with a mean score of 89.3. These results establish a reproducible baseline for continued refinement while maintaining auditable, simulation-grounded decision support.

Grid-Mind: An LLM-Orchestrated Multi-Fidelity Agent for Automated Connection Impact Assessment

TL;DR

Grid-Mind is presented, a domain-specific LLM agent that interprets natural-language interconnection requests and autonomously orchestrates multi-fidelity power system simulations, establishing a reproducible baseline for continued refinement while maintaining auditable, simulation-grounded decision support.

Abstract

Large language models (LLMs) have demonstrated remarkable tool-use capabilities, yet their application to power system operations remains largely unexplored. This paper presents Grid-Mind, a domain-specific LLM agent that interprets natural-language interconnection requests and autonomously orchestrates multi-fidelity power system simulations. The LLM-first architecture positions the language model as the central decision-making entity, employing an eleven-tool registry to execute Connection Impact Assessment (CIA) studies spanning steadystate power flow, N-1 contingency analysis, transient stability, and electromagnetic transient screening. A violation inspector grounds every decision in quantitative simulation outputs, while a three-layer anti-hallucination defence mitigates numerical fabrication risk through forced capacity-tool routing and post-response grounding validation. A prompt-level self-correction mechanism extracts distilled lessons from agent failures, yielding progressive accuracy improvements without model retraining. End-to-end evaluation on 50 IEEE 118-bus scenarios (DeepSeek-V3, 2026-02-23) achieved 84.0% tool-selection accuracy and 100% parsing accuracy. A separate 56-scenario self-correction suite passed 49 of 56 cases (87.5%) with a mean score of 89.3. These results establish a reproducible baseline for continued refinement while maintaining auditable, simulation-grounded decision support.
Paper Structure (43 sections, 7 equations, 3 figures, 12 tables, 1 algorithm)

This paper contains 43 sections, 7 equations, 3 figures, 12 tables, 1 algorithm.

Figures (3)

  • Figure 1: Grid-Mind architecture. The Agent Layer orchestrates multi-fidelity simulations through a solver-agnostic tool registry. Persistent memory stores study results across sessions. Dashed arrows indicate conditional escalation. The prompt-level lesson feedback loop injects distilled rules from failure analysis back into the agent's persistent memory. A health monitor provides real-time system status.
  • Figure 2: Multi-fidelity CIA pipeline. Stages are conditionally activated based on connection properties and violation severity.
  • Figure 3: Self-improving prompt-level lesson loop. Lessons from failure analysis are injected into the system prompt for subsequent iterations.