PowerChain: A Verifiable Agentic AI System for Automating Distribution Grid Analyses

Emmanuel O. Badmus, Peng Sang, Dimitrios Stamoulis, Amritanshu Pandey

TL;DR

PowerChain tackles the challenge of automating complex distribution grid analyses by enabling an agentic workflow that generalizes to unseen tasks. It achieves this by grounding LLM reasoning in domain-aware tool descriptors and a curated set of annotated, verified workflows, operating over DAG-structured pipelines and a two-role verifier–orchestrator framework. The approach yields significant performance gains over baselines, scales to larger toolsets with maintained accuracy, and reduces token cost without requiring LLM fine-tuning, offering a practical path to widespread, cost-efficient grid analytics. In real-world utility contexts, PowerChain promises faster, cheaper, and more reliable grid assessments for operators and planners alike, supporting electrification and decarbonization goals.

Abstract

Rapid electrification and decarbonization are increasing the complexity of distribution grid (DG) operation and planning, necessitating advanced computational analyses to ensure reliability and resilience. These analyses depend on disparate workflows comprising complex models, function calls, and data pipelines that require substantial expert knowledge and remain difficult to automate. Workforce and budget constraints further limit utilities' ability to apply such analyses at scale. To address this gap, we build an agentic system PowerChain, which is capable of autonomously performing complex grid analyses. Existing agentic AI systems are typically developed in a bottom-up manner with customized context for predefined analysis tasks; therefore, they do not generalize to tasks that the agent has never seen. In comparison, to generalize to unseen DG analysis tasks, PowerChain dynamically generates structured context by leveraging supervisory signals from self-contained power systems tools (e.g., GridLAB-D) and an optimized set of expert-annotated and verified reasoning trajectories. For complex DG tasks defined in natural language, empirical results on real utility data demonstrate that PowerChain achieves up to a 144% improvement in performance over baselines.

Paper Structure

This paper contains 31 sections, 14 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: PowerChain agentic workflow generation framework. User tasks are inputs to the orchestrator ($\mathcal{O}$), which builds prompts using annotated verified workflow-task pairs ($W_v$), tool descriptor ($\Delta$), and conversation history ($\mathcal{H}$). An LLM generates a candidate workflow $w$ that relates elements in the tool set ($T$) and the utility database ($D$). The verifier ($\mathcal{V}$) tests the workflow, with errors fed back to the verifier via an LLM until a valid workflow $\hat{w}$ is produced.
  • Figure 2: Example of a correct workflow $w$ for the query in Section \ref{sec:PC_design}. The workflow shows the ordered tool sequence that successfully executes the task without errors.
  • Figure 3: Workflow correction trace for the example query in Section \ref{sec:PC_design}. The DAG shows several iterations of PowerChain during the build–verify–feedback–correct process. Nodes marked “x” correspond to invalid tool calls detected in earlier iterations, while the connected valid nodes represent the final corrected workflow path.
  • Figure 4: Accuracy (pass@1 $\times$ Precision) against number of annotated verified workflow-task pairs ($X$) for proprietary models. Accuracy increases up to $X\!\approx\!10$ and then saturates.
  • Figure 5: Average tokens per pass@1 (Tk/P@1) across models and PowerChain modes.
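
The build–verify–feedback–correct loop described in the captions for Figures 1 and 3 can be illustrated with a minimal Python sketch. This is not the authors' implementation. The `Step` type, the `verify` checks, the tool names (`load_feeder`, `run_powerflow`), and the `stub_llm` generator that stands in for a real LLM call are all hypothetical, chosen only to show the control flow: a candidate workflow is generated, checked against the tool set, and regenerated with the accumulated errors in its history until it passes or an iteration budget runs out.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Step:
    """One node of a workflow DAG: a tool call plus the upstream tools it consumes."""
    tool: str
    inputs: list = field(default_factory=list)


def verify(workflow, toolset):
    """Return a list of error messages; an empty list means the workflow is valid.

    Hypothetical checks: every tool must exist in the tool set, and every
    input must be produced by an earlier step (which keeps the graph acyclic).
    """
    errors = []
    produced = set()
    for i, step in enumerate(workflow):
        if step.tool not in toolset:
            errors.append(f"step {i}: unknown tool '{step.tool}'")
        for dep in step.inputs:
            if dep not in produced:
                errors.append(f"step {i}: input '{dep}' not produced by an earlier step")
        produced.add(step.tool)
    return errors


def build_verified_workflow(task, generate, toolset, max_iters=5):
    """Generate candidates until one verifies, feeding errors back via history."""
    history = []  # (candidate, errors) pairs from failed attempts
    for _ in range(max_iters):
        candidate = generate(task, history)
        errors = verify(candidate, toolset)
        if not errors:
            return candidate
        history.append((candidate, errors))
    return None  # no valid workflow within the iteration budget


def stub_llm(task, history):
    """Stand-in for an LLM: misspells a tool first, corrects it after feedback."""
    if not history:
        return [Step("load_feeder"), Step("run_powerflw", ["load_feeder"])]
    return [Step("load_feeder"), Step("run_powerflow", ["load_feeder"])]


TOOLSET = {"load_feeder", "run_powerflow"}  # hypothetical tool names
result = build_verified_workflow("run a power flow on the feeder", stub_llm, TOOLSET)
print([step.tool for step in result])
```

In this sketch the first candidate fails verification because of the misspelled tool name, and the second candidate passes, mirroring the invalid “x” nodes and the final corrected path in the Figure 3 caption.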