InstructMPC: A Human-LLM-in-the-Loop Framework for Context-Aware Power Grid Control

Ruixiang Wu, Jiahao Ai, Tinko Sebastian Bartels, Tongxin Li

TL;DR

InstructMPC introduces a human-LLM-in-the-loop framework that makes MPC context-aware by translating in-context human instructions into predictive disturbances via a Contextual Disturbances Predictor. The method employs online last-layer adaptation to continually fine-tune predictions based on realized control costs, with theoretical guarantees including a regret bound of $O(\sqrt{T \log T})$ for linear dynamics and robustness to uninformative context. The paper formalizes the problem, provides a detailed CDP architecture, and proves consistency and robustness results, complemented by two numerical applications: drone-based power infrastructure inspection and battery management with SoC targets. The results demonstrate improved control performance and adaptability under dynamic grid conditions and human-instruction scenarios, highlighting the practical potential for context-aware grid operation.

Abstract

The transition toward power grids with high renewable penetration demands context-aware decision-making frameworks. Traditional operational paradigms, which rely on static optimization of history-based load forecasting, often fail to capture the complex nature of real-time operational conditions, such as operator-issued maintenance mandates, emergency topology changes, or event-driven load surges. To address this challenge, we introduce InstructMPC, a closed-loop framework that integrates Large Language Models (LLMs) to generate context-aware predictions, enabling the controller to optimize power system operation. Our method employs a Contextual Disturbances Predictor (CDP) module to translate contextual information into predictive disturbance trajectories, which are then incorporated into the Model Predictive Control (MPC) optimization. Unlike conventional open-loop forecasting frameworks, InstructMPC features an online tuning mechanism in which the predictor's parameters are continuously updated based on the realized control cost. With a tailored loss function, this update achieves a regret bound of $O(\sqrt{T \log T})$ for linear dynamics, ensuring task-aware learning and adaptation to non-stationary grid conditions.
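The closed loop described in the abstract (context in, predicted disturbance out, control applied, predictor updated once the true disturbance is revealed) can be sketched on a toy problem. Everything below is an illustrative assumption rather than the paper's implementation: the scalar dynamics $x_{t+1} = a x_t + b u_t + w_t$, the one-step-horizon controller standing in for MPC, the discrete context ids standing in for LLM-embedded instructions, the squared-error loss, and the names `one_step_mpc`, `run`, and `TRUE_MEAN` are all hypothetical.

```python
import math
import random

# Made-up mean disturbance for each of three context ids (an assumption).
TRUE_MEAN = [-1.0, 0.0, 2.0]


def one_step_mpc(x, w_hat, a, b, q, r):
    """Horizon-1 stand-in for MPC on scalar dynamics.

    Minimises q * x_next**2 + r * u**2 with x_next = a*x + b*u + w_hat,
    which has the closed-form minimiser returned here.
    """
    return -q * b * (a * x + w_hat) / (q * b * b + r)


def run(T=2000, seed=0, adapt=True):
    """Simulate the loop; return (last-layer weights, average stage cost)."""
    rng = random.Random(seed)
    a, b, q, r = 0.9, 1.0, 1.0, 0.1
    theta = [0.0, 0.0, 0.0]  # trainable last layer over one-hot context features
    x, total = 0.0, 0.0
    for t in range(T):
        c = rng.randrange(3)                     # context arrives
        w_hat = theta[c]                         # predictor output for this context
        u = one_step_mpc(x, w_hat, a, b, q, r)   # controller uses the prediction
        w = TRUE_MEAN[c] + rng.gauss(0.0, 0.1)   # environment reveals disturbance
        x = a * x + b * u + w
        total += q * x * x + r * u * u
        if adapt:
            eta = 0.5 / math.sqrt(t + 1)         # non-increasing step size
            # Gradient step on the squared prediction error (w_hat - w)**2.
            theta[c] -= eta * 2.0 * (w_hat - w)
    return theta, total / T
```

Calling `run()` and `run(adapt=False)` with the same seed compares the adapted predictor against a fixed zero prediction on an identical disturbance sequence. The squared-error loss is used here only for simplicity; the paper's tailored loss is not reproduced.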

Paper Structure

This paper contains 20 sections, 5 theorems, 41 equations, 4 figures, 1 table, 1 algorithm.

Key Result

Theorem 4.1

Under Assumptions asp:convex and asp:lg, if the learning rate $\eta_t$ is non-increasing, then … Furthermore, if we choose $\eta_t=\frac{D}{G\sqrt{2(2k-1)(t+1)}}$, where $H$ is defined in Lemma lem:regret, and we define …
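As a small illustration, the step-size schedule $\eta_t=\frac{D}{G\sqrt{2(2k-1)(t+1)}}$ quoted above can be evaluated directly. The excerpt does not define $D$, $G$, or $k$; in the sketch below they are simply treated as positive constants supplied by the caller, and the function name `step_size` is hypothetical.

```python
import math


def step_size(t, D, G, k):
    """Evaluate eta_t = D / (G * sqrt(2 * (2k - 1) * (t + 1))).

    D, G and k are taken as given positive constants; their meaning is
    not specified in this excerpt.
    """
    return D / (G * math.sqrt(2.0 * (2 * k - 1) * (t + 1)))


# First few step sizes for illustrative constants D = 1, G = 1, k = 1.
schedule = [step_size(t, 1.0, 1.0, 1) for t in range(5)]
```

The schedule decays like $1/\sqrt{t+1}$, so it is non-increasing in $t$, which matches the condition on $\eta_t$ stated in the theorem.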

Figures (4)

  • Figure 1: System framework of InstructMPC. The blue lines represent interactions between InstructMPC and the environment, where InstructMPC receives the state $x_t$ and outputs the control input $u_t$. The black lines represent the information loop, within which external contextual information $c_{t: \mathcal{T} | t}$ is passed to the CDP to produce predicted disturbances $\hat{w}_{t:\mathcal{T}|t}$. Then, the MPC controller $\pi_{\mathrm{MPC}}$ utilizes $\hat{w}_{t:\mathcal{T}|t}$ and the current state $x_t$ to determine a control input $u_{t}^{\mathrm{MPC}}$ via \ref{eq:mpc_formulation}. After executing the MPC control input $u_{t}^{\mathrm{MPC}}$, the environment reveals the true disturbance $w_t$. The discrepancy between $w_t$ and $\hat{w}_t$ is then sent back to the CDP module as a loss signal.
  • Figure 2: Application 1: Power Infrastructure Inspection (Section \ref{sec:drone_tracking}). Performance comparison of predictors tuned with three different loss functions: 1. MSE Loss in \ref{eq:MSE_loss} (Red); 2. MAE Loss in \ref{eq:MAE_Loss} (Blue); 3. The proposed Special Loss tailored for the control task defined in Corollary \ref{coro:final_bound} (Green). (Left) The evolution of cumulative regret (top) and the convergence of the predictor parameters, $\theta_1$ (middle) and $\theta_2$ (bottom), over time. (Right) The corresponding average drone tracking trajectories projected onto the real-world power infrastructure map Garrett2024OpenInfra. For all plots, solid lines represent the mean over $50$ independent experiments, while the shaded regions denote the standard deviation.
  • Figure 3: Application 2: OpenCEM on-campus installation (Section \ref{sec:soc_tracking}).
  • Figure 4: Application 2: Battery Management with SoC Target (Section \ref{sec:soc_tracking}). Performance comparison and state norm for 1. InstructMPC, 2. classic contextual MPC, 3. MPC without contexts, 4. fixed average prediction, and 5. fixed zero prediction (state omitted). State truncated to 500 time steps for readability.

Theorems & Definitions (8)

  • Definition 1
  • Theorem 4.1
  • proof
  • Corollary 4.1
  • Theorem 4.2
  • Lemma 1: Lemma 13 in yu2022competitivecontroldelayedimperfect
  • Lemma 2
  • proof