InstructMPC: A Human-LLM-in-the-Loop Framework for Context-Aware Power Grid Control
Ruixiang Wu, Jiahao Ai, Tinko Sebastian Bartels, Tongxin Li
TL;DR
InstructMPC introduces a human-LLM-in-the-loop framework that makes Model Predictive Control (MPC) context-aware by translating in-context human instructions into predicted disturbance trajectories via a Contextual Disturbances Predictor (CDP). The method uses online last-layer adaptation to continually fine-tune the predictions based on realized control costs, with theoretical guarantees that include a regret bound of $O(\sqrt{T \log T})$ for linear dynamics and robustness to uninformative context. The paper formalizes the problem, details the CDP architecture, and proves consistency and robustness results. Two numerical applications complement the theory: drone-based power infrastructure inspection and battery management with state-of-charge (SoC) targets. The results show improved control performance and adaptability under dynamic grid conditions and human-instruction scenarios, highlighting the practical potential for context-aware grid operation.
Abstract
The transition toward power grids with high renewable penetration demands context-aware decision-making frameworks. Traditional operational paradigms, which rely on static optimization driven by history-based load forecasts, often fail to capture the complex nature of real-time operational conditions, such as operator-issued maintenance mandates, emergency topology changes, or event-driven load surges. To address this challenge, we introduce InstructMPC, a closed-loop framework that integrates Large Language Models (LLMs) to generate context-aware predictions, enabling the controller to optimize power system operation. Our method employs a Contextual Disturbances Predictor (CDP) module to translate contextual information into predictive disturbance trajectories, which are then incorporated into the Model Predictive Control (MPC) optimization. Unlike conventional open-loop forecasting frameworks, InstructMPC features an online tuning mechanism in which the predictor's parameters are continuously updated based on the realized control cost. When optimized via a tailored loss function, this mechanism carries a theoretical guarantee, a regret bound of $O(\sqrt{T \log T})$ for linear dynamics, ensuring task-aware learning and adaptation to non-stationary grid conditions.
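
The closed loop described in the abstract (context, predicted disturbance, MPC action, realized cost, predictor update) can be illustrated with a toy sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the scalar dynamics, the random feature vector `phi` standing in for a frozen LLM embedding, the linear last layer `theta`, the one-step horizon, the plain squared-cost loss (not the paper's tailored loss), and the $1/\sqrt{t}$ step size.

```python
import numpy as np

def run_closed_loop(T=500, seed=0):
    """Toy predict -> control -> observe cost -> update loop (illustrative only)."""
    rng = np.random.default_rng(seed)
    a, b = 0.9, 1.0                      # hypothetical scalar dynamics: x' = a*x + b*u + w
    true_theta = np.array([1.0, -0.5])   # hypothetical ground-truth map from context to disturbance
    theta = np.zeros(2)                  # trainable last-layer weights of the predictor
    x, costs = 0.0, []
    for t in range(1, T + 1):
        phi = rng.normal(size=2)         # stand-in for a frozen embedding of the human instruction
        w_hat = theta @ phi              # predicted disturbance for this step
        u = -(a * x + w_hat) / b         # one-step MPC: drive the predicted next state to zero
        w = true_theta @ phi + 0.01 * rng.normal()  # realized disturbance
        x = a * x + b * u + w            # realized next state, equal to w - w_hat here
        costs.append(x ** 2)             # realized control cost
        # Gradient of the realized cost (w - w_hat)^2 with respect to theta.
        grad = 2.0 * (w_hat - w) * phi
        theta -= 0.1 / np.sqrt(t) * grad  # online gradient step with decaying step size
    return theta, np.array(costs)

theta, costs = run_closed_loop()
print("learned last-layer weights:", theta)
print("mean cost, first 50 steps:", costs[:50].mean())
print("mean cost, last 50 steps:", costs[-50:].mean())
```

In this toy setting the realized cost coincides with the squared prediction error, so updating on the cost and updating on the forecast error are the same thing; in general the two differ, which is the gap a task-aware loss is meant to address.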
