Imitation Learning for Intra-Day Power Grid Operation through Topology Actions
Matthijs de Jong, Jan Viebahn, Yuliya Shapovalova
TL;DR
This work studies imitation learning (IL) for intra-day topology control in power grids, training a fully-connected neural network (FCNN) on state–action pairs from two rule-based experts (Greedy and N-1) within Grid2Op's IEEE 14-bus setup. Although IL classification accuracy is limited by class imbalance and class overlap, IL-based agents, especially hybrids augmented with minimal simulations, achieve near-expert performance with orders-of-magnitude faster inference in both full-network and outage regimes. The results demonstrate the viability of fast, high-performing topology-control agents and highlight the potential benefits of hybrid IL approaches, while identifying distribution shift and dataset bias as key challenges for future work. The study also emphasizes the importance of integrating simulation and robust action-selection strategies to realize practical, scalable grid-control solutions. These findings motivate further exploration of IL with advanced techniques (e.g., DAgger, graph-based models) and testing across broader regimes to generalize to larger, real-world grids.
Abstract
Power grid operation is becoming increasingly complex due to the growth in renewable energy generation. The recent series of Learning To Run a Power Network (L2RPN) competitions has encouraged the use of artificial agents to assist human dispatchers in operating power grids. In this paper we study the performance of imitation learning for intra-day power grid operation through topology actions. In particular, we consider two rule-based expert agents: a greedy agent and an N-1 agent. While the latter is more computationally expensive, since it takes N-1 safety considerations into account, it exhibits much higher operational performance. We train a fully-connected neural network (FCNN) on expert state-action pairs and evaluate it in two ways. First, we find that classification accuracy is limited despite extensive hyperparameter tuning, due to class imbalance and class overlap. Second, as a power system agent, the FCNN performs only slightly worse than the expert agents. Furthermore, hybrid agents, which incorporate minimal additional simulations, match the expert agents' performance at significantly lower computational cost. Consequently, imitation learning shows promise for developing fast, high-performing power grid agents, motivating its further exploration in future L2RPN studies.
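To illustrate the idea of a hybrid agent that adds a small number of simulations to a learned policy, the following is a minimal sketch and not the authors' implementation. It assumes the FCNN outputs one score per candidate topology action, and that a hypothetical `simulate` callable returns the maximum line loading expected after applying an action. The function name `hybrid_select`, the parameters `k` and `rho_max`, and the fallback rule are all assumptions made for illustration.

```python
from typing import Callable, Sequence


def hybrid_select(
    scores: Sequence[float],
    simulate: Callable[[int], float],
    k: int = 3,
    rho_max: float = 1.0,
) -> int:
    """Pick an action index using model scores plus at most k simulations.

    scores   -- one model score per candidate action (assumed FCNN output)
    simulate -- hypothetical callable: action index -> max line loading
    k        -- number of top-ranked actions to simulate (assumption)
    rho_max  -- loading limit below which an action counts as safe (assumption)
    """
    if not scores:
        raise ValueError("scores must contain at least one action")
    # Rank action indices by model score, highest first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    best_idx, best_rho = ranked[0], float("inf")
    for idx in ranked[:k]:
        rho = simulate(idx)
        if rho < rho_max:
            return idx  # first candidate that keeps all lines below the limit
        if rho < best_rho:
            best_idx, best_rho = idx, rho
    # No simulated candidate was safe: fall back to the least-loaded one.
    return best_idx


# Toy usage with made-up numbers: action 2 scores highest but overloads a line.
toy_scores = [0.1, 0.3, 0.6]
toy_rho = {0: 0.80, 1: 0.95, 2: 1.20}
chosen = hybrid_select(toy_scores, toy_rho.__getitem__, k=2)
```

In this toy case the top-scored action is rejected after one simulation and the second-ranked action is chosen, so only two simulations are spent instead of one per candidate action. In a Grid2Op setting, `simulate` could plausibly wrap the observation's built-in simulation facility, though that wiring is not shown here.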
