Physics-informed Actor-Critic for Coordination of Virtual Inertia from Power Distribution Systems
Simon Stock, Davood Babazadeh, Sari Eid, Christian Becker
TL;DR
The paper tackles the challenge of coordinating inertial support from inverter-based resources in distribution grids for which accurate models are unavailable. It introduces the Physics-informed Actor-Critic (PI-AC), a model-free reinforcement learning approach that adds a physics-based regularization term, derived from the swing equation, to the critic loss, biasing learning toward physically plausible dynamics. In a case study on the CIGRE 14-bus and IEEE 37-bus systems, PI-AC achieves higher final rewards and faster convergence than a purely data-driven Actor-Critic (AC) and a genetic algorithm, with the benefits amplified under higher renewable penetration. The work demonstrates that physics-informed regularization can improve learning efficiency and policy quality in power-system coordination tasks, and suggests applicability to other physics-constrained RL problems in power engineering.
Abstract
The vanishing inertia of synchronous generators in transmission systems requires the utilization of renewables for inertial support. These are often connected to the distribution system, and their support should be coordinated to avoid violating grid limits. To this end, this paper presents the Physics-informed Actor-Critic (PI-AC) algorithm for coordination of Virtual Inertia (VI) from renewable Inverter-based Resources (IBRs) in power distribution systems. Acquiring a model of the distribution grid can be difficult, since certain parts are often unknown or the parameters are highly uncertain. Reinforcement Learning (RL) methods enable model-free coordination, but they require a substantial amount of training beforehand. The PI-AC is an RL algorithm that integrates the physical behavior of the power system into the Actor-Critic (AC) approach in order to achieve faster learning. Specifically, we regularize the loss function with an aggregated power system dynamics model based on the swing equation. We evaluate the PI-AC in a case study with the CIGRE 14-bus and IEEE 37-bus power distribution systems in various grid settings. The PI-AC achieves higher rewards and faster learning than the exclusively data-driven AC algorithm and the metaheuristic Genetic Algorithm (GA).
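
To illustrate the idea of regularizing a critic loss with the swing equation, the following is a minimal Python sketch, not the authors' implementation. It assumes the textbook aggregated swing equation in per unit, 2H d(Δf)/dt + D Δf = ΔP, discretized with a forward difference, and adds the mean squared residual to a mean squared temporal-difference error with a weight. The function names, the parameters `inertia_h`, `damping_d`, and `weight`, and the way the frequency trajectory is passed in as an array are all assumptions made for illustration; the abstract does not specify how the physics term is coupled to the critic.

```python
import numpy as np


def swing_residual(delta_f, delta_p, dt, inertia_h, damping_d):
    """Residual of the aggregated swing equation (assumed per-unit form):
        2*H * d(delta_f)/dt + D * delta_f - delta_p = 0
    evaluated with a forward finite difference along a trajectory."""
    delta_f = np.asarray(delta_f, dtype=float)
    delta_p = np.asarray(delta_p, dtype=float)
    dfdt = np.diff(delta_f) / dt  # forward difference, length N-1
    return 2.0 * inertia_h * dfdt + damping_d * delta_f[:-1] - delta_p[:-1]


def physics_informed_critic_loss(td_errors, delta_f, delta_p, dt,
                                 inertia_h, damping_d, weight):
    """Hypothetical regularized loss: data term plus weighted physics term."""
    # Data-driven term: mean squared temporal-difference error of the critic.
    data_loss = np.mean(np.square(td_errors))
    # Physics term: mean squared violation of the swing equation.
    residual = swing_residual(delta_f, delta_p, dt, inertia_h, damping_d)
    physics_loss = np.mean(np.square(residual))
    return float(data_loss + weight * physics_loss)
```

A trajectory that obeys the discretized swing equation contributes nothing to the physics term, so the loss reduces to the usual data-driven critic loss. A physically implausible trajectory is penalized in proportion to `weight`.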
