Safety Constrained Multi-Agent Reinforcement Learning for Active Voltage Control
Yang Qu, Jinming Ma, Feng Wu
TL;DR
This work tackles active voltage control under safety constraints by formulating it as a constrained MARL problem and proposing MA-DELC, a double-safety, Lagrangian-based algorithm. MA-DELC uses two critics to estimate reward and constraint costs, and introduces an adaptive Lagrange multiplier with a one-step cost estimator to robustly enforce voltage safety while minimizing losses. Extensive MAPDN simulations across 33-, 141-, and 322-bus networks show MA-DELC achieving near-perfect constraint satisfaction in the smaller networks and strong performance in the larger ones, with ablations highlighting the value of double safety estimation and informative cost functions. The method is model-free and scalable, and it demonstrates practical potential for real-world distribution networks with high PV penetration.
Abstract
Active voltage control presents a promising avenue for relieving power congestion and enhancing voltage quality, taking advantage of the distributed controllable generators in the power network, such as roof-top photovoltaics. While Multi-Agent Reinforcement Learning (MARL) has emerged as a compelling approach to address this challenge, existing MARL approaches tend to overlook the constrained optimization nature of this problem and therefore fail to guarantee that safety constraints are satisfied. In this paper, we formalize the active voltage control problem as a constrained Markov game and propose a safety-constrained MARL algorithm. We extend the primal-dual optimization RL method to multi-agent settings, and augment it with a novel double safety estimation approach that is used both to learn the policy and to update the Lagrange multiplier. In addition, we propose different cost functions and investigate their influence on the behavior of our constrained MARL method. We evaluate our approach in the power distribution network simulation environment with real-world scale scenarios. Experimental results demonstrate the effectiveness of the proposed method compared with state-of-the-art MARL methods. This paper is published at https://www.ijcai.org/Proceedings/2024/.
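To make the primal-dual idea concrete, the following is a minimal sketch of a generic Lagrangian relaxation step, not the authors' MA-DELC implementation. It assumes a scalar reward estimate, a scalar cost estimate, and a cost limit; the function names `lagrangian_objective` and `update_multiplier`, the learning rate, and the sample values are all hypothetical and chosen only for illustration.

```python
def lagrangian_objective(reward_value, cost_value, lam):
    """Scalarized objective the policy would maximize: reward minus weighted cost.

    reward_value and cost_value stand in for critic estimates (assumed inputs).
    """
    return reward_value - lam * cost_value


def update_multiplier(lam, cost_estimate, cost_limit, lr=0.1):
    """Projected dual ascent on the Lagrange multiplier.

    The multiplier grows when the estimated cost exceeds the limit,
    shrinks otherwise, and is clipped so it never becomes negative.
    """
    return max(0.0, lam + lr * (cost_estimate - cost_limit))


# Illustrative run with made-up cost estimates (not data from the paper).
lam = 0.0
for cost_estimate in [0.5, 0.5, 0.0, 0.0, 0.0]:
    lam = update_multiplier(lam, cost_estimate, cost_limit=0.1)
    print(f"cost={cost_estimate:.2f}  lambda={lam:.3f}")
```

In this sketch the multiplier acts as an adaptive penalty weight: repeated constraint violations increase the pressure on the policy to reduce cost, and the pressure relaxes once the cost estimate falls below the limit.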
