Safety Constrained Multi-Agent Reinforcement Learning for Active Voltage Control

Yang Qu, Jinming Ma, Feng Wu

TL;DR

This work tackles active voltage control under safety constraints by formulating it as a constrained MARL problem and proposing MA-DELC, a Lagrangian-based algorithm with double safety estimation. MA-DELC uses two critics, one estimating reward and one estimating constraint cost, and pairs an adaptive Lagrange multiplier with a one-step cost estimator to enforce voltage safety while minimizing losses. MAPDN simulations on 33-, 141-, and 322-bus networks show MA-DELC achieving near-perfect constraint satisfaction in the smaller networks and strong performance in the larger ones, with ablations highlighting the value of double safety estimation and informative cost functions. The method is model-free and scalable, and it shows practical potential for real-world distribution networks with high PV penetration.

Abstract

Active voltage control is a promising avenue for relieving power congestion and enhancing voltage quality by taking advantage of distributed controllable generators in the power network, such as rooftop photovoltaics. While Multi-Agent Reinforcement Learning (MARL) has emerged as a compelling approach to this challenge, existing MARL approaches tend to overlook the constrained-optimization nature of the problem and therefore fail to guarantee the safety constraints. In this paper, we formalize the active voltage control problem as a constrained Markov game and propose a safety-constrained MARL algorithm. We extend the primal-dual optimization RL method to multi-agent settings and augment it with a novel double safety estimation approach for learning the policy and updating the Lagrange multiplier. In addition, we propose different cost functions and investigate their influence on the behavior of our constrained MARL method. We evaluate our approach in a power distribution network simulation environment with real-world-scale scenarios. Experimental results demonstrate the effectiveness of the proposed method compared with state-of-the-art MARL methods. This paper is published at https://www.ijcai.org/Proceedings/2024/.
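
To make the primal-dual idea in the abstract concrete, the following is a minimal sketch of a generic Lagrangian update, not the paper's MA-DELC implementation. It assumes a scalar reward estimate, a scalar cost estimate, and a cost limit; the function names `lagrangian_objective` and `update_multiplier`, the learning rate, and the example cost values are all hypothetical. The double safety estimation and the one-step cost estimator described above are not reproduced here.

```python
def lagrangian_objective(reward_value, cost_value, lam, cost_limit):
    """Objective the policy ascends: reward minus the weighted constraint excess.

    All arguments are plain floats here; in a real agent they would come
    from learned critics (an assumption of this sketch).
    """
    return reward_value - lam * (cost_value - cost_limit)


def update_multiplier(lam, estimated_cost, cost_limit, lr=0.05):
    """Projected dual ascent step.

    The multiplier grows while the estimated cost exceeds the limit,
    shrinks otherwise, and is clipped so it never becomes negative.
    """
    return max(0.0, lam + lr * (estimated_cost - cost_limit))


if __name__ == "__main__":
    lam = 0.0
    # Hypothetical sequence of cost estimates that falls as the policy improves.
    for cost in [0.4, 0.3, 0.1, 0.0, 0.0]:
        lam = update_multiplier(lam, cost, cost_limit=0.1)
        objective = lagrangian_objective(1.0, cost, lam, cost_limit=0.1)
        print(f"cost={cost:.2f} lambda={lam:.4f} objective={objective:.4f}")
```

In this sketch the multiplier acts as an adaptive penalty weight: it rises while the constraint is violated, which pushes the policy toward safer actions, and it decays toward zero once the cost estimate stays below the limit.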

Paper Structure

This paper contains 23 sections, 12 equations, 7 figures, 2 tables, and 1 algorithm.

Figures (7)

  • Figure 1: An example of power distribution networks. The black circles with numbers represent buses; G represents an external generator; Ls represent loads; and the sun emoji represents the location where a PV is installed. We control the bus voltage by adjusting the reactive power generation of PV inverters in buses 8, 9, and 14.
  • Figure 2: The topologies of the distribution networks in the 141-bus and 322-bus scenarios of the MAPDN environment.
  • Figure 3: Control Ratio (CR) and Q Loss (QL) results of the different algorithms (CR: higher is better, QL: lower is better).
  • Figure 4: Control Ratio (CR) and Q Loss (QL) results of the ablation studies (CR: higher is better, QL: lower is better).
  • Figure 5: Control Ratio (CR) and Q Loss (QL) results of the different cost functions (CR: higher is better, QL: lower is better).
  • ...and 2 more figures