Scalable Volt-VAR Optimization using RLlib-IMPALA Framework: A Reinforcement Learning Approach

Alaa Selim, Yanzhu Ye, Junbo Zhao, Bo Yang

TL;DR

The paper tackles scalable Volt-VAR optimization (VVO) in distribution networks with high DER penetration by deploying RLlib-IMPALA on the RAY platform to enable distributed, fast training for high-dimensional control tasks. It introduces an optimal DER placement method for PV and battery resources on the IEEE 123-bus system, and an IMPALA-based centralized control framework that handles continuous and discrete DER actions with state $s_t=[V_1,\dots,V_N, D_1,\dots,D_M]^\top$ and reward $r(s_t,a_t)=-V_{\text{vio}}$, aided by off-policy corrections via $\rho_t$ and V-trace. The results show faster convergence and higher rewards compared with SAC and PPO, with substantial reductions in computation time, while highlighting practical limits related to core usage on single machines. The work has significant implications for real-time, scalable VVO in modern grids and paves the way for applying DRL to even larger networks and more complex DER deployments.
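The state and reward definitions above can be illustrated with a minimal sketch. This summary does not define $V_{\text{vio}}$ or the $D_j$ entries, so the code assumes $V_{\text{vio}}$ is the total per-unit voltage excursion outside a 0.95 to 1.05 p.u. band and treats the $D_j$ as generic DER quantities. The function names, the band limits, and the sample numbers are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

# Assumed per-unit voltage band; the source does not state the limits.
V_MIN, V_MAX = 0.95, 1.05


def build_state(voltages: Sequence[float], der_values: Sequence[float]) -> List[float]:
    """Stack bus voltages V_1..V_N and DER quantities D_1..D_M into s_t."""
    return list(voltages) + list(der_values)


def voltage_violation(voltages: Sequence[float]) -> float:
    """Assumed V_vio: summed excursion of each bus voltage outside the band."""
    return sum(max(0.0, V_MIN - v) + max(0.0, v - V_MAX) for v in voltages)


def reward(voltages: Sequence[float]) -> float:
    """r(s_t, a_t) = -V_vio, so a violation-free feeder scores zero."""
    return -voltage_violation(voltages)


# Hypothetical three-bus snapshot with two DER quantities.
voltages = [1.00, 0.93, 1.07]
der_values = [0.5, 0.2]
state = build_state(voltages, der_values)
r = reward(voltages)
```

Under this assumption the reward is never positive, and it reaches zero only when every bus voltage lies inside the band.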

Abstract

In the rapidly evolving domain of electrical power systems, Volt-VAR optimization (VVO) is increasingly critical, especially with the growing integration of renewable energy sources. Traditional approaches to learning-based VVO in large and dynamically changing power systems are often hindered by computational complexity. To address this challenge, our research presents a novel framework that applies Deep Reinforcement Learning (DRL), specifically the Importance Weighted Actor-Learner Architecture (IMPALA) algorithm, executed on the RAY platform. This framework is built upon RLlib, an industry-standard reinforcement learning library, and capitalizes on the distributed computing capabilities and advanced hyperparameter tuning offered by RAY. This design significantly expedites the exploration and exploitation phases in the VVO solution space. Our empirical results demonstrate that our approach not only achieves higher rewards than existing DRL methods but also delivers a tenfold reduction in computational requirements. Integrating our DRL agent with the RAY platform yields RLlib-IMPALA, a novel framework that efficiently uses RAY's resources to improve system adaptability and control. RLlib-IMPALA leverages RAY's toolkit to enhance analytical capabilities and trains more than 10 times faster than other state-of-the-art DRL methods.
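The importance weighting that gives IMPALA its name can be sketched as follows. This is a minimal, pure-Python illustration of V-trace value targets as defined in the original IMPALA algorithm, with truncated weights $\rho_t = \min(\bar{\rho}, \pi/\mu)$ and $c_t = \min(\bar{c}, \pi/\mu)$. It is not the RLlib implementation, and the function name, argument names, and default thresholds are assumptions made for illustration.

```python
from typing import List, Sequence


def vtrace_targets(
    rewards: Sequence[float],
    values: Sequence[float],
    bootstrap_value: float,
    ratios: Sequence[float],
    gamma: float = 0.99,
    rho_bar: float = 1.0,
    c_bar: float = 1.0,
) -> List[float]:
    """Compute V-trace targets v_t for one trajectory.

    ratios[t] is pi(a_t | x_t) / mu(a_t | x_t): learner policy over actor policy.
    values[t] is the learner's estimate V(x_t); bootstrap_value is V(x_n).
    """
    n = len(rewards)
    next_values = list(values[1:]) + [bootstrap_value]
    rhos = [min(rho_bar, w) for w in ratios]  # truncated weights on the TD error
    cs = [min(c_bar, w) for w in ratios]      # truncated weights on the trace
    deltas = [
        rhos[t] * (rewards[t] + gamma * next_values[t] - values[t])
        for t in range(n)
    ]
    targets = [0.0] * n
    acc = 0.0  # holds v_{t+1} - V(x_{t+1}) while sweeping backward
    for t in reversed(range(n)):
        acc = deltas[t] + gamma * cs[t] * acc
        targets[t] = values[t] + acc
    return targets


# On-policy check: with all ratios equal to 1 the targets are n-step returns.
on_policy = vtrace_targets([-1.0, -2.0], [0.0, 0.0], 0.0, [1.0, 1.0], gamma=1.0)
```

When the actor and learner policies coincide, every ratio is 1 and the targets reduce to ordinary n-step returns. When the learner assigns zero probability to the actor's actions, the targets fall back to the current value estimates.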

Paper Structure

This paper contains 5 sections, 2 equations, 5 figures, 2 tables, 1 algorithm.

Figures (5)

  • Figure 1: Optimal planning of DERs within the IEEE 123 testing feeder
  • Figure 2: Proposed framework for RLlib-IMPALA
  • Figure 3: Evaluation of the learning curves of DRL agents on RAY
  • Figure 4: Evaluations of the RLlib-IMPALA on controlling DER setpoints
  • Figure 5: Evaluations of the RLlib-IMPALA on controlling capacitors, transformers and active power injection