Residual Deep Reinforcement Learning for Inverter-based Volt-Var Control

Qiong Liu, Ye Guo, Lirong Deng, Haotian Liu, Dongyu Li, Hongbin Sun

TL;DR

Simulations demonstrate that RDRL and boosting RDRL considerably improve optimization performance throughout the learning stage, and they verify the rationale of both methods point by point: 1) inheriting the capability of the approximate model-based optimization, 2) residual policy learning, and 3) learning in a reduced action space.

Abstract

A residual deep reinforcement learning (RDRL) approach is proposed that integrates DRL with model-based optimization for inverter-based volt-var control in active distribution networks when the accurate power flow model is unknown. RDRL learns a residual action in a reduced residual action space, on top of the action produced by model-based optimization with an approximate model. RDRL inherits the control capability of the approximate-model-based optimization and enhances the policy optimization capability through residual policy learning. Additionally, reducing the residual action space improves the approximation accuracy of the critic and reduces the search difficulty of the actor. To address the problem of a residual action space that is "too small" or "too large" and to further improve optimization performance, we extend RDRL to a boosting RDRL approach, which selects a much smaller residual action space and learns a residual policy using the RDRL policy as its base policy. Simulations demonstrate that RDRL and boosting RDRL considerably improve optimization performance throughout the learning stage, and they verify the rationale of both methods point by point: 1) inheriting the capability of the approximate model-based optimization, 2) residual policy learning, and 3) learning in a reduced action space.
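
The way a residual action builds on a model-based action can be illustrated with a minimal sketch. The abstract does not give the exact composition rule, so everything below is an assumption made for illustration: the additive form, the actor output normalized to [-1, 1], the scaling by a half-width `delta`, the clipping to the original action bounds, and all function and parameter names (`compose_action`, `boosted_action`, `delta1`, `delta2`) are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def compose_action(
    a_base: Sequence[float],
    actor_out: Sequence[float],
    delta: float,
    a_min: Sequence[float],
    a_max: Sequence[float],
) -> List[float]:
    """Add a bounded residual to a base action (assumed additive form).

    a_base    : action from the base policy, e.g. approximate-model-based optimization
    actor_out : raw actor output, assumed to lie in [-1, 1]
    delta     : assumed half-width of the reduced residual action space
    """
    final = []
    for base, u, lo, hi in zip(a_base, actor_out, a_min, a_max):
        u = max(-1.0, min(1.0, u))        # enforce the assumed actor range
        a = base + delta * u              # residual action a_r = delta * u
        final.append(max(lo, min(hi, a)))  # stay inside the original action space
    return final


def boosted_action(
    a_m: Sequence[float],
    u_rdrl: Sequence[float],
    u_boost: Sequence[float],
    delta1: float,
    delta2: float,
    a_min: Sequence[float],
    a_max: Sequence[float],
) -> List[float]:
    """Two-stage sketch: the first-stage (RDRL) action becomes the base action,
    and a second residual is drawn from a smaller space (delta2 < delta1)."""
    base = compose_action(a_m, u_rdrl, delta1, a_min, a_max)
    return compose_action(base, u_boost, delta2, a_min, a_max)
```

In this sketch, a small `delta` keeps the final action close to the model-based action, which is one way to read "inheriting the capability" of the base optimizer; a second stage with a smaller `delta2` then refines the first-stage action within an even narrower range.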

Paper Structure (14 sections, 18 equations, 10 figures, 1 algorithm)

This paper contains 14 sections, 18 equations, 10 figures, 1 algorithm.

Figures (10)

  • Figure 1: Overall structure of the proposed residual DRL framework. $\mathbb{R}_a$ is the original action space, $\mathbb{R}_{a_r}$ is the residual action space, $a^*$ is the optimal action, $a_m$ is the action of model-based optimization with an approximate model, $a_r^*$ is the optimal residual action, and $a_r$ is the residual action of residual DRL.
  • Figure 2: The framework of Residual DRL.
  • Figure 3: The problem of "too small" or "too large" residual action space and their solution: boosting residual DRL (BRDRL).
  • Figure 4: The testing results of the model-based optimization with an accurate model (MBO), the model-based optimization with an approximate model (AMBO), deep reinforcement learning (DRL), residual DRL (RDRL), and boosting RDRL (BRDRL).
  • Figure 5: The reward results of the model-based optimization with an accurate model (MBO), the model-based optimization with an approximate model (AMBO), deep reinforcement learning (DRL), residual DRL (RDRL), and boosting RDRL (BRDRL) in the final 50 episodes. Here, the reward error is the result of model-based optimization with an accurate model minus the result of the given method.
  • ...and 5 more figures
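
The symbols defined in the captions of Figures 1 and 5 can be related in a short worked form. This is a sketch that assumes the residual is added to the model-based action, which the captions suggest but do not state. The symbols $a$, $e$, $R_{\text{MBO}}$, and $R_{\text{method}}$ are introduced here for illustration only and do not appear in the captions.

$$
a = a_m + a_r, \quad a_r \in \mathbb{R}_{a_r}, \qquad a_r^* = a^* - a_m, \qquad e = R_{\text{MBO}} - R_{\text{method}}
$$

Under this assumption, the learned residual $a_r$ only needs to cover the gap between the approximate-model action $a_m$ and the optimal action $a^*$. The reward error $e$ of Figure 5 measures how far a given method falls short of optimization with the accurate model.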