
Temporal Prototype-Aware Learning for Active Voltage Control on Power Distribution Networks

Feiyang Xu, Shunyu Liu, Yunpeng Qing, Yihe Zhou, Yuwen Wang, Mingli Song

TL;DR

This paper tackles the challenge of sustaining effective active voltage control on power distribution networks under long-term temporal distribution shifts, using only short-term training trajectories. It proposes Temporal Prototype-Aware learning (TPA), which combines a multi-scale dynamic encoder (a stacked transformer) to capture multi-timescale temporal dependencies with a temporal prototype-aware policy that matches ongoing states to seasonal prototypes for adaptive control. Key contributions include the design of 24 seasonal prototypes, a prototype-based retrieval mechanism, ablation analyses showing the importance of short-term memory and temporal priors, and transferability across PDN sizes. Empirical results on MAPDN benchmarks (141- and 322-bus) demonstrate that TPA outperforms state-of-the-art MARL baselines in both singular diurnal cycles and longer operational cycles, achieving a higher controllable rate and lower reactive-power loss, with practical implications for scalable, time-adaptive AVC in real-world grids.

Abstract

Active Voltage Control (AVC) on Power Distribution Networks (PDNs) aims to stabilize voltage levels to ensure efficient and reliable operation of power systems. With the increasing integration of distributed energy resources, recent efforts have explored employing multi-agent reinforcement learning (MARL) techniques to realize effective AVC. Existing methods mainly focus on the acquisition of short-term AVC strategies, i.e., only learning AVC within the short-term training trajectories of a singular diurnal cycle. However, due to the dynamic nature of load demands and renewable energy, the operation states of real-world PDNs may exhibit significant distribution shifts across varying timescales (e.g., daily and seasonal changes). This can render those short-term strategies suboptimal or even obsolete when performing continuous AVC over extended periods. In this paper, we propose a novel temporal prototype-aware learning method, abbreviated as TPA, to learn time-adaptive AVC under short-term training trajectories. At the heart of TPA are two complementary components, namely the multi-scale dynamic encoder and the temporal prototype-aware policy, which can be readily incorporated into various MARL methods. The former component integrates a stacked transformer network to learn underlying temporal dependencies at different timescales of the PDNs, while the latter implements a learnable prototype matching mechanism to construct a dedicated AVC policy that can dynamically adapt to the evolving operation states. Experimental results on the AVC benchmark with different PDN sizes demonstrate that the proposed TPA surpasses state-of-the-art counterparts not only in control performance but also in model transferability. Our code is available at https://github.com/Canyizl/TPA-for-AVC.
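The prototype matching step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes an already-encoded state embedding, represents the 24 seasonal prototypes as a learnable matrix, and uses cosine similarity with a softmax to produce a prototype-aware context vector that would condition the control policy. The function name and temperature parameter are hypothetical.

```python
import numpy as np

def prototype_match(state_embedding, prototypes, temperature=1.0):
    """Soft-match an encoded operation state against seasonal prototypes.

    Returns the matching weights over prototypes and the prototype-aware
    context vector that would condition the AVC policy.
    """
    # Cosine similarity between the state embedding and each prototype.
    s = state_embedding / np.linalg.norm(state_embedding)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    sims = p @ s                       # shape: (num_prototypes,)

    # Softmax over similarities gives the matching weights.
    logits = sims / temperature
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()

    # Prototype-aware context: weighted combination of the prototypes.
    context = weights @ prototypes     # shape: (embed_dim,)
    return weights, context

# Toy example: 24 seasonal prototypes (per the TL;DR) of dimension 32.
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(24, 32))
state = rng.normal(size=32)
weights, context = prototype_match(state, prototypes)
```

In training, the prototype matrix would be updated by gradient descent alongside the encoder and policy; a lower temperature sharpens the match toward a single seasonal prototype, while a higher one blends several.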

Paper Structure

This paper contains 28 sections, 11 equations, 8 figures, 7 tables.

Figures (8)

  • Figure 1: An example of the PDN. Each bus is indexed by a circle with a number. "$\mathbf{G}$" denotes the external generator. "$\mathbf{L}$" denotes loads. "sun" denotes the location of an installed PV. We control the voltages on buses 2--12. Buses 0--1 represent the main system with constant voltage outside the PDN.
  • Figure 2: Illustration of the proposed method. Module (i) extracts multi-scale temporal dependencies, while module (ii) constructs a prototype matching mechanism that enables agents to dynamically adjust their strategies.
  • Figure 3: Median CR and QL of algorithms with different voltage barrier functions. The sub-caption indicates metric-Barrier-scenario. "TPA-" refers to the combination of our framework with other methods, while "T-" represents the incorporation of the previous TMAAC with other methods. All experimental results are illustrated with the mean and the standard deviation of the metrics over 5 random seeds for a fair comparison. To make the results clearer for readers, we adopt a 50% confidence interval to plot the error region.
  • Figure 4: Training curves and test results of TPA and its ablations on the IEEE 322-bus system. "S" denotes the seasonal labels and "M" denotes short-term memory. "P" represents the learnable prototypes. "T-" represents the previous TMAAC method. "TPA" selects MADDPG as the basic algorithm.
  • Figure 5: Training curves and test results of transferability experiment on the 141-bus network. "Origin" denotes the normal TPA model. "Pretrained" denotes the TPA model with pre-trained prototypes on the 322-bus network. MADDPG is selected as the basic algorithm.
  • ...and 3 more figures