Policy Gradient-Based EMT-in-the-Loop Learning to Mitigate Sub-Synchronous Control Interactions

Sayak Mukherjee, Ramij R. Hossain, Kaustav Chatterjee, Sameer Nekkalapu, Marcelo Elizondo

TL;DR

To address sub-synchronous control interactions (SSCIs) caused by mis-tuned inverter controls under specific grid configurations, the paper develops an EMT-in-the-loop framework that learns adaptive outer/inner control gains using a simple deep policy-gradient RL agent interfaced with PSCAD. The approach formulates SSCI mitigation as an MDP with continuous state and action spaces, applies SSCI-specific data processing (down-sampling and band-pass filtering) to form observation windows, and uses an energy-based reward $R=-E_{osc}$, where $E_{osc}=\int_0^{T_w} (P_f(\tau)-P_{nom})^2 d\tau$. The method is demonstrated on a real-world Texas SSCI scenario, where the learned policy gradually reduces oscillation energy over training and suppresses the detrimental dynamics when deployed mid-event. The work highlights the practical value of adaptive damping for inverter-rich grids and limits the EMT simulation burden by restricting action exploration.
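As a rough illustration of the energy-based reward, the sketch below approximates $E_{osc}$ with the trapezoidal rule over one observation window and returns $R=-E_{osc}$. The function names, the sample spacing `dt`, the window length, and the synthetic 20 Hz test signal are all assumptions made for this example; they are not taken from the paper.

```python
import math

def oscillation_energy(p_filtered, p_nom, dt):
    """Trapezoidal approximation of E_osc = integral of (P_f - P_nom)^2 over the window."""
    sq = [(p - p_nom) ** 2 for p in p_filtered]
    return sum(0.5 * (a + b) * dt for a, b in zip(sq, sq[1:]))

def reward(p_filtered, p_nom, dt):
    """Energy-based reward R = -E_osc: less oscillation energy gives a higher reward."""
    return -oscillation_energy(p_filtered, p_nom, dt)

# Hypothetical window: 1 s of filtered power sampled every 1 ms (assumed values).
dt = 1e-3
t_w = 1.0
n = int(round(t_w / dt)) + 1
p_nom = 100.0   # assumed nominal power
amp = 5.0       # assumed oscillation amplitude
f_osc = 20.0    # assumed sub-synchronous frequency in Hz

window = [p_nom + amp * math.sin(2 * math.pi * f_osc * k * dt) for k in range(n)]
r_osc = reward(window, p_nom, dt)          # oscillating window
r_flat = reward([p_nom] * n, p_nom, dt)    # perfectly damped window
```

For a sinusoid of amplitude $A$ spanning whole periods, the integral evaluates to $A^2 T_w / 2$, so the oscillating window above scores about $-12.5$ while the flat window scores $0$. An agent maximizing this reward is therefore pushed toward gain settings that damp the oscillation.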

Abstract

This paper explores the development of learning-based tunable control gains using an EMT-in-the-loop simulation framework (e.g., PSCAD interfaced with Python-based learning modules) to address critical sub-synchronous oscillations. Since sub-synchronous control interactions (SSCI) arise from the mis-tuning of control gains under specific grid configurations, effective mitigation strategies require adaptive re-tuning of these gains. Such adaptiveness can be achieved by employing a closed-loop, learning-based framework that considers the grid conditions responsible for such sub-synchronous oscillations. This paper addresses this need by adopting methodologies inspired by Markov decision process (MDP) based reinforcement learning (RL), with a particular emphasis on simpler deep policy gradient methods augmented with SSCI-specific signal processing modules such as down-sampling, band-pass filtering, and oscillation-energy-dependent reward computation. Our experiments in a real-world event setting demonstrate that a policy trained with the deep policy gradient method can adaptively compute gain settings in response to varying grid conditions and suppress control-interaction-induced oscillations.
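The signal-processing steps named in the abstract (down-sampling followed by band-pass filtering) could look roughly like the sketch below. It is not the authors' pipeline: the block-average decimation is a crude stand-in for proper anti-aliased decimation, the filter is a generic second-order band-pass biquad, and the sampling rates, center frequency, and quality factor are assumed values chosen only for illustration. In practice, library routines such as `scipy.signal.decimate` and `scipy.signal.butter` would likely be preferable.

```python
import math

def downsample(signal, factor):
    """Block-average decimation (assumed stand-in for anti-aliased down-sampling)."""
    n_blocks = len(signal) // factor
    return [sum(signal[i * factor:(i + 1) * factor]) / factor for i in range(n_blocks)]

def bandpass(signal, fs, f_center, q):
    """Second-order band-pass biquad with unity gain at f_center and zero gain at DC."""
    w0 = 2.0 * math.pi * f_center / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0          # b1 is zero for this filter
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out

# Hypothetical raw EMT power trace: 2 s at 20 kHz, nominal level plus a 20 Hz oscillation.
fs_raw = 20000.0
factor = 20                       # assumed decimation factor, giving 1 kHz
p_raw = [100.0 + 5.0 * math.sin(2 * math.pi * 20.0 * k / fs_raw) for k in range(40000)]

p_ds = downsample(p_raw, factor)
p_f = bandpass(p_ds, fs_raw / factor, f_center=20.0, q=1.0)

tail = p_f[-500:]                 # steady-state portion, after the start-up transient
peak = max(abs(v) for v in tail)
mean_tail = sum(tail) / len(tail)
```

In this synthetic case the filter strips the constant nominal level and passes the 20 Hz component almost unchanged, so `peak` stays close to the injected amplitude of 5 and `mean_tail` stays close to zero. Note that this band-pass output is already centered on zero, so whether a nominal value is subtracted afterwards depends on how $P_f$ and $P_{nom}$ are defined in the paper, which this summary does not specify.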

Paper Structure

This paper contains 5 sections, 17 equations, 4 figures, 1 table, 1 algorithm.

Figures (4)

  • Figure 1: Schematic of the AI-based Mitigation of SSCI with EMT-in-the-loop Training
  • Figure 2: Single-line diagram of the test system, used to replicate the SSCI event from Texas, showing the path of oscillation energy flow or control interaction.
  • Figure 4: $abc$ phase waveforms and $dq$ components of the current injected by the WG in the Texas system.
  • Figure 6: Policy gradient training progress