A Reinforcement Learning Engine with Reduced Action and State Space for Scalable Cyber-Physical Optimal Response
Shining Sun, Khandaker Akramul Haque, Xiang Huo, Leen Al Homoud, Shamina Hossain-McKenzie, Ana Goulart, Katherine Davis
TL;DR
The paper addresses scalable, optimal response for cyber-physical power systems under disturbances, notably Denial of Service (DoS) attacks. It introduces RL-RID-GridResponder, a reinforcement learning (RL) engine augmented with Role and Interaction Discovery (RID), which shrinks the action and state spaces by identifying essential, critical, and redundant controllers and by fusing cyber and physical data. The approach combines multimodal data fusion (PCA followed by $t$-SNE), a state-evaluation module, RID, and policy-based RL (PPO and A2C) to perform Volt-Var control under DoS conditions, and is validated on augmented WSCC 9-bus and IEEE 24-bus test systems within the RESLab/PowerGym/OpenDSS environment. Results show that PPO with RID converges faster and keeps voltages within $\pm 5\%$ while reducing the action space by about $15$–$17\%$, indicating practical gains in resilience and real-time operation for large-scale cyber-physical power systems.
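The idea of shrinking the action space by setting aside redundant controllers can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the controller names, the role labels, and the choice to measure reduction by controller count are hypothetical and are not taken from the paper, where the roles would come from the RID module.

```python
# Hypothetical role labels for five controllers. In the paper these labels
# would be produced by the RID module; the values below are made up.
ROLES = {
    "gen_1": "essential",
    "gen_2": "critical",
    "cap_1": "critical",
    "cap_2": "redundant",
    "reg_1": "essential",
}


def reduce_action_space(roles):
    """Return the controllers the agent still acts on (non-redundant ones)."""
    return [name for name, role in roles.items() if role != "redundant"]


def reduction_percent(roles):
    """Percentage of controllers removed from the action space.

    Assumption: the action-space size is measured by controller count.
    """
    kept = reduce_action_space(roles)
    return 100.0 * (len(roles) - len(kept)) / len(roles)


active = reduce_action_space(ROLES)
print(active)
print(reduction_percent(ROLES))
```

The toy numbers are illustrative only and are not meant to reproduce the reduction reported above.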
Abstract
Numerous studies have sought to enhance the resilience of cyber-physical systems (CPSs) by detecting potential cyber or physical disturbances. However, the development of scalable and optimal response measures for power system contingencies based on fused cyber-physical data is still at an early stage. To address this gap, this paper introduces RL-RID-GridResponder, a power system response engine based on reinforcement learning (RL) and role and interaction discovery (RID) techniques. RL-RID-GridResponder is designed to automatically detect contingencies and assist decision-making to ensure optimal power system operation. It learns via an RL-based structure and achieves enhanced scalability by integrating an RID module that reduces the action and state spaces. Its ability to provide scalable and optimal responses for CPSs is demonstrated on power systems under Denial of Service (DoS) attacks. Simulations are conducted on a Volt-Var regulation problem using the augmented WSCC 9-bus and augmented IEEE 24-bus systems with fused cyber and physical data sets. The results show that the proposed RL-RID-GridResponder provides fast and accurate responses that ensure optimal power system operation under DoS, and that it can extend to other system contingencies such as line outages and loss of load.
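The Volt-Var regulation objective, with bus voltages held within $\pm 5\%$ of nominal as stated in the TL;DR, can be sketched as a simple band check. This is a minimal illustration under stated assumptions: voltages are taken to be in per unit with a nominal value of 1.0, and the penalty function is a hypothetical reward term, not the reward used in the paper.

```python
# Assumed voltage band: +/-5% around a nominal 1.0 p.u.
V_MIN, V_MAX = 0.95, 1.05


def voltage_violations(voltages_pu):
    """Return the indices of buses whose voltage lies outside the band."""
    return [i for i, v in enumerate(voltages_pu) if not (V_MIN <= v <= V_MAX)]


def voltage_penalty(voltages_pu):
    """Hypothetical penalty: total distance outside the band, in p.u.

    Evaluates to zero when every bus voltage is inside [V_MIN, V_MAX].
    """
    return sum(max(V_MIN - v, 0.0, v - V_MAX) for v in voltages_pu)


sample = [1.0, 0.93, 1.06, 1.05]  # made-up bus voltages in p.u.
print(voltage_violations(sample))
print(voltage_penalty(sample))
```

An RL agent could use the negative of such a penalty as one term of its reward, so that actions restoring voltages to the band are favored.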
