On the Use of Reinforcement Learning for Attacking and Defending Load Frequency Control
Amr S. Mohamed, Deepa Kundur
TL;DR
This work addresses vulnerabilities in load frequency control (LFC) by applying deep reinforcement learning to synthesize false data injection and load-switching attacks, producing diverse attack datasets on microgrid models. The authors develop an RL-based attacker using deep deterministic policy gradient (DDPG), then use the RL-generated data to train a supervised LSTM detector, which they compare with a state-of-the-art unsupervised autoencoder detector to highlight the strengths and weaknesses of each approach. They also propose an integrated defense that combines anomaly detection with supervised attack detection, achieving high accuracy in identifying malicious activity while reducing false positives. The study demonstrates how RL-based attack synthesis can inform vulnerability testing and defense design, enabling proactive security for cyber-physical power systems despite limitations such as training time and model simplifications.
Abstract
The electric grid is an attractive target for cyberattackers given its critical role in society. With the increasing sophistication of cyberattacks, effective grid defense will benefit from proactively identifying vulnerabilities and attack strategies. We develop a deep reinforcement learning-based method that identifies vulnerabilities in load frequency control, an essential process that maintains grid security and reliability. We demonstrate how our method can synthesize a variety of attacks involving false data injection and load switching under specified attack and threat models, providing insight into potential attack strategies and their impact. We discuss how our approach can be employed for testing electric grid vulnerabilities. Moreover, our method can generate data to inform the design of defense strategies and the development of attack detection methods. To this end, we design and compare a (deep learning-based) supervised attack detector with an unsupervised anomaly detector to highlight the benefits of developing defense strategies based on identified attack strategies.
