The Wisdom of the Crowd: High-Fidelity Classification of Cyber-Attacks and Faults in Power Systems Using Ensemble and Machine Learning

Emad Abukhousa, Syed Sohail Feroz Syed Afroz, Fahad Alsaeed, Abdulaziz Qwbaiban, Saman Zonouz, A. P. Sakis Meliopoulos

TL;DR

This work tackles the challenge of reliably classifying cyber-attacks and physical faults in power systems with high inverter-based resource (IBR) penetration. It introduces a high-fidelity, streaming-aware evaluation framework and benchmarks 12 ML models on electromagnetic transient (EMT) simulations (WinIGS) with COMTRADE-format data, using cycle-aware post-processing and a confidence threshold to stabilize decisions. The study finds that offline accuracies can approach 99.9%, yet streaming performance varies markedly: the MLP achieves the highest coverage (≈98–99%), while ensembles remain precise but often abstain. The results also reveal a sub-cycle latency challenge, with average inference time of about 60 ms exceeding a 50 ms relay target. By releasing open data and code, the paper provides a reproducible baseline and underscores the need for streaming-aware evaluation to guide deployment of protection strategies in IBR-rich grids.

Abstract

This paper presents a high-fidelity evaluation framework for machine learning (ML)-based classification of cyber-attacks and physical faults using electromagnetic transient simulations with digital substation emulation at 4.8 kHz. Twelve ML models, including ensemble algorithms and a multi-layer perceptron (MLP), were trained on labeled time-domain measurements and evaluated in a real-time streaming environment designed for sub-cycle responsiveness. The architecture incorporates a cycle-length smoothing filter and a confidence threshold to stabilize decisions. Results show that while several models achieved near-perfect offline accuracies (up to 99.9%), only the MLP sustained robust coverage (98–99%) under streaming, whereas ensembles preserved perfect anomaly precision but abstained frequently (10–49% coverage). These findings demonstrate that offline accuracy alone is an unreliable indicator of field readiness and underscore the need for realistic testing and inference pipelines to ensure dependable classification in inverter-based resource (IBR)-rich networks.
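The abstract names a cycle-length smoothing filter and a confidence threshold but does not spell out their form. The following is a minimal sketch of one plausible reading, not the authors' implementation: per-sample predictions below a threshold are treated as abstentions, and a strict majority vote over one power cycle produces the streamed decision. The 60 Hz nominal frequency, the majority-vote rule, the default threshold value, and all names (`CycleSmoother`, `ABSTAIN`, `coverage`) are assumptions introduced for illustration; only the 4.8 kHz sampling rate comes from the text.

```python
from collections import Counter, deque

ABSTAIN = "abstain"  # hypothetical label meaning "no confident decision"


def samples_per_cycle(sample_rate_hz=4800, nominal_hz=60):
    """Samples in one power cycle. 4.8 kHz is from the abstract; 60 Hz is assumed."""
    return int(round(sample_rate_hz / nominal_hz))


class CycleSmoother:
    """Majority vote over a sliding window, with a per-sample confidence gate."""

    def __init__(self, window, threshold=0.9):
        # threshold=0.9 is an illustrative value, not taken from the paper.
        self.window = deque(maxlen=window)
        self.threshold = threshold

    def update(self, label, confidence):
        # Low-confidence samples enter the window as abstentions.
        self.window.append(label if confidence >= self.threshold else ABSTAIN)
        # Withhold a decision until a full window of samples has been seen.
        if len(self.window) < self.window.maxlen:
            return ABSTAIN
        winner, count = Counter(self.window).most_common(1)[0]
        # Emit a class only if it holds a strict majority of the window.
        if winner != ABSTAIN and count > len(self.window) / 2:
            return winner
        return ABSTAIN


def coverage(decisions):
    """Fraction of streamed decisions that are not abstentions."""
    if not decisions:
        return 0.0
    return sum(d != ABSTAIN for d in decisions) / len(decisions)


if __name__ == "__main__":
    smoother = CycleSmoother(window=samples_per_cycle())
    # Synthetic stream: two confident cycles, then one low-confidence cycle.
    stream = [("normal", 0.99)] * 160 + [("fault", 0.40)] * 80
    decisions = [smoother.update(lbl, conf) for lbl, conf in stream]
    print(f"coverage = {coverage(decisions):.2f}")
```

Under this reading, a model whose confidence often falls below the threshold abstains frequently and shows low coverage even when its confident predictions are correct, which is consistent with the ensemble behaviour the abstract reports.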

Paper Structure

This paper contains 12 sections, 7 equations, 5 figures, 2 tables, 1 algorithm.

Figures (5)

  • Figure 1: High-fidelity microgrid testbed with inverter-based resources (IBR) integration.
  • Figure 2: Model responses to five sequential anomalies: SLG A-N (1.0–1.2 s), LL B-C (2.0–2.2 s), DLG AC-N (3.0–3.2 s, red), CT ratio attack MU32 (4.0–4.2 s), and PT ratio attack MU23 (5.0–5.2 s), each ~0.2 s in duration.
  • Figure 3: MLP neural network response to five testing anomalies, showing waveforms and confidence scores.
  • Figure 4: Zoomed-in view of the MLP model response to an LLG fault and CT attack, demonstrating correct classification of inter-event transients.
  • Figure 5: MLP model response during initial system energization, correctly classifying high inrush currents as normal operation.