AI-Driven Cybersecurity Testbed for Nuclear Infrastructure: Comprehensive Evaluation Using METL Operational Data
Benjamin Blakely, Yeni Li, Akshay Dave, Derek Kultgen, Rick Vilim
TL;DR
The paper addresses AI-driven cybersecurity for nuclear facility OT/IT networks, using the Mechanisms Engineering Test Loop (METL) as a realistic cyber-physical testbed. It evaluates four detection paradigms (Change Point Detection, LSTM anomaly detection, Dependency Violation analysis, Autoencoder reconstruction) against a 15-scenario attack taxonomy at five severity tiers, for 300 experiments in total. Change Point Detection leads with a mean AUC of 0.785, performance varies substantially across attack scenarios, and physics-violation attacks are highly detectable. These results inform deployment strategies, give practical guidance for ensemble, physics-informed defense architectures in critical nuclear infrastructure, and establish benchmarks for operational deployment.
Abstract
Advanced nuclear reactor systems face increasing cybersecurity threats as sophisticated attackers exploit cyber-physical interfaces to manipulate control systems while evading traditional IT security measures. This research evaluates artificial intelligence approaches for cybersecurity protection in nuclear infrastructure, using Argonne National Laboratory's Mechanisms Engineering Test Loop (METL) as an experimental platform. We developed a systematic evaluation framework covering four machine learning detection paradigms: Change Point Detection, LSTM-based Anomaly Detection, Dependency Violation analysis, and Autoencoder reconstruction. Our attack taxonomy includes 15 distinct scenarios targeting reactor control systems, each implemented at five severity tiers to measure detection performance under varying attack intensities. The evaluation comprised 300 experiments using realistic METL operational data. Change Point Detection was the leading approach with a mean AUC of 0.785, followed by LSTM Anomaly Detection (0.636), Dependency Violation (0.621), and Autoencoder methods (0.580). Attack detectability varied substantially across scenarios: multi-site coordinated attacks were the most detectable (AUC = 0.739), while precision trust decay attacks were the hardest to detect (AUC = 0.592). This work delivers practical performance benchmarks and a reference architecture for AI-based cybersecurity in critical nuclear infrastructure, providing a foundation for operational deployment and improved threat response in cyber-physical systems.
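The experiment count and the AUC summaries reported above can be illustrated with a minimal Python sketch. It assumes, as an inference from the numbers rather than a statement in the text, that the 300 experiments form a full grid of 4 detectors × 15 scenarios × 5 severity tiers. The scenario names, tier labels, and the helper functions experiment_grid, roc_auc, and mean_auc_by_detector are hypothetical and are not taken from the authors' framework. The AUC is computed with the standard pairwise-ranking definition.

```python
from itertools import product

# The four detector names come from the abstract; the scenario and tier
# identifiers below are placeholders (assumptions), not the paper's labels.
DETECTORS = ["change_point", "lstm_anomaly", "dependency_violation", "autoencoder"]
SCENARIOS = [f"scenario_{i:02d}" for i in range(1, 16)]  # 15 attack scenarios
TIERS = [1, 2, 3, 4, 5]                                  # 5 severity tiers


def experiment_grid():
    """Enumerate every (detector, scenario, tier) combination."""
    return list(product(DETECTORS, SCENARIOS, TIERS))


def roc_auc(scores, labels):
    """AUC as the probability that a randomly chosen attack sample (label 1)
    receives a higher anomaly score than a randomly chosen normal sample
    (label 0). Ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both attack and normal samples are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_auc_by_detector(results):
    """Average AUC per detector.

    `results` maps (detector, scenario, tier) -> AUC for that experiment.
    """
    grouped = {}
    for (detector, _scenario, _tier), auc in results.items():
        grouped.setdefault(detector, []).append(auc)
    return {d: sum(v) / len(v) for d, v in grouped.items()}


if __name__ == "__main__":
    grid = experiment_grid()
    print(len(grid))  # 4 * 15 * 5 combinations
    print(roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

Averaging per-experiment AUC over scenarios and tiers is one plausible way to obtain a single figure per detector, such as the reported 0.785; the actual aggregation used in the study is not specified here.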
