Federated Learning for Efficient Condition Monitoring and Anomaly Detection in Industrial Cyber-Physical Systems
William Marfo, Deepak K. Tosh, Shirley V. Moore
TL;DR
The paper tackles reliable real-time condition monitoring and anomaly localization in industrial CPS by extending federated learning with adaptive client aggregation, dynamic node selection, and Weibull-based checkpointing. It integrates SOM-based anomaly detection and component-level localization, validated on NASA Bearing and Hydraulic Systems datasets, achieving up to 99.5% AUC-ROC and approximately 2× faster execution than FedAvg while remaining resilient to node failures. Statistical validation via the Mann-Whitney U test (p < 0.05) confirms significant improvements in detection performance and efficiency over state-of-the-art FL methods. The framework offers CPS-specific robustness and scalability, enabling efficient, distributed monitoring with reduced downtime and better fault isolation in industrial settings.
Abstract
Detecting and localizing anomalies in cyber-physical systems (CPS) has become increasingly challenging as systems grow in complexity, particularly due to varying sensor reliability and node failures in distributed environments. While federated learning (FL) provides a foundation for distributed model training, existing approaches often lack mechanisms to address these CPS-specific challenges. This paper introduces an enhanced FL framework with three key innovations: adaptive model aggregation based on sensor reliability, dynamic node selection for resource optimization, and Weibull-based checkpointing for fault tolerance. The proposed framework ensures reliable condition monitoring while tackling the computational and reliability challenges of industrial CPS deployments. Experiments on the NASA Bearing and Hydraulic System datasets demonstrate superior performance compared to state-of-the-art FL methods, achieving 99.5% AUC-ROC in anomaly detection and maintaining accuracy even under node failures. Statistical validation using the Mann-Whitney U test confirms statistically significant improvements (p < 0.05) in both detection accuracy and computational efficiency across various operational scenarios.
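To make the idea of adaptive model aggregation based on sensor reliability concrete, the following is a minimal sketch rather than the authors' implementation. It assumes each client reports a flat parameter vector together with a non-negative reliability score, and that the server averages parameters with weights proportional to those scores. The function name `aggregate_by_reliability` and the way the scores are obtained are hypothetical; the abstract does not specify either.

```python
from typing import List


def aggregate_by_reliability(
    client_params: List[List[float]],
    reliabilities: List[float],
) -> List[float]:
    """Average client parameter vectors, weighted by normalized reliability.

    Assumption (not from the paper): reliability is a non-negative scalar
    per client, and the global model is the reliability-weighted mean.
    """
    if not client_params or len(client_params) != len(reliabilities):
        raise ValueError("need one reliability score per client")
    if any(r < 0 for r in reliabilities):
        raise ValueError("reliability scores must be non-negative")

    total = sum(reliabilities)
    if total <= 0:
        raise ValueError("at least one client must have positive reliability")

    dim = len(client_params[0])
    aggregated = [0.0] * dim
    for params, score in zip(client_params, reliabilities):
        if len(params) != dim:
            raise ValueError("all clients must report vectors of equal length")
        weight = score / total  # normalized contribution of this client
        for i, value in enumerate(params):
            aggregated[i] += weight * value
    return aggregated


if __name__ == "__main__":
    # Two clients; the second is treated as three times more reliable.
    global_params = aggregate_by_reliability(
        [[1.0, 2.0], [3.0, 4.0]],
        [1.0, 3.0],
    )
    print(global_params)
```

With equal reliability scores this reduces to a plain unweighted mean of the client vectors, so under this assumption a less reliable sensor node simply pulls the global model less than a more reliable one.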
