DyEdgeGAT: Dynamic Edge via Graph Attention for Early Fault Detection in IIoT Systems

Mengjie Zhao, Olga Fink

TL;DR

DyEdgeGAT significantly outperforms baseline methods in fault detection (FD), particularly at early stages when fault severity is low, and remains robust under novel operating conditions (OCs).

Abstract

In the Industrial Internet of Things (IIoT), condition monitoring sensor signals from complex systems often exhibit nonlinear and stochastic spatial-temporal dynamics under varying conditions. These complex dynamics make fault detection particularly challenging. While previous methods effectively model these dynamics, they often neglect the evolution of relationships between sensor signals. Undetected shifts in these relationships can lead to significant system failures. Furthermore, these methods frequently misidentify novel operating conditions as faults. Addressing these limitations, we propose DyEdgeGAT (Dynamic Edge via Graph Attention), a novel approach for early-stage fault detection in IIoT systems. DyEdgeGAT's primary innovation lies in a novel graph inference scheme for multivariate time series that tracks the evolution of relationships between time series, enabled by dynamic edge construction. Another key innovation of DyEdgeGAT is its ability to incorporate operating condition contexts into node dynamics modeling, enhancing its accuracy and robustness. We rigorously evaluated DyEdgeGAT using both a synthetic dataset, simulating varying levels of fault severity, and a real-world industrial-scale multiphase flow facility benchmark with diverse fault types under varying operating conditions and detection complexities. The results show that DyEdgeGAT significantly outperforms other baseline methods in fault detection, particularly in the early stages with low severity, and exhibits robust performance under novel operating conditions.

Paper Structure (35 sections, 14 equations, 8 figures, 11 tables)

Figures (8)

  • Figure 1: Overview of the Dynamic Edge via Graph Attention (DyEdgeGAT). Starting from raw sensor measurements $\mathbf{X}^{t_w:t}$ and system-independent variables $\mathbf{U}^{t_w:t}$, the process involves: (1) Dynamic edge construction, where the model infers and tracks evolving interdependencies between time series. (2) Operating condition aware node dynamics extraction, augmented by operating condition context via GRU and Layer Normalization (LN) modules. (3) Dynamic interaction learning, with two Graph Isomorphism Network (GIN) layers and a Batch Normalization (BN) layer in between. (4) Reverse signal reconstruction augments operating condition context and reconstructs the original sensor signals in the reversed order. (5) Temporal topology-informed anomaly scoring, leveraging the learned temporal graph structure to balance different strengths of dynamics in the heterogeneous signals. In the training phase, the model minimizes reconstruction loss using normal data. During the testing phase, the model employs reconstruction discrepancies, adjusted by interaction strengths among sensor nodes for anomaly scoring.
  • Figure 2: Pronto Dataset Fault Class Statistics: (a) t-SNE embedding space illustrating normal and faulty raw sequence data under two flow conditions. (b) Violin plot showing the density distribution of the process variable air outlet valve 3-phase separator (PIC501).
  • Figure 3: Comparison of model performance across different scaling factors on the synthetic dataset. A scaling factor closer to 1 indicates lower fault severity. The shaded areas around the performance lines indicate variance in model performance across multiple runs.
  • Figure 4: Comparison of fault detection performance on the Pronto Dataset under different fault types.
  • Figure 5: Comparative model evaluation on the Pronto dataset under novel operating conditions, specifically for the slugging condition. The ambiguity metric reflects the model's inability to distinguish between normal and novel operating conditions.
  • ...and 3 more figures
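
The testing-phase scoring described in the Figure 1 caption (reconstruction discrepancies adjusted by interaction strengths among sensor nodes) can be illustrated with a minimal sketch. The caption does not give the formula, so everything here is an assumption made for illustration: the function names, the use of summed incoming attention weights as a node's "strength", the division of each sensor's error by that strength, and the mean over sensors as the final aggregate. It is not the authors' implementation.

```python
from statistics import mean


def reconstruction_errors(x, x_hat):
    """Per-sensor mean absolute error over one window.

    x, x_hat: list of sensors, each a list of values over the window.
    """
    return [
        mean(abs(a - b) for a, b in zip(signal, recon))
        for signal, recon in zip(x, x_hat)
    ]


def node_strengths(attention):
    """Hypothetical strength measure: sum of incoming edge weights per node.

    attention[i][j] is the weight of the edge from node i to node j.
    """
    n = len(attention)
    return [sum(attention[i][j] for i in range(n)) for j in range(n)]


def anomaly_score(x, x_hat, attention, eps=1e-8):
    """Assumed scoring rule: divide each sensor's error by its strength,
    then average across sensors to get one score per window."""
    errs = reconstruction_errors(x, x_hat)
    strengths = node_strengths(attention)
    per_sensor = [e / (s + eps) for e, s in zip(errs, strengths)]
    return mean(per_sensor), per_sensor


# Toy window: two sensors, two time steps.
x = [[1.0, 2.0], [0.0, 0.0]]        # observed signals
x_hat = [[1.0, 1.0], [0.0, 1.0]]    # reconstructed signals
attention = [[0.5, 0.5],
             [1.5, 0.5]]            # learned edge weights (made up)

score, per_sensor = anomaly_score(x, x_hat, attention)
print(score, per_sensor)
```

In this toy window both sensors have the same raw error, but the sensor with the larger assumed strength contributes less to the score. A window would then be flagged as faulty when its score exceeds a threshold chosen on normal data; how that threshold is set is not stated in this section.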