Towards Differentiating Between Failures and Domain Shifts in Industrial Data Streams

Natalia Wojak-Strzelecka, Szymon Bobek, Grzegorz J. Nalepa, Jerzy Stefanowski

Abstract

Anomaly and failure detection methods are crucial for identifying deviations from normal operating conditions, so that action can be taken in advance and more serious damage can usually be prevented. Long-lasting deviations indicate failures, while sudden, isolated changes in the data indicate anomalies. However, in many practical applications, changes in the data do not always represent abnormal system states. Such changes may be incorrectly recognized as failures when they are in fact a normal evolution of the system, for example when processing of a new product begins, which constitutes a domain shift. Distinguishing between failures and such "healthy" changes in data distribution is therefore critical to the practical robustness of the system. In this paper, we propose a method that not only detects anomalies and changes in the data distribution but also distinguishes between failures and normal domain shifts inherent to a given process. The proposed method consists of a modified Page-Hinkley changepoint detector, which identifies domain shifts and possible failures, and supervised domain-adaptation-based algorithms for fast, online anomaly detection. These two components are coupled with an explainable artificial intelligence (XAI) component that helps the human operator make the final distinction between domain shifts and failures. The method is illustrated by an experiment on a data stream from a steel factory.
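
For orientation, the changepoint-detection step can be illustrated with the textbook Page-Hinkley test. The sketch below is not the modified detector proposed in the paper, whose details are not given in the abstract. It is the standard one-sided variant that signals an upward shift in the mean of a univariate stream. The class name `PageHinkley`, the helper `detect_changes`, and the parameter values `delta`, `threshold` and `min_samples` are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class PageHinkley:
    """Textbook one-sided Page-Hinkley test (upward mean shift).

    Illustrative sketch only: NOT the modified detector from the paper.
    All default parameter values are assumptions.
    """
    delta: float = 0.005      # tolerated magnitude of change
    threshold: float = 50.0   # alarm threshold (often written lambda)
    min_samples: int = 30     # warm-up length before alarms are allowed

    def __post_init__(self) -> None:
        self.reset()

    def reset(self) -> None:
        """Forget all statistics, e.g. after a change has been signalled."""
        self.n = 0
        self.mean = 0.0       # running mean of the stream
        self.cum = 0.0        # cumulative deviation from the running mean
        self.cum_min = 0.0    # smallest cumulative deviation seen so far

    def update(self, x: float) -> bool:
        """Consume one observation; return True if a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        changed = (
            self.n >= self.min_samples
            and self.cum - self.cum_min > self.threshold
        )
        if changed:
            self.reset()
        return changed


def detect_changes(
    stream: Iterable[float], detector: Optional[PageHinkley] = None
) -> List[int]:
    """Return the indices at which the detector signals a change."""
    detector = detector or PageHinkley()
    return [i for i, x in enumerate(stream) if detector.update(x)]


if __name__ == "__main__":
    # Synthetic stream: the mean jumps from 0 to 5 at index 200.
    signal = [0.0] * 200 + [5.0] * 200
    print(detect_changes(signal))
```

A detector of this kind only reports that the distribution has changed. It cannot say whether the change is a failure or a healthy domain shift, which is the gap the domain-adaptation and XAI components described above are meant to address.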

Paper Structure (11 sections, 3 equations, 7 figures, 1 table)

Figures (7)

  • Figure 1: Flow chart for differentiating between failures and domain shifts. The changepoint detection algorithm marks possible domain shifts and failures. The domain adaptation algorithm then updates the model that detects anomalies. At the same time, a human operator, assisted by the XAI algorithm, decides whether the change in the data represents a healthy domain shift or a failure.
  • Figure 2: Raw signal for current_2 on the left, SHAP values plotted in batches for current_2 on the right.
  • Figure 3: Schematic diagram of a four-stand cold rolling mill.
  • Figure 4: Evaluation of anomaly detection algorithms on target products, using the important signals from rolling stand 2.
  • Figure 5: Median SHAP values for (a) all parameters (left) and (b) the parameters important for anomaly detection (right).
  • ...and 2 more figures