Stream-Based Active Learning for Process Monitoring

Christian Capezza, Antonio Lepore, Kamran Paynabar

TL;DR

This paper proposes a novel stream-based active learning strategy for statistical process monitoring (SPM) that selects the most informative data points for labeling under a limited budget. It extends pool-based active learning to real-time settings by providing a labeling decision for each incoming data point.

Abstract

Statistical process monitoring (SPM) methods are essential tools in quality management for checking the stability of industrial processes, i.e., for dynamically classifying the process state as in control (IC), under normal operating conditions, or out of control (OC) otherwise. Traditional SPM methods are based on unsupervised approaches, which are popular because in most industrial applications the true OC states of the process are not explicitly known. This has hampered the development of supervised methods, which could instead take advantage of process data labeled with the true process state. Supervised methods also still need improvement in two respects: handling class imbalance, since OC states are rare in high-quality processes, and dynamically recognizing unseen classes, e.g., when the number of possible OC states is unknown. This article presents a novel stream-based active learning strategy for SPM that enhances partially hidden Markov models to deal with data streams. The ultimate goal is to optimize labeling resources constrained by a limited budget and to dynamically update the set of possible OC states. The performance of the proposed method in classifying the true state of the process is assessed through a simulation study and a case study on the SPM of a resistance spot welding process in the automotive industry, which motivated this research.

Paper Structure

This paper contains 19 sections, 20 equations, 8 figures, 1 algorithm.

Figures (8)

  • Figure 1: Flow diagram of the proposed method.
  • Figure 2: Graphical representation of a hidden Markov model.
  • Figure 3: Example of a simulated sequence of states. The state sequence is divided into three parts. The first part ($t \in \lbrace 1,\dots,T_{\text{init}}=100 \rbrace$) contains only initial IC state 1 values, in the second part ($t \in \lbrace 101,\dots, 350 \rbrace$) the process transitions to OC state 2, in the third part ($t \in \lbrace 351,\dots,600 \rbrace$) the process transitions to a new OC state 3. A line connects all the simulated states.
  • Figure 4: F1 score plotted as a function of the shift size for each available budget and number of variables.
  • Figure 5: F1 score, precision, and recall scores achieved with the proposed method, plotted as a function of shift size, for each available budget and $p=20$. Each line corresponds to a given weight of the exploitation criterion $w^{\text{exp}}$.
  • ...and 3 more figures
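
The three-part layout described in the Figure 3 caption can be sketched in a few lines of Python. This is only an illustration of the segment boundaries given in the caption ($T_{\text{init}}=100$, then $t=101,\dots,350$, then $t=351,\dots,600$). The function name `simulate_state_sequence` and its parameters are hypothetical. The sketch also assumes, for simplicity, that the process stays in the newly reached state for the whole segment; the paper's simulation may switch between states within the second and third parts.

```python
import numpy as np

def simulate_state_sequence(t_init=100, t_second=350, t_end=600):
    """Piecewise-constant state sequence following the Figure 3 caption.

    Assumption (not from the source): the process remains in the new
    state for the entire segment. Names and defaults are illustrative.
    """
    t = np.arange(1, t_end + 1)               # time index 1, ..., t_end
    states = np.ones(t_end, dtype=int)        # part 1: IC state 1 only
    states[(t > t_init) & (t <= t_second)] = 2  # part 2: OC state 2
    states[t > t_second] = 3                  # part 3: new OC state 3
    return t, states

t, states = simulate_state_sequence()
# Number of time points spent in each state under the assumption above.
counts = {int(s): int((states == s).sum()) for s in np.unique(states)}
print(counts)
```

Under these assumptions the initial segment contains only IC observations, which mirrors the caption's statement that the first $T_{\text{init}}$ points are all in state 1; the two OC states then appear one after the other, so state 3 is unseen until $t=351$.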