Predicting machine failures from multivariate time series: an industrial case study

Nicolò Oreste Pinciroli Vago; Francesca Forbicini; Piero Fraternali

TL;DR

This study compares non-neural ML and DL approaches for predicting industrial machine failures from multivariate time series, explicitly varying the reading window ($RW$) and the prediction window ($PW$) to assess the effect of the forecast horizon on performance, measured by macro $F_1$. It analyzes three diverse datasets (Wrapping machine, Blood refrigerator, Nitrogen generator) with distinct temporal patterns, using LR, RF, SVM, LSTM, ConvLSTM, and Transformer models. The results show that DL methods offer substantial gains for complex, diverse fault precursors (notably the wrapping machine), while simpler, repetitive fault patterns in the other datasets yield comparable performance between ML and DL; in all cases, very long horizons degrade predictive power. The work highlights the practical importance of selecting domain-appropriate $RW$ and $PW$, demonstrates how class imbalance is managed, and provides datasets and code to support reproducibility and extensions in predictive maintenance applications.
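Macro $F_1$ suits this setting because failures are rare: it averages the per-class $F_1$ scores with equal weight, so a model that always predicts "no failure" is penalized. The following is a minimal sketch of the standard definition for the binary case; the function name `macro_f1` and the convention of scoring an absent class as 0 are assumptions made for illustration, not details taken from the paper's code.

```python
from typing import Sequence


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of the per-class F1 scores for labels 0 and 1."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    scores = []
    for cls in (0, 1):
        # Count true positives, false positives and false negatives,
        # treating `cls` as the "positive" class in turn.
        tp = sum(1 for t, p in zip(y_true, y_pred) if p == cls and t == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if p == cls and t != cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if p != cls and t == cls)
        denom = 2 * tp + fp + fn
        # Assumption: a class with no true or predicted samples scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# A trivial "never fails" predictor on an imbalanced sample.
always_negative = macro_f1([0, 0, 0, 0, 1], [0, 0, 0, 0, 0])
```

In this example the accuracy is 0.8, yet the macro $F_1$ is much lower because the failure class contributes a score of 0.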

Abstract

Non-neural Machine Learning (ML) and Deep Learning (DL) models are often used to predict system failures in the context of industrial maintenance. However, only a few studies jointly assess the effect of varying the amount of past data used to make a prediction and how far into the future the forecast extends. This study evaluates the impact of the size of the reading window and of the prediction window on the performance of models trained to forecast failures in three data sets concerning the operation of (1) an industrial wrapping machine working in discrete sessions, (2) an industrial blood refrigerator working continuously, and (3) a nitrogen generator working continuously. The problem is formulated as a binary classification task that assigns the positive label to the prediction window based on the probability that a failure occurs in that interval. Six algorithms (logistic regression, random forest, support vector machine, LSTM, ConvLSTM, and Transformer) are compared using multivariate telemetry time series. The results indicate that, in the considered scenarios, the size of the prediction window plays a crucial role. They also highlight the effectiveness of DL approaches at classifying data with diverse time-dependent patterns preceding a failure, and the effectiveness of ML approaches at classifying similar and repetitive patterns preceding a failure.
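The reading-window and prediction-window formulation can be illustrated with a small sketch. It slides over a multivariate series, uses the last `rw` time steps as model input, and labels the sample positive when a failure falls within the next `pw` steps. The function name `make_windows`, the `stride` parameter, and the "at least one failure in the prediction window" labeling rule are assumptions made for illustration; the paper's own pipeline may differ.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[Sequence[float]],
    failures: Sequence[int],
    rw: int,
    pw: int,
    stride: int = 1,
) -> Tuple[List[List[List[float]]], List[int]]:
    """Build (reading window, label) pairs from a multivariate time series.

    series:   one row of sensor values per time step
    failures: 1 where a failure is recorded at that time step, else 0
    rw, pw:   reading and prediction window sizes, in time steps
    """
    if len(series) != len(failures):
        raise ValueError("series and failures must have the same length")
    if rw <= 0 or pw <= 0 or stride <= 0:
        raise ValueError("rw, pw and stride must be positive")

    inputs, labels = [], []
    # The last valid start leaves room for a full RW followed by a full PW.
    for start in range(0, len(series) - rw - pw + 1, stride):
        rw_end = start + rw
        inputs.append([list(row) for row in series[start:rw_end]])
        # Assumed rule: positive if any failure occurs inside the PW.
        labels.append(int(any(failures[rw_end:rw_end + pw])))
    return inputs, labels


# Toy example: 10 time steps, 2 variables, one failure at step 7.
toy_series = [[float(t), float(t) * 0.5] for t in range(10)]
toy_failures = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0]
X, y = make_windows(toy_series, toy_failures, rw=3, pw=2)
```

Increasing `pw` under this rule makes more samples positive, while each positive label says less about when the failure will occur. This is one reason the two window sizes are worth tuning together.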

Paper Structure (20 sections, 5 equations, 11 figures, 9 tables, 1 algorithm)

Introduction
Related work
Method
Data sets
Wrapping machine
Blood refrigerator
Nitrogen generator
Data Processing
Wrapping machine
Blood refrigerator
Nitrogen generator
Definition of the Reading and Prediction Windows
Class Unbalance
Algorithms and hyperparameter tuning
Training and evaluation
...and 5 more sections

Figures (11)

Figure 1: Distribution of the alert codes on a logarithmic scale. Alert 34 is the most frequent but must be discarded because it is unreliable. Alert 11 is the second most frequent. The remaining alerts have fewer than ten occurrences in the entire data set.
Figure 2: Examples of diverse patterns preceding faults in the wrapping machine data set. The patterns involve three variables and the faults are displayed as dashed vertical lines. The example on the left shows two faults observed on 15-06-2021. They are preceded by a slow decrease in the film tension, and after the second fault the slave motor has a speed of zero. On the right, the fault observed on 30-06-2021 is preceded by a sudden acceleration of the platform rotation speed, while the slave motor does not move and the film tension remains constant.
Figure 3: Distribution of the duration of work sessions.
Figure 4: Examples of patterns preceding a fault in a blood refrigerator, observed on 22-11-2022. Before the fault (shown by a dashed line), the instant power consumption increases and then decreases after a short time, the evaporator temperature decreases by $\approx 40\,^\circ\mathrm{C}$ after a sudden increase, and the product temperature increases.
Figure 5: A typical pattern in a nitrogen generator, observed on 01-08-2023. In this case, before a fault (indicated with a gray dashed line), the oxygen concentration drops, the CMS air pressure decreases by $\approx 8$ bar, and the nitrogen pressure decreases by $\approx 4$ bar, making the generation of nitrogen impossible.
...and 6 more figures
