Predictive Maintenance Study for High-Pressure Industrial Compressors: Hybrid Clustering Models

Alessandro Costa, Emilio Mastriani, Federico Incardona, Kevin Munari, Sebastiano Spinello

Abstract

This study introduces a predictive maintenance strategy for high-pressure industrial compressors that combines sensor data with features derived from unsupervised clustering and feeds both into classification models. The goal is to improve model accuracy and efficiency in detecting compressor failures. After data preprocessing, sensitive clustering parameters were tuned to identify the algorithms that best capture the dataset's temporal and operational characteristics. Clustering algorithms were evaluated using quality metrics such as Normalized Mutual Information (NMI) and Adjusted Rand Index (ARI), and those most effective at distinguishing between normal and non-normal conditions were selected. The resulting cluster features enriched regression models, improving failure detection accuracy by 4.87 percent on average. Training time fell by 22.96 percent, but the decrease was not statistically significant and varied across algorithms. Cross-validation and key performance metrics confirmed the benefits of clustering-based features in predictive maintenance models.

Paper Structure

This paper contains 13 sections, 9 figures, and 3 tables.

Figures (9)

  • Figure 1: Flowchart of the main process. The process consists of five tasks: preprocessing the dataset, determining optimal clustering parameters, performing and evaluating clustering, applying classification algorithms, and validating with metrics.
  • Figure 2: Correlation matrix. The heatmap illustrates the correlation between the features in the data frame. It was generated using the corr() function in pandas.
  • Figure 3: P-values for features. In the scatter plot, the features with a p-value less than 0.05 are labeled, with the threshold indicated by the green dotted line.
  • Figure 4: The distances from each point to its k nearest neighbors are averaged and plotted in ascending order. The optimal epsilon value corresponds to the point of maximum curvature on the graph.
  • Figure 5: Silhouette evaluation: A. Silhouette Score vs Epsilon (DBSCAN) shows how clustering quality varies with different epsilon values in a density-based algorithm. B. Silhouette Score vs Number of Clusters (KMeans) illustrates how clustering quality changes with varying k values in a distance-based algorithm.
  • ...and 4 more figures
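
The epsilon selection described in the Figure 4 caption can be sketched as follows. This is a minimal illustration, not the authors' code: it uses only the Python standard library, the function names `k_distance_curve` and `elbow_index` and the toy data are hypothetical, and "maximum curvature" is approximated by a common knee heuristic (the point farthest from the straight line joining the two ends of the curve), which is an assumption rather than something the paper states.

```python
import math


def k_distance_curve(points, k):
    """Mean distance from each point to its k nearest neighbours, sorted ascending."""
    curve = []
    for i, p in enumerate(points):
        # Distances to every other point, nearest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        curve.append(sum(dists[:k]) / k)
    return sorted(curve)


def elbow_index(curve):
    """Index with the largest vertical gap to the chord joining the curve's endpoints.

    Assumed stand-in for the "point of maximum curvature" on the k-distance plot.
    """
    last = len(curve) - 1
    slope = (curve[-1] - curve[0]) / last

    def gap(i):
        return abs(curve[0] + slope * i - curve[i])

    return max(range(len(curve)), key=gap)


# Toy data (hypothetical): a dense 5x5 grid of readings plus three isolated outliers.
points = [(float(x), float(y)) for x in range(5) for y in range(5)]
points += [(20.0, 20.0), (30.0, -10.0), (-15.0, 25.0)]

curve = k_distance_curve(points, k=4)
elbow = elbow_index(curve)
eps = curve[elbow]  # candidate epsilon for a density-based algorithm such as DBSCAN
print(f"elbow at index {elbow}, suggested eps = {eps:.3f}")
```

On this toy set the elbow falls on the last grid point before the outliers, so the suggested epsilon covers the dense region while leaving the isolated points unclustered. The brute-force neighbour search is quadratic in the number of points; for real sensor data a spatial index would be the usual choice.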