Lab-scale Vibration Analysis Dataset and Baseline Methods for Machinery Fault Diagnosis with Machine Learning

Bagus Tris Atmaja, Haris Ihsannur, Suyanto, Dhany Arifianto

TL;DR

This work addresses the need for accessible, labeled vibration data for machinery fault diagnosis by introducing the lab-scale VBL-VA001 dataset, which covers four machine conditions with 4000 samples in CSV form. It establishes a simple baseline by extracting nine frequency-domain features per axis (27 features in total) from FFT-transformed signals and evaluating SVM, KNN, and GNB classifiers. SVM with an RBF kernel performs best, reaching 99.75% weighted accuracy in 5-fold cross-validation and a perfect score on a 1-fold test, which supports the dataset's use for benchmarking ML-based fault detection. The dataset is openly available, enabling reproducible research and future work on robustness and generalization beyond the lab setting.
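The feature-extraction step summarized above can be sketched as follows. This is a minimal illustration, not the authors' code: the nine statistics chosen here (sum, mean, standard deviation, RMS, peak, crest factor, shape factor, skewness, kurtosis of the FFT magnitude spectrum) are assumptions, since this summary does not list the paper's nine features, and the function names `spectral_features` and `fuse_axes` are hypothetical.

```python
import numpy as np

def spectral_features(signal):
    """Return nine illustrative statistics of the one-sided FFT magnitude spectrum.

    The choice of statistics is an assumption made for this sketch.
    """
    signal = np.asarray(signal, dtype=float)
    signal = signal[~np.isnan(signal)]      # drop NaN samples before the FFT
    mag = np.abs(np.fft.rfft(signal))       # one-sided magnitude spectrum

    mean = mag.mean()
    std = mag.std()
    rms = np.sqrt(np.mean(mag ** 2))
    peak = mag.max()
    centered = mag - mean
    skewness = np.mean(centered ** 3) / std ** 3
    kurtosis = np.mean(centered ** 4) / std ** 4

    return np.array([
        mag.sum(),      # total spectral magnitude
        mean,
        std,
        rms,
        peak,
        peak / rms,     # crest factor
        rms / mean,     # shape factor
        skewness,
        kurtosis,
    ])

def fuse_axes(xyz):
    """Concatenate per-axis features of an (n_samples, 3) record into 27 values."""
    xyz = np.asarray(xyz, dtype=float)
    return np.concatenate([spectral_features(xyz[:, k]) for k in range(3)])

# Synthetic three-axis record standing in for one CSV file of the dataset.
rng = np.random.default_rng(0)
record = rng.normal(size=(1024, 3))
features = fuse_axes(record)
print(features.shape)
```

Computing the statistics per axis and concatenating them keeps the x, y, and z information separate in the fused vector, which matches the 9 features per axis, 27 in total, layout described above.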

Abstract

Monitoring machine conditions in a plant is crucial for manufacturing. A sudden machine failure can stop production and cause a loss of revenue. The vibration signal of a machine is a good indicator of its condition. This paper presents a dataset of vibration signals from a lab-scale machine. The dataset contains four types of machine conditions: normal, unbalance, misalignment, and bearing fault. Three machine learning methods (SVM, KNN, and GNB) were evaluated on the dataset, and one of them obtained a perfect result on a 1-fold test. The performance of the algorithms is evaluated using weighted accuracy (WA) since the data is balanced. The results show that the best-performing algorithm is the SVM, with a WA of 99.75% on 5-fold cross-validation. The dataset is provided as CSV files in an open and free repository at https://zenodo.org/record/7006575.
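The evaluation protocol described in the abstract can be illustrated with scikit-learn. This is a hedged sketch on synthetic data, not the paper's experiment: the feature matrix is random, the standardization step and the hyperparameters (for example `n_neighbors=5`) are assumptions, and weighted accuracy is taken here to mean overall accuracy, which on class-balanced folds coincides with the per-class average.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the 27-dimensional feature vectors of four conditions
# (0: normal, 1: unbalance, 2: misalignment, 3: bearing fault).
rng = np.random.default_rng(0)
n_per_class, n_features = 50, 27
y = np.repeat(np.arange(4), n_per_class)
X = rng.normal(size=(y.size, n_features)) + 3.0 * y[:, None]

# The three classifier families named in the abstract; settings are assumptions.
models = {
    "SVM (RBF)": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "GNB": GaussianNB(),
}

# Stratified 5-fold cross-validation keeps every fold class-balanced.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: mean 5-fold accuracy = {score:.4f}")
```

To run this on the real data, `X` and `y` would be replaced by features and labels extracted from the CSV files in the Zenodo repository; the scores printed for the synthetic data say nothing about the results reported in the paper.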

Paper Structure (18 sections, 1 equation, 10 figures, 5 tables)

Figures (10)

  • Figure 1: Five electrical machines with different fault conditions: (1) Misalignment, (2) Normal, (3) Unbalance 27 gram.cm, (4) Bearing fault, and (5) Unbalance 6 gram.cm.
  • Figure 2: Condition for unbalance and misalignment: (a) adding 18 grams of mass, (b) adding 4 grams of mass, (c) coupling shaft with an additional metal cylinder.
  • Figure 3: Sensor placement for vibration measurements
  • Figure 4: Flowchart of processing the vibration data with machine learning methods. The filtering step in pre-processing removes NaN (not-a-number) values; each feature in the feature extraction step has three values (x, y, z), so the fused feature vector is 27-dimensional (9 features $\times$ 3 axes).
  • Figure 5: Spectrum of vibration signal in each machine condition: (a) normal, (b) unbalance, (c) misalignment, (d) bearing fault
  • ...and 5 more figures