Finding One's Bearings in the Hyperparameter Landscape of a Wide-Kernel Convolutional Fault Detector

Dan Hudson, Jurgen van den Hoogen, Martin Atzmueller

TL;DR

This study investigates how architectural and training hyperparameters shape bearing fault detection performance across diverse neural architectures and datasets. By combining grid searches, data manipulations (resampling and filtering), and cross-architecture comparisons (LSTM, Transformer, wide-kernel CNN), the authors show that hyperparameters strongly influence accuracy and that optimal settings vary with data properties. They introduce the concept of multiple defaults to efficiently adapt to new data and demonstrate that high-frequency content alone does not explain the superiority of wide kernels. The findings yield practical tuning guidance for deploying fault detectors in real-world, data-shifting scenarios and highlight the need for dataset-aware hyperparameter strategies in time-series fault detection.

Abstract

State-of-the-art algorithms are reported to be almost perfect at distinguishing the vibrations arising from healthy and damaged machine bearings, according to benchmark datasets at least. However, what about their application to new data? In this paper, we confirm that neural networks for bearing fault detection can be crippled by incorrect hyperparameterisation, and also that the correct hyperparameter settings can change when transitioning to new data. The paper combines multiple methods to explain the behaviour of the hyperparameters of a wide-kernel convolutional neural network and how to set them. Since guidance already exists for generic hyperparameters like minibatch size, we focus on how to set architecture-specific hyperparameters such as the width of the convolutional kernels, a topic which might otherwise be obscure. We reflect different data properties by fusing information from seven different benchmark datasets, and our results show that the kernel size in the first layer in particular is sensitive to changes in the data. Looking deeper, we use manipulated copies of one dataset in an attempt to spot why the kernel size sometimes needs to change. The relevance of sampling rate is studied by using different levels of resampling, and spectral content is studied by increasingly filtering out high frequencies. We find that, contrary to speculation in earlier work, high-frequency noise is not the main reason why a wide kernel is preferable to a narrow kernel. Finally, we conclude by stating clear guidance on how to set the hyperparameters of our neural network architecture to work effectively on new data.
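The two data manipulations mentioned in the abstract, resampling to lower sampling rates and progressively filtering out high frequencies, can be illustrated with a minimal sketch. This is not the authors' pipeline: the brick-wall FFT filter, the integer downsampling factor, the function names `lowpass_fft` and `downsample`, and the synthetic two-tone signal are all assumptions chosen for illustration. Only the 48 kHz starting rate comes from the figure captions.

```python
import numpy as np

def lowpass_fft(signal, fs, cutoff_hz):
    """Brick-wall low-pass (an assumed filter): zero every bin above cutoff_hz."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    spectrum[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(signal))

def downsample(signal, fs, factor):
    """Reduce the sampling rate by an integer factor.

    The signal is low-pass filtered at the new Nyquist frequency first,
    so that content above it does not alias into the result.
    """
    new_fs = fs / factor
    filtered = lowpass_fft(signal, fs, cutoff_hz=new_fs / 2)
    return filtered[::factor], new_fs

# Synthetic stand-in for a vibration recording: one second at 48 kHz,
# with a low-frequency tone (100 Hz) and a high-frequency tone (10 kHz).
fs = 48_000
t = np.arange(fs) / fs
signal = np.sin(2 * np.pi * 100 * t) + 0.5 * np.sin(2 * np.pi * 10_000 * t)

# Spectral manipulation: same sampling rate, high frequencies removed.
filtered = lowpass_fft(signal, fs, cutoff_hz=5_000)

# Sampling-rate manipulation: 48 kHz -> 12 kHz.
resampled, new_fs = downsample(signal, fs, factor=4)
```

Keeping the two manipulations separate matters for the question the abstract raises: filtering removes high-frequency content while leaving the number of samples per second unchanged, whereas resampling changes how many samples a kernel of fixed width spans in time.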

Paper Structure

This paper contains 52 sections, 2 equations, 10 figures, 8 tables, 1 algorithm.

Figures (10)

  • Figure 1: Workflow of the method to test out multiple architectures on varying data.
  • Figure 2: The likelihood that tuning one hyperparameter (source of an arrow) will cause another (destination of an arrow) to need re-tuning.
  • Figure 3: Expected performance relative to other hyperparameter settings, after trying multiple defaults. Calculated by taking the best-performing CNN tried so far per benchmark, then averaging the quantile of how well those CNNs did.
  • Figure 4: Correlation between the same CNN configurations in terms of performance when using different resampling conditions, starting from 48 kHz.
  • Figure 5: Accuracy scores for different kernel sizes in the first convolutional layer, as tested on different resampling conditions for data originally sampled at 48 kHz.
  • ...and 5 more figures
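
The calculation described in the caption of Figure 3 (take the best-performing CNN tried so far per benchmark, then average the quantile of how well those CNNs did) can be sketched as follows. This is one reading of the caption, not the authors' code: the quantile definition (fraction of configurations scoring less than or equal to the best one tried), the function names, and the accuracy values are hypothetical.

```python
def quantile_of(score, all_scores):
    """Fraction of all scores that are less than or equal to `score`."""
    all_scores = list(all_scores)
    return sum(s <= score for s in all_scores) / len(all_scores)

def expected_quantile_curve(accuracies, defaults):
    """Average quantile reached after trying the first k defaults, for each k.

    accuracies: {benchmark: {config: accuracy}} covering every config tested.
    defaults:   ordered list of configs to try on a new dataset.
    """
    curve = []
    for k in range(1, len(defaults) + 1):
        tried = defaults[:k]
        per_benchmark = []
        for scores in accuracies.values():
            # Best-performing configuration tried so far on this benchmark.
            best = max(scores[c] for c in tried)
            # Where that result sits among all configurations on this benchmark.
            per_benchmark.append(quantile_of(best, scores.values()))
        curve.append(sum(per_benchmark) / len(per_benchmark))
    return curve

# Hypothetical accuracies for four configurations on two benchmarks.
accuracies = {
    "benchmark_a": {"cfg1": 0.90, "cfg2": 0.60, "cfg3": 0.75, "cfg4": 0.50},
    "benchmark_b": {"cfg1": 0.55, "cfg2": 0.95, "cfg3": 0.70, "cfg4": 0.65},
}
curve = expected_quantile_curve(accuracies, defaults=["cfg1", "cfg2"])
print(curve)
```

In this toy data, a single default that suits one benchmark does poorly on the other, while trying a second default lifts the average quantile, which is the behaviour the multiple-defaults idea relies on.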