Multi-scale Quaternion CNN and BiGRU with Cross Self-attention Feature Fusion for Fault Diagnosis of Bearing

Huanbai Liu, Fanlong Zhang, Yin Tan, Lian Huang, Yan Li, Guoheng Huang, Shenghong Luo, An Zeng

TL;DR

This work targets robust bearing fault diagnosis under noise and domain shifts by introducing MQCCAF, a lightweight end-to-end model that integrates a multi-scale quaternion CNN (MQCNN), a cross self-attention feature fusion (CSAFF) module, and a BiGRU classifier. MQCNN extracts global and multi-scale quaternion features from raw vibration signals, while CSAFF selectively fuses these features across scales to reduce redundancy and emphasize discriminative regions. The BiGRU-based classifier captures temporal dependencies. The model reaches state-of-the-art accuracy on CWRU (up to 99.99%), MFPT (100%), and Ottawa (99.21%), with strong anti-noise performance and cross-domain transfer capabilities. The approach offers a practical, robust, and efficient solution for real-time fault diagnosis in noisy industrial environments with varying loads.

Abstract

In recent years, deep learning has led to significant advances in bearing fault diagnosis (FD). Most techniques aim to achieve greater accuracy. However, they are sensitive to noise and lack robustness, resulting in insufficient domain adaptation and anti-noise ability. A comparison of existing studies shows that giving equal attention to all features fails to differentiate their significance. In this work, we propose a novel FD model that integrates a multi-scale quaternion convolutional neural network (MQCNN), a bidirectional gated recurrent unit (BiGRU), and cross self-attention feature fusion (CSAFF). We introduce new designs in two modules, MQCNN and CSAFF. First, MQCNN applies quaternion convolution to a multi-scale architecture for the first time, aiming to extract the rich hidden features of the original signal at multiple scales. Then, the extracted multi-scale information is fed into CSAFF for feature fusion, where CSAFF incorporates a cross self-attention mechanism to enhance discriminative interaction representation within features. Finally, BiGRU captures temporal dependencies, and a softmax layer performs fault classification. To assess the efficacy of our approach, we experiment on three public datasets (CWRU, MFPT, and Ottawa) and compare it with other strong methods. The results confirm its state-of-the-art performance, with average accuracies of up to 99.99%, 100%, and 99.21% on the CWRU, MFPT, and Ottawa datasets, respectively. Moreover, we perform practical tests and ablation experiments to validate the efficacy and robustness of the proposed approach. Code is available at https://github.com/mubai011/MQCCAF.
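
The fusion step described in the abstract can be illustrated with generic scaled dot-product cross-attention between two feature branches. The sketch below is not the authors' CSAFF module: the function names (`cross_attention`, `fuse_two_scales`), the absence of learned projection matrices, the residual addition, and the concatenation along the time axis are all assumptions made for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, context_feats):
    """Scaled dot-product attention with queries from one branch and
    keys/values from another branch (no learned projections: an assumption).

    query_feats:   (T_q, d) features of one scale
    context_feats: (T_c, d) features of another scale
    """
    d = query_feats.shape[-1]
    scores = query_feats @ context_feats.T / np.sqrt(d)   # (T_q, T_c)
    weights = softmax(scores, axis=-1)                    # each row sums to 1
    return weights @ context_feats, weights               # (T_q, d), (T_q, T_c)

def fuse_two_scales(feat_a, feat_b):
    """Hypothetical fusion: each branch attends to the other, the attended
    features are added back residually, and both branches are concatenated."""
    a_from_b, _ = cross_attention(feat_a, feat_b)
    b_from_a, _ = cross_attention(feat_b, feat_a)
    return np.concatenate([feat_a + a_from_b, feat_b + b_from_a], axis=0)

# Toy features standing in for two convolutional scales (random, not real data).
rng = np.random.default_rng(0)
fine = rng.normal(size=(5, 8))     # 5 time steps, 8 channels
coarse = rng.normal(size=(3, 8))   # 3 time steps, 8 channels
fused = fuse_two_scales(fine, coarse)   # shape (8, 8)
```

The attention weights give each time step of one scale a non-uniform weighting over the other scale, which is one way to avoid treating all features as equally important. The fused sequence could then be passed to a recurrent classifier such as a BiGRU.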

Paper Structure (25 sections, 14 equations, 4 figures, 11 tables)

Figures (4)

  • Figure 1: Framework diagram of the proposed MQCCAF.
  • Figure 2: The feature extraction flow of the quaternion convolution module (QCNN) is shown in the yellow box; from bottom to top, it consists of the quaternion convolutional layer, quaternion normalization layer, activation layer, and max pooling layer. The implementation of the quaternion convolutional layer is shown in the orange box, where green, blue, and orange represent the quaternion components of the input signal, the filtering matrix, and the output of the convolutional layer, respectively. The differently colored lines represent the quaternion convolution of the different components of the filtering matrix with the input signal to obtain the output vector.
  • Figure 4: Confusion matrices of different methods on the CWRU dataset: (a) WDCNN, (b) MSCNN, (c) CNNs-LSTM, (d) DCA-BiGRU, (e) QCNN, (f) MQCCAF (ours).
  • Figure 5: Confusion matrices of different methods on the Ottawa dataset: (a) WDCNN, (b) MSCNN, (c) CNNs-LSTM, (d) DCA-BiGRU, (e) QCNN, (f) MQCCAF (ours).
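
The layer order in the Figure 2 caption (quaternion convolution, normalization, activation, max pooling) can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the quaternion convolution is written as a sum of Hamilton products between kernel taps and input samples, the per-component standardization is a simple stand-in for the quaternion normalization layer, ReLU is an assumed activation, and all function names are hypothetical. How the raw vibration signal is mapped to four quaternion components is not specified here, so the example uses random input.

```python
import numpy as np

def hamilton_product(p, q):
    """Hamilton product of quaternions stored as (..., 4) arrays in (r, i, j, k) order."""
    pr, pi, pj, pk = np.moveaxis(p, -1, 0)
    qr, qi, qj, qk = np.moveaxis(q, -1, 0)
    return np.stack([
        pr * qr - pi * qi - pj * qj - pk * qk,   # real part
        pr * qi + pi * qr + pj * qk - pk * qj,   # i component
        pr * qj - pi * qk + pj * qr + pk * qi,   # j component
        pr * qk + pi * qj - pj * qi + pk * qr,   # k component
    ], axis=-1)

def quaternion_conv1d(signal, kernel):
    """Valid-mode 1D quaternion correlation: out[t] = sum_m kernel[m] * signal[t + m],
    where * is the Hamilton product. signal: (N, 4), kernel: (K, 4)."""
    n, k = signal.shape[0], kernel.shape[0]
    out = np.zeros((n - k + 1, 4))
    for t in range(n - k + 1):
        out[t] = hamilton_product(kernel, signal[t:t + k]).sum(axis=0)
    return out

def normalize(x, eps=1e-5):
    """Per-component standardization (a stand-in for quaternion normalization)."""
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def max_pool1d(x, size):
    """Non-overlapping max pooling along time, applied to each component."""
    n = (x.shape[0] // size) * size
    return x[:n].reshape(-1, size, x.shape[1]).max(axis=1)

def qcnn_block(signal, kernel, pool=2):
    """Convolution -> normalization -> ReLU (assumed) -> max pooling."""
    x = quaternion_conv1d(signal, kernel)
    x = normalize(x)
    x = np.maximum(x, 0.0)
    return max_pool1d(x, pool)

rng = np.random.default_rng(0)
signal = rng.normal(size=(16, 4))   # 16 quaternion-valued samples (random stand-in)
kernel = rng.normal(size=(3, 4))    # 3 quaternion-valued filter taps
features = qcnn_block(signal, kernel, pool=2)   # shape (7, 4)
```

Because the Hamilton product mixes all four components of the filter with all four components of the input, each output component depends on every input component, which matches the cross-component connections the caption describes.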