FTT-GRU: A Hybrid Fast Temporal Transformer with GRU for Remaining Useful Life Prediction
Varun Teja Chirukiri, Udaya Bhasker Cheerala, Sandeep Kanta, Abdul Karim, Praveen Damacharla
TL;DR
Problem: Predict remaining useful life (RUL) from multivariate sensor data. Approach: a hybrid model, FTT-GRU, that combines a Fast Temporal Transformer (FTT) with a gated recurrent unit (GRU) to model global and local temporal dynamics. Key results: on CMAPSS FD001, RMSE = 30.76, MAE = 18.97, and $R^2 = 0.453$, with 1.12 ms CPU latency at batch size 1; relative to the TCN–Attention baseline, RMSE improves by 1.16% and MAE by 4.00%. Significance: the model supports real-time, edge-friendly prognostics, offers interpretability options, and could generalize more broadly through domain adaptation.
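For readers less familiar with the three reported metrics, the following is a minimal sketch of their standard definitions. The function names and the toy RUL values are hypothetical and chosen only for illustration; they are not CMAPSS data and do not reproduce the reported numbers.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more heavily."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the prediction error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy RUL values in cycles, invented for illustration (not CMAPSS data).
y_true = [100.0, 80.0, 60.0, 40.0]
y_pred = [90.0, 85.0, 50.0, 45.0]

print(rmse(y_true, y_pred), mae(y_true, y_pred), r2(y_true, y_pred))
```

RMSE and MAE are in the same units as the target (engine cycles), so lower is better; $R^2$ is unitless, and a value of 1 would indicate a perfect fit.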
Abstract
Accurate prediction of the remaining useful life (RUL) of industrial machinery is essential for reducing downtime and optimizing maintenance schedules. Existing approaches, such as long short-term memory (LSTM) networks and convolutional neural networks (CNNs), often struggle to model both global temporal dependencies and fine-grained degradation trends in multivariate sensor data. We propose a hybrid model, FTT-GRU, which combines a Fast Temporal Transformer (FTT), a lightweight Transformer variant that uses linearized attention via the fast Fourier transform (FFT), with a gated recurrent unit (GRU) layer for sequential modeling. To the best of our knowledge, this is the first application of an FTT with a GRU to RUL prediction on NASA CMAPSS, capturing global and local degradation patterns simultaneously in a compact architecture. On CMAPSS FD001, FTT-GRU attains an RMSE of 30.76, an MAE of 18.97, and $R^2=0.45$, with 1.12 ms CPU latency at batch size 1. Relative to the best published deep baseline (TCN--Attention), it reduces RMSE by 1.16\% and MAE by 4.00\%. Training curves averaged over $k=3$ runs show smooth convergence with narrow 95\% confidence bands, and ablations (GRU-only, FTT-only) support the contribution of both components. These results indicate that a compact Transformer-RNN hybrid can deliver accurate and efficient RUL predictions on CMAPSS, making it a candidate for real-time industrial prognostics.
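To make the hybrid idea concrete, the sketch below pairs an FFT-based token-mixing block with a GRU in PyTorch. It is an illustration under stated assumptions, not the FTT-GRU implementation: the abstract does not specify the FTT layer, so an FNet-style Fourier mixing block is swapped in as a stand-in, and the class names, layer sizes, feature count (14), and window length (30) are all hypothetical.

```python
import torch
import torch.nn as nn


class FourierMixingBlock(nn.Module):
    """FNet-style block (an assumed stand-in for the FTT layer):
    FFT token mixing followed by a position-wise feed-forward layer."""

    def __init__(self, d_model: int, d_ff: int):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # 2-D FFT over the time and feature axes; keeping the real part mixes
        # information across all time steps without learned attention weights.
        mixed = torch.fft.fft2(x, dim=(-2, -1)).real
        x = self.norm1(x + mixed)            # residual connection + norm
        return self.norm2(x + self.ff(x))    # feed-forward + residual + norm


class FourierGRURegressor(nn.Module):
    """Global mixing (Fourier block) followed by local sequential modeling (GRU)."""

    def __init__(self, n_features: int = 14, d_model: int = 32,
                 d_ff: int = 64, gru_hidden: int = 32):
        super().__init__()
        self.embed = nn.Linear(n_features, d_model)
        self.mixer = FourierMixingBlock(d_model, d_ff)
        self.gru = nn.GRU(d_model, gru_hidden, batch_first=True)
        self.head = nn.Linear(gru_hidden, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.mixer(self.embed(x))           # (batch, time, d_model)
        _, last = self.gru(h)                   # last: (1, batch, gru_hidden)
        return self.head(last[-1]).squeeze(-1)  # one RUL estimate per window


# Hypothetical input: one window of 30 time steps with 14 sensor channels.
model = FourierGRURegressor()
window = torch.randn(1, 30, 14)
with torch.no_grad():
    rul_hat = model(window)
print(rul_hat.shape)
```

In this arrangement the Fourier block sees the whole window at once, while the GRU's final hidden state summarizes the sequence in order. This mirrors the global/local split described above, although the actual FTT layer may differ.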
