On the Tunability of Random Survival Forests Model for Predictive Maintenance
Yigitcan Yardımcı, Mustafa Cavus
TL;DR
This work addresses the tunability of Random Survival Forests (RSF) for time-to-failure prediction in predictive maintenance. It introduces a three-level framework to quantify tunability, covering model-level gains from tuning, hyperparameter-level contributions, and optimal tuning ranges, with performance evaluated via the C-index and the Brier score across four CMAPSS subsets. The study finds an average C-index increase of 0.0547 and an average Brier score reduction of 0.0199. ntree and mtry show the highest average tunability, nodesize gives stable improvements in the range of 10 to 30, and splitrule shows negative average tunability, so a poorly chosen setting can reduce performance. The results provide actionable guidance for prioritizing hyperparameter tuning in RSF for real-world maintenance applications.
Abstract
This paper investigates the tunability of the Random Survival Forest (RSF) model in predictive maintenance, where accurate time-to-failure estimation is crucial. RSF is widely used because of its flexibility and its ability to handle censored data, but its performance is sensitive to hyperparameter configuration, and systematic evaluations of RSF tunability remain limited, especially in predictive maintenance contexts. We introduce a three-level framework to quantify tunability: (1) a model-level metric measuring the overall performance gain from tuning, (2) a hyperparameter-level metric assessing the contribution of each hyperparameter, and (3) identification of optimal tuning ranges. These metrics are evaluated across multiple datasets using survival-specific criteria: the C-index for discrimination and the Brier score for calibration. Experiments on four subsets of the CMAPSS dataset, which simulates aircraft engine degradation, show that hyperparameter tuning improves model performance on every subset. On average, the C-index increased by 0.0547 and the Brier score decreased by 0.0199. Moreover, ntree and mtry showed the highest average tunability, while nodesize offered stable improvements within the range of 10 to 30. In contrast, splitrule showed negative tunability on average, indicating that improper tuning may reduce model performance. Our findings emphasize the practical importance of hyperparameter tuning in survival models and provide actionable insights for optimizing RSF in real-world predictive maintenance applications.
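The abstract does not spell out how the two tunability metrics are computed, so the following is only a minimal sketch under stated assumptions. It assumes model-level tunability is the gain of the best evaluated configuration over the default, and that hyperparameter-level tunability is the gain of the best configuration that changes only one hyperparameter away from its default. Under that second assumption a negative value is possible when every alternative is worse than the default. The default values, candidate configurations, and scores below are hypothetical placeholders, not the paper's settings or results.

```python
from typing import Dict, List, Tuple, Union

Config = Dict[str, Union[int, str]]
Result = Tuple[Config, float]

# Hypothetical defaults and scores, for illustration only.
DEFAULT: Config = {"ntree": 500, "mtry": 3, "nodesize": 15, "splitrule": "logrank"}
DEFAULT_CINDEX = 0.70

RESULTS: List[Result] = [
    ({**DEFAULT, "ntree": 1000}, 0.73),
    ({**DEFAULT, "mtry": 5}, 0.72),
    ({**DEFAULT, "nodesize": 20}, 0.71),
    ({**DEFAULT, "splitrule": "logrankscore"}, 0.68),
    ({"ntree": 1000, "mtry": 5, "nodesize": 20, "splitrule": "logrank"}, 0.75),
]


def _gain(scores: List[float], default_score: float, higher_is_better: bool) -> float:
    """Signed improvement of the best score over the default score."""
    if higher_is_better:  # e.g. C-index
        return max(scores) - default_score
    return default_score - min(scores)  # e.g. Brier score


def model_tunability(results: List[Result], default_score: float,
                     higher_is_better: bool = True) -> float:
    """Assumed model-level metric: best configuration overall vs. default."""
    return _gain([score for _, score in results], default_score, higher_is_better)


def hyperparameter_tunability(results: List[Result], default: Config,
                              default_score: float, name: str,
                              higher_is_better: bool = True) -> float:
    """Assumed hyperparameter-level metric: vary only `name`, keep the rest at default."""
    scores = [
        score for cfg, score in results
        if cfg[name] != default[name]
        and all(cfg[k] == default[k] for k in default if k != name)
    ]
    if not scores:
        raise ValueError(f"no configuration varies only {name!r}")
    return _gain(scores, default_score, higher_is_better)


if __name__ == "__main__":
    print("model:", round(model_tunability(RESULTS, DEFAULT_CINDEX), 4))
    for hp in DEFAULT:
        value = hyperparameter_tunability(RESULTS, DEFAULT, DEFAULT_CINDEX, hp)
        print(hp, round(value, 4))
```

For the Brier score, where lower is better, the same functions would be called with `higher_is_better=False` so that a positive value still means an improvement. Averaging these per-dataset values over the four subsets would give average tunability figures of the kind reported above, again assuming this formulation matches the paper's.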
