Enhancing Feature Selection and Interpretability in AI Regression Tasks Through Feature Attribution
Alexander Hinterleitner, Thomas Bartz-Beielstein, Richard Schulz, Sebastian Spengler, Thomas Winter, Christoph Leitenmeier
TL;DR
This study formalizes a regression-specific explainable AI workflow that uses Integrated Gradients to assign feature importance and a subsequent $k$-means clustering step to curate a compact, informative input subset. By validating on blade vibration data from turbomachinery, the authors show that removing low-importance features improves prediction stability and, in some cases, accuracy, outperforming classical feature selection baselines. The combination of attribution-driven pruning with surrogate-based hyperparameter tuning yields a robust regression model that retains physical interpretability, highlighting which process and operating conditions most influence vibration amplitudes. The work suggests practical benefits for turbine design and maintenance by enabling data-driven feature selection that respects model explanations while delivering better predictive performance.
Abstract
Research in Explainable Artificial Intelligence (XAI) is increasing, aiming to make deep learning models more transparent. Most XAI methods focus on justifying the decisions made by Artificial Intelligence (AI) systems in security-relevant applications. However, relatively little attention has been given to using these methods to improve the performance and robustness of deep learning algorithms. Additionally, much of the existing XAI work primarily addresses classification problems. In this study, we investigate the potential of feature attribution methods to filter out uninformative features in input data for regression problems, thereby improving the accuracy and stability of predictions. We introduce a feature selection pipeline that combines Integrated Gradients with k-means clustering to select an optimal set of variables from the initial data space. To validate the effectiveness of this approach, we apply it to a real-world industrial problem: blade vibration analysis in the development process of turbomachinery.
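To make the idea of attribution-driven feature selection concrete, the following is a minimal sketch and not the authors' implementation. It uses the standard Integrated Gradients definition, $\mathrm{IG}_i(x) = (x_i - x'_i)\int_0^1 \frac{\partial F(x' + \alpha (x - x'))}{\partial x_i}\, d\alpha$, approximated with a Riemann sum. Everything else is an assumption made for illustration: the toy `tanh` regressor standing in for a trained network, the all-zero baseline, the hypothetical weights `W` and samples `X`, and the choice to cluster per-feature mean absolute attributions into two groups and keep the higher one. The abstract does not specify how the clustering step is configured.

```python
import math


def model(x, w):
    """Toy differentiable regressor (assumption: stand-in for a trained network)."""
    s = sum(wi * xi for wi, xi in zip(w, x))
    return math.tanh(s)


def model_grad(x, w):
    """Analytic gradient of the toy model with respect to its inputs."""
    s = sum(wi * xi for wi, xi in zip(w, x))
    slope = 1.0 - math.tanh(s) ** 2
    return [wi * slope for wi in w]


def integrated_gradients(x, baseline, w, steps=200):
    """Midpoint Riemann-sum approximation of Integrated Gradients."""
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        # Point on the straight path from the baseline to the input.
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(model_grad(point, w)):
            avg_grad[i] += g / steps
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]


def two_means_1d(values, iters=50):
    """1-D k-means with k=2, initialised at the minimum and maximum value.

    Returns one label per value: 0 for the low cluster, 1 for the high cluster.
    """
    lo, hi = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        low = [v for v, lab in zip(values, labels) if lab == 0]
        high = [v for v, lab in zip(values, labels) if lab == 1]
        new_lo = sum(low) / len(low) if low else lo
        new_hi = sum(high) / len(high) if high else hi
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return labels


# Hypothetical setup: two informative features, one nearly irrelevant, one unused.
W = [1.2, -1.0, 0.01, 0.0]
X = [
    [1.0, 0.5, 0.8, -0.3],
    [-0.5, 1.0, 0.2, 0.9],
    [0.7, -0.8, -1.0, 0.4],
    [0.2, 0.3, 0.5, -0.6],
]
BASELINE = [0.0] * len(W)  # assumption: all-zero reference input

# Step 1: attribute each prediction to its input features.
attributions = [integrated_gradients(x, BASELINE, W) for x in X]

# Step 2: aggregate to one importance score per feature (mean absolute attribution).
importance = [
    sum(abs(a[i]) for a in attributions) / len(X) for i in range(len(W))
]

# Step 3: cluster the scores and keep the features in the high-importance cluster.
labels = two_means_1d(importance)
selected = [i for i, lab in enumerate(labels) if lab == 1]
print("importance per feature:", importance)
print("selected feature indices:", selected)
```

A useful sanity check for any Integrated Gradients implementation is the completeness property: the attributions for one input should sum to the difference between the model output at that input and at the baseline. In practice one would typically replace the toy model with a trained network and rely on an automatic-differentiation framework for the gradients.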
