Enhancing Feature Selection and Interpretability in AI Regression Tasks Through Feature Attribution

Alexander Hinterleitner, Thomas Bartz-Beielstein, Richard Schulz, Sebastian Spengler, Thomas Winter, Christoph Leitenmeier

TL;DR

This study formalizes a regression-specific explainable AI workflow that uses Integrated Gradients to assign feature importance and a subsequent $k$-means clustering step to curate a compact, informative input subset. By validating on blade vibration data from turbomachinery, the authors show that removing low-importance features improves prediction stability and, in some cases, accuracy, outperforming classical feature selection baselines. The combination of attribution-driven pruning with surrogate-based hyperparameter tuning yields a robust regression model that retains physical interpretability, highlighting which process and operating conditions most influence vibration amplitudes. The work suggests practical benefits for turbine design and maintenance by enabling data-driven feature selection that respects model explanations while delivering better predictive performance.

Abstract

Research in Explainable Artificial Intelligence (XAI) is growing, with the aim of making deep learning models more transparent. Most XAI methods focus on justifying the decisions made by Artificial Intelligence (AI) systems in security-relevant applications. However, relatively little attention has been given to using these methods to improve the performance and robustness of deep learning algorithms. Additionally, much of the existing XAI work addresses classification problems. In this study, we investigate the potential of feature attribution methods to filter out uninformative features in input data for regression problems, thereby improving the accuracy and stability of predictions. We introduce a feature selection pipeline that combines Integrated Gradients with $k$-means clustering to select an optimal set of variables from the initial data space. To validate the effectiveness of this approach, we apply it to a real-world industrial problem: blade vibration analysis in the development process of turbomachinery.
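
For orientation, the standard definition of Integrated Gradients (Sundararajan et al., 2017) is reproduced here. The notation is an assumption and may differ from the paper's own equations: $F$ denotes the trained regression model, $x$ the input, $x'$ a baseline input, and $i$ the feature index.

$$
\mathrm{IG}_i(x) = (x_i - x'_i) \int_{0}^{1} \frac{\partial F\bigl(x' + \alpha\,(x - x')\bigr)}{\partial x_i}\, d\alpha
$$

In practice the integral is usually approximated by a Riemann sum over a finite number of interpolation steps between $x'$ and $x$. The attributions satisfy the completeness property $\sum_i \mathrm{IG}_i(x) = F(x) - F(x')$, which makes them directly comparable across features.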

Paper Structure (27 sections, 2 equations, 7 figures)

Figures (7)

  • Figure 1: Illustrative visualization of the turbine wheel (1) and compressor wheel (2) in a turbocharging system. Critical vibrations occur at the blades mounted on the wheels. (MAN Energy Solutions, 2023)
  • Figure 2: Process for feature selection based on feature attribution and clustering. The process starts with tuning a neural network using all available features. The tuned network is then analyzed using Integrated Gradients to calculate attribution values. Next, these attribution values are clustered with $k$-means for varying values of $k$. Removing the least important cluster from each clustering yields a different new feature set per value of $k$. A neural network is subsequently tuned for each of these feature sets. The results are compared with each other and with the network based on the initial feature set.
  • Figure 3: Performance of the tuned neural network on the dummy data. The left graph compares actual and predicted values. The right graph shows the residuals of the predicted values. The graphs indicate that the network accurately captures the behavior of the dummy data.
  • Figure 4: Comparison between scaled IG feature attribution values and the coefficients of the linear model used to generate the dummy data. The attribution values are scaled between the minimum and maximum values of the coefficients so that both quantities share the same range. The coefficients and the attribution values follow a similar pattern.
  • Figure 5: $k$-means clustering of the Integrated Gradients-based feature attribution values. The attribution values were calculated for a tuned neural network across all 86 input features. This visualization illustrates the clustering for $k = 6$. The points on and below the red dotted line belong to the cluster with the lowest attribution values; these features are removed during the feature selection process.
  • ...and 2 more figures
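
The clustering step described in the captions of Figures 2 and 5 can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each feature has already been reduced to a single attribution score (for example, a mean absolute Integrated Gradients value), and it uses a plain one-dimensional Lloyd's algorithm with a deterministic initialisation instead of a library $k$-means. The function names `kmeans_1d` and `drop_lowest_cluster` and the feature names `f1` to `f7` are hypothetical.

```python
from statistics import mean


def kmeans_1d(values, k, n_iter=100):
    """Plain Lloyd's algorithm on scalar values.

    Returns (labels, centres), where labels[i] is the cluster index of values[i].
    """
    ordered = sorted(values)
    # Deterministic initialisation (an assumption of this sketch):
    # spread the k starting centres evenly over the sorted values.
    centres = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    labels = []
    for _ in range(n_iter):
        # Assignment step: each value joins its nearest centre.
        labels = [min(range(k), key=lambda j: abs(v - centres[j])) for v in values]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            # An empty cluster keeps its previous centre.
            new_centres.append(mean(members) if members else centres[j])
        if new_centres == centres:
            break
        centres = new_centres
    return labels, centres


def drop_lowest_cluster(attributions, k):
    """Keep every feature except those in the cluster with the lowest centre.

    `attributions` maps feature name -> one aggregated attribution score.
    """
    names = list(attributions)
    values = [attributions[name] for name in names]
    labels, centres = kmeans_1d(values, k)
    lowest = min(range(k), key=lambda j: centres[j])
    return [name for name, lab in zip(names, labels) if lab != lowest]


# Hypothetical attribution scores, for illustration only.
example_scores = {
    "f1": 0.90, "f2": 0.85, "f3": 0.40, "f4": 0.45,
    "f5": 0.02, "f6": 0.01, "f7": 0.03,
}

# One candidate feature set per k, mirroring the "varying k" idea in Figure 2.
candidate_sets = {k: drop_lowest_cluster(example_scores, k) for k in (2, 3)}
print(candidate_sets)
```

Each entry of `candidate_sets` is a reduced feature list on which a separate network could then be tuned and compared against the model trained on all features. Larger values of `k` generally produce a smaller lowest cluster, so fewer features are removed.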