Learning Active Subspaces and Discovering Important Features with Gaussian Radial Basis Functions Neural Networks

Danny D'Agostino, Ilija Ilievski, Christine Annette Shoemaker

TL;DR

The paper addresses the interpretability of nonlinear predictive models on tabular data by proposing a Gaussian Radial Basis Function Neural Network (GRBFNN) with a learnable precision matrix $\mathbf{P}$ in the Gaussian kernel. The weights $\mathbf{w}$ and the kernel parameters are learned jointly, with regularization controlled by the hyperparameters $\lambda_{\mathbf{w}}$ and $\lambda_{\mathbf{u}}$, so that an active subspace and a feature-importance ranking can be extracted from the spectrum of $\mathbf{P}$ after training. The authors provide theory and practical methods for obtaining the active subspace directions (eigenvectors of $\mathbf{P}$) and a Jacobian-based feature-importance measure, demonstrate visualization in the active subspace, and show through extensive experiments that GRBFNN achieves competitive predictive performance while offering meaningful interpretability. The work includes two center-selection schemes (unsupervised and supervised), a comprehensive benchmark against standard ML models and deep embedding methods, and an analysis of regularization and dimensionality-reduction behavior, with public PyTorch code available for replication. Overall, the GRBFNN framework advances interpretable nonlinear modeling for tabular data and offers tools for supervised dimensionality reduction and feature selection in real-world applications such as healthcare and engineering optimization.

Abstract

Providing a model that achieves strong predictive performance and is simultaneously interpretable by humans is one of the most difficult challenges in machine learning research, due to the conflicting nature of these two objectives. To address this challenge, we propose a modification of the radial basis function neural network model by equipping its Gaussian kernel with a learnable precision matrix. We show that precious information is contained in the spectrum of the precision matrix and can be extracted once the training of the model is completed. In particular, the eigenvectors explain the directions of maximum sensitivity of the model, revealing the active subspace and suggesting potential applications for supervised dimensionality reduction. At the same time, the eigenvectors highlight the relationship, in terms of absolute variation, between the input and the latent variables, thereby allowing us to extract a ranking of the input variables based on their importance to the prediction task and enhancing the model interpretability. We conducted numerical experiments for regression, classification, and feature selection tasks, comparing our model against popular machine learning models, state-of-the-art deep learning-based embedding feature selection techniques, and a transformer model for tabular data. Our results demonstrate that the proposed model not only yields an attractive prediction performance compared to the competitors but also provides meaningful and interpretable results that could potentially assist the decision-making process in real-world applications. A PyTorch implementation of the model is available on GitHub at https://github.com/dannyzx/Gaussian-RBFNN
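The idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' PyTorch implementation. It assumes the Gaussian kernel has the form $\exp(-\tfrac{1}{2}(\mathbf{x}-\mathbf{c}_m)^\top \mathbf{P} (\mathbf{x}-\mathbf{c}_m))$ and that $\mathbf{P}$ is parameterized as $\mathbf{U}^\top\mathbf{U}$ to keep it positive semidefinite. The function names `grbf_forward` and `spectrum_summary`, and the eigenvalue-weighted importance score, are hypothetical choices made for illustration and may differ from the measure defined in the paper.

```python
import numpy as np

def grbf_forward(X, centers, U, w):
    """Evaluate f(x) = sum_m w_m * exp(-0.5 (x - c_m)^T P (x - c_m)).

    Assumption: P = U^T U, which is symmetric positive semidefinite.
    """
    P = U.T @ U
    diff = X[:, None, :] - centers[None, :, :]          # shape (n, M, D)
    quad = np.einsum("nmd,de,nme->nm", diff, P, diff)   # quadratic forms, shape (n, M)
    return np.exp(-0.5 * quad) @ w                      # shape (n,)

def spectrum_summary(U):
    """Eigen-decompose P = U^T U and derive an illustrative feature score."""
    P = U.T @ U
    eigvals, eigvecs = np.linalg.eigh(P)                # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]                   # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Hypothetical score: absolute eigenvector entries weighted by eigenvalues.
    scores = np.abs(eigvecs) @ eigvals
    return eigvals, eigvecs, scores / scores.sum()

# Toy example: P = [[1, 1], [1, 1]] is sensitive only along (1, 1).
U = np.array([[1.0, 1.0], [0.0, 0.0]])
centers = np.array([[0.0, 0.0]])
w = np.array([1.0])

eigvals, eigvecs, importance = spectrum_summary(U)
X = np.array([[0.0, 0.0], [1.0, -1.0], [1.0, 1.0]])
y = grbf_forward(X, centers, U, w)
```

In this toy setting the dominant eigenvector lies along $(1, 1)/\sqrt{2}$, so moving an input along $(1, -1)$ leaves the output unchanged, while moving it along $(1, 1)$ changes it. That is the sense in which the leading eigenvectors of the precision matrix span an active subspace.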

Paper Structure (19 sections, 14 equations, 14 figures, 6 tables)

Figures (14)

  • Figure 1: The GRBFNN behavior is graphically represented in four subfigures. Subfigure (a) shows the classification problem, with purple and yellow dots representing the two classes. Subfigure (b) shows the fitted GRBFNN in the input space, while (c) shows the fitted GRBFNN model in the active subspace. Contour levels show estimated class probabilities. The red points represent the GRBFNN centers. The white arrow highlights the direction of the dominant eigenvector $\mathbf{v}_1$. Finally, subfigure (d) shows the feature importance estimated from the GRBFNN.
  • Figure 2: The GRBFNN behavior is graphically represented in four subfigures. Subfigure (a) shows the classification problem, with purple and yellow dots representing the two classes. Subfigure (b) shows the fitted GRBFNN in the input space, while (c) shows the fitted GRBFNN model in the active subspace. Contour levels show estimated class probabilities. The red points represent the GRBFNN centers. The white arrow highlights the direction of the dominant eigenvector $\mathbf{v}_1$. Finally, subfigure (d) shows the feature importance estimated from the GRBFNN and highlights its composition.
  • Figure 3: The GRBFNN behavior is depicted in four subfigures. Subfigure (a) shows the regression problem with target function $t(\mathbf{x}) = \sin(0.5x_1 + 0.5x_2)$, while (b) displays the fitted GRBFNN in the input space. The dominant eigenvector $\mathbf{v}_1$ is indicated by a white arrow, and the GRBFNN centers are shown as red points. Subfigure (d) shows the fitted GRBFNN model projected onto the one-dimensional active subspace. The function values at the input data and at the centers are represented by black and red points, respectively. Finally, subfigure (c) displays the feature importance estimated from the GRBFNN. Function values are normalized.
  • Figure 4: The GRBFNN behavior is depicted in four subfigures. Subfigure (a) shows the regression problem with target function $t(\mathbf{x}) = \sin(0.1x_1 + 0.9x_2)$, while (b) displays the fitted GRBFNN in the input space. The dominant eigenvector $\mathbf{v}_1$ is indicated by a white arrow, and the GRBFNN centers are shown as red points. Subfigure (d) shows the fitted GRBFNN model projected onto the one-dimensional active subspace. The function values at the input data and at the centers are represented by black and red points, respectively. Finally, subfigure (c) displays the feature importance estimated from the GRBFNN. Function values are normalized.
  • Figure 5: Graphical interpretation of the sensitivity analysis for the two regularizers $\lambda_{\mathbf{w}}$ and $\lambda_{\mathbf{u}}$ on the Breast Cancer dataset (binary classification) in (a) and on the Prostatic Cancer dataset (regression) in (b). The red frame highlights the best combination of hyperparameters. A lighter color indicates higher accuracy in (a) and higher RMSE in (b).
  • ...and 9 more figures
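
The regression targets in Figures 3 and 4 have the form $t(\mathbf{x}) = \sin(\mathbf{a}^\top \mathbf{x})$, so the direction a dominant eigenvector $\mathbf{v}_1$ should recover can be checked independently of the GRBFNN. The sketch below uses the classical gradient-based active subspace construction (the leading eigenvector of the averaged gradient outer product), not the GRBFNN. The sampling domain $[-1, 1]^2$, the sample size, and the helper name `gradient_active_direction` are assumptions made for illustration.

```python
import numpy as np

def gradient_active_direction(a, n_samples=2000, seed=0):
    """Leading eigenvector of C = mean(grad t grad t^T) for t(x) = sin(a . x)."""
    rng = np.random.default_rng(seed)
    # Assumption: inputs sampled uniformly from [-1, 1]^D.
    X = rng.uniform(-1.0, 1.0, size=(n_samples, a.size))
    grads = np.cos(X @ a)[:, None] * a[None, :]    # analytic gradient of sin(a . x)
    C = grads.T @ grads / n_samples                # averaged gradient outer product
    eigvals, eigvecs = np.linalg.eigh(C)           # ascending eigenvalues
    v1 = eigvecs[:, -1]                            # eigenvector of the largest eigenvalue
    v1 = v1 * np.sign(v1[np.argmax(np.abs(v1))])   # fix the sign for readability
    return v1, eigvals[::-1]

a = np.array([0.1, 0.9])                           # coefficients of the Figure 4 target
v1, eigvals = gradient_active_direction(a)
```

Because every gradient is a scalar multiple of $\mathbf{a}$, the matrix $C$ has rank one and `v1` equals $\mathbf{a} / \lVert \mathbf{a} \rVert$. For the Figure 4 target this direction is dominated by $x_2$, and for the Figure 3 target (coefficients 0.5 and 0.5) both inputs contribute equally.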