Interpretable Graph Neural Networks for Tabular Data

Amr Alkhatib, Sofiane Ennadir, Henrik Boström, Michalis Vazirgiannis

TL;DR

The paper addresses the need for trustworthy predictions on tabular data by introducing IGNNet, an interpretable Graph Neural Network that represents each data point as a graph over its features and constrains message passing to preserve transparency. IGNNet performs on par with state-of-the-art tabular learners, including XGBoost, Random Forests, and TabNet, while showing exactly how each prediction is computed from the original input features; these feature-level explanations align with the true Shapley values at no additional computational cost. Large-scale experiments on 35 datasets support the performance claim, and the explanation analysis shows that KernelSHAP's approximations move closer to IGNNet's importance scores as more data is sampled. The findings suggest IGNNet as a practical framework for transparent decision-making in domains demanding interpretability, with future directions including non-linear feature interactions and broader applicability beyond tabular data.

Abstract

Tabular data occurs frequently in real-world applications. Graph Neural Networks (GNNs) have recently been extended to handle such data effectively, allowing feature interactions to be captured through representation learning. However, these approaches essentially produce black-box models, in the form of deep neural networks, which prevents users from following the logic behind the model predictions. We propose an approach, called IGNNet (Interpretable Graph Neural Network for tabular data), which constrains the learning algorithm to produce an interpretable model that shows exactly how the predictions are computed from the original input features. A large-scale empirical investigation is presented, showing that IGNNet performs on par with state-of-the-art machine-learning algorithms that target tabular data, including XGBoost, Random Forests, and TabNet. At the same time, the results show that the explanations obtained from IGNNet are aligned with the true Shapley values of the features without incurring any additional computational overhead.

Paper Structure (24 sections, 3 equations, 12 figures, 4 tables, 1 algorithm)

Figures (12)

  • Figure 1: An overview of our proposed approach. Each data instance is represented as a graph by embedding the feature values into a higher-dimensional space, and the weight of the edge between two features (nodes) is their correlation value. Multiple iterations of message passing are then applied. Finally, each learned node representation is projected into a single value, and a whole-graph representation is obtained by concatenating the projected values.
  • Figure 2: IGNNet default architecture. It starts with a layer that projects the features into a higher-dimensional space, a linear transformation from one dimension to 64 dimensions. A ReLU activation function follows each message-passing layer and each green block as well. The feedforward network at the end has no activation functions between layers to ensure a linear transformation into a single value. A sigmoid activation function follows the feedforward network to obtain a final value between 0 and 1 for each feature.
  • Figure 3: Comparison of KernelSHAP's approximations and the importance scores obtained from IGNNet. We measure the similarity of KernelSHAP's approximations to the scores of IGNNet at each iteration of data sampling and evaluation of KernelSHAP. KernelSHAP exhibits improvement in approximating the scores derived from IGNNet with more data sampling.
  • Figure 4: Explanation of a single prediction on the Adult dataset.
  • Figure 5: The average rank of the compared classifiers on the 35 datasets with respect to the AUC (a lower rank is better), where the critical difference (CD) represents the largest difference that is not statistically significant.
  • ...and 7 more figures
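
The pipeline described in the captions of Figures 1 and 2 can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the weights are random rather than trained, the number of message-passing layers, the 16-unit intermediate width of the feedforward head, and the names `feature_graph_forward` and `init_params` are all assumptions made for the example, and the "green blocks" of Figure 2 are not modeled. The captions do not say how the concatenated vector is mapped to a prediction, so the sketch stops at the whole-graph representation.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(hidden_dim=64, n_layers=2, ff_dim=16, seed=0):
    """Random, untrained weights; layer count and ff_dim are assumptions."""
    rng = np.random.default_rng(seed)
    scale = 1.0 / np.sqrt(hidden_dim)
    return {
        # Linear map from 1 dimension to hidden_dim dimensions (Figure 2).
        "w_in": rng.normal(size=(1, hidden_dim)),
        "b_in": np.zeros(hidden_dim),
        # One weight matrix per message-passing layer.
        "w_mp": [rng.normal(scale=scale, size=(hidden_dim, hidden_dim))
                 for _ in range(n_layers)],
        # Feedforward head ending in a single output per node.
        "w_ff": [rng.normal(scale=scale, size=(hidden_dim, ff_dim)),
                 rng.normal(scale=0.25, size=(ff_dim, 1))],
    }


def feature_graph_forward(x, adj, params):
    """x: (n_features,) instance; adj: (n_features, n_features) correlations."""
    # Embed each scalar feature value as a hidden_dim-dimensional node vector.
    h = x[:, None] * params["w_in"] + params["b_in"]
    # Message passing: correlation-weighted aggregation, then ReLU.
    for w in params["w_mp"]:
        h = relu(adj @ h @ w)
    # Feedforward head with no activations between layers (a linear map).
    z = h
    for w in params["w_ff"]:
        z = z @ w
    # Sigmoid yields one value per feature; ravel concatenates them.
    return sigmoid(z).ravel()


# Synthetic data, used only to obtain a feature correlation matrix.
data = np.random.default_rng(1).normal(size=(200, 5))
adj = np.corrcoef(data, rowvar=False)
params = init_params()
graph_repr = feature_graph_forward(data[0], adj, params)
print(graph_repr.shape)
```

Because the head contains no activation between its layers, each entry of `graph_repr` depends on its node representation through a single linear map followed by a sigmoid, which is the property the Figure 2 caption emphasizes.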