Graph Neural Network contextual embedding for Deep Learning on Tabular Data

Mario Villaizán-Vallelado; Matteo Salvatori; Belén Carro Martinez; Antonio Javier Sanchez Esguevillas

TL;DR

This paper introduces INCE, a Graph Neural Network-based contextual embedding approach for tabular data, addressing heterogeneity and arbitrary feature ordering by modeling interactions among features as a fully-connected feature graph with a learned $cls$ token. Features are projected into a common latent space and refined through a stack of Interaction Network layers, yielding a contextual embedding used by a decoder for supervised tasks. Across five public tabular datasets, INCE outperforms DL benchmarks and remains competitive with boosted-tree methods, while providing interpretable insights into feature interactions via edge updates and Mahalanobis-based significance analysis. The work demonstrates that GNN-based contextual embeddings are a compelling, parameter-efficient alternative to Transformer-based methods for tabular data, with practical implications for AI systems requiring both accuracy and interpretability.

Abstract

All industries are trying to leverage Artificial Intelligence (AI) based on their existing big data, which is available in so-called tabular form, where each record is composed of a number of heterogeneous continuous and categorical columns, also known as features. Deep Learning (DL) has constituted a major breakthrough for AI in fields related to human skills, such as natural language processing, but its applicability to tabular data has been more challenging. More classical Machine Learning (ML) models, such as tree-based ensembles, usually perform better. This paper presents a novel DL model that uses a Graph Neural Network (GNN), more specifically an Interaction Network (IN), for contextual embedding and for modelling interactions among tabular features. Its results outperform those of the DL benchmark in a recently published survey, based on five public datasets, while also being competitive with boosted-tree solutions.

Paper Structure (12 sections, 13 equations, 11 figures, 4 tables, 1 algorithm)

Introduction
Related Work
INCE
Experiments
Results
Deep Dive in IN
IN vs. Transformer
Interpretability of contextual embedding
Columnar vs. Contextual embedding
Feature importance from Feature-Feature interaction
Conclusions
Normalized Metric

Figures (11)

Figure 1: The encoder-decoder perspective (Hamilton, 2020): an encoder model maps each tabular dataset feature into a latent vector, and a decoder model uses the embeddings to solve the supervised learning task. In the encoding step, a columnar embedding first projects each feature individually into a common latent space, and a contextual embedding then improves these representations by taking into account the relationships among features. The decoder MLP transforms the contextual embedding output into the final model prediction.
Figure 2: The columnar embedding is responsible for projecting all the heterogeneous features in the tabular dataset into a common latent space. For each feature, a continuous or categorical transformation is defined. The columnar embedding ignores any potential relationship or similarity between the tabular dataset features.
Figure 3: Contextual embedding. (a) Homogeneous and fully-connected graph: it contains a node for each initial tabular feature and a bidirectional edge for each pair of nodes. The initial node representation is obtained by the columnar embedding. A virtual cls node is introduced to characterize the global graph state. (b) A stack of IN layers (Battaglia et al., 2016) models node interactions to create a more accurate representation of the nodes (i.e. the tabular features). (c) The final representation of the cls virtual node is used as the contextual embedding.
Figure 4: IN layer
Figure 5: The strip plots in blue and orange illustrate the distributions of the tree-based and DL baselines, respectively. The horizontal dotted line represents the INCE performance. Accuracy and MSE are the metrics used for classification and regression tasks, respectively. An up or down arrow next to the dataset name indicates whether the metric must be maximized or minimized.
...and 6 more figures
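
The pipeline described in the captions of Figures 1 to 3 can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' implementation: all function and variable names are hypothetical, the weights are random and untrained, the initial edge features are assumed to be zeros, incoming messages are assumed to be aggregated by summation, and any normalization or residual connections the actual model may use are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # size of the common latent space (illustrative value)

def mlp(sizes):
    """Random (untrained) MLP parameters: a list of (W, b) pairs."""
    return [(rng.normal(0.0, 0.1, (i, o)), np.zeros(o))
            for i, o in zip(sizes[:-1], sizes[1:])]

def apply_mlp(params, x):
    for k, (W, b) in enumerate(params):
        x = x @ W + b
        if k < len(params) - 1:
            x = np.maximum(x, 0.0)  # ReLU on hidden layers only
    return x

def columnar_embedding(x_cont, x_cat, cont_params, cat_tables):
    """Project each feature on its own into a d-dimensional vector."""
    cont = [x * w + b for x, (w, b) in zip(x_cont, cont_params)]  # per-feature linear map
    cat = [table[i] for i, table in zip(x_cat, cat_tables)]       # per-feature lookup
    return np.stack(cont + cat)                                   # (n_features, d)

def fully_connected_edges(n_nodes):
    """One directed edge per ordered pair of distinct nodes."""
    pairs = [(s, r) for s in range(n_nodes) for r in range(n_nodes) if s != r]
    send, recv = zip(*pairs)
    return np.array(send), np.array(recv)

def interaction_layer(nodes, edges, send, recv, edge_mlp, node_mlp):
    # 1) Each edge is updated from its sender, its receiver and its own state.
    edge_in = np.concatenate([nodes[send], nodes[recv], edges], axis=1)
    new_edges = apply_mlp(edge_mlp, edge_in)
    # 2) Each node sums its incoming messages (assumed aggregation).
    agg = np.zeros_like(nodes)
    np.add.at(agg, recv, new_edges)
    # 3) Each node is updated from its own state and the aggregated messages.
    new_nodes = apply_mlp(node_mlp, np.concatenate([nodes, agg], axis=1))
    return new_nodes, new_edges

# Hypothetical parameters: 2 continuous and 2 categorical features, 2 layers.
cont_params = [(rng.normal(size=d), rng.normal(size=d)) for _ in range(2)]
cat_tables = [rng.normal(size=(k, d)) for k in (3, 4)]
cls_vector = rng.normal(size=d)  # stands in for the learned cls node
layers = [(mlp([3 * d, d, d]), mlp([2 * d, d, d])) for _ in range(2)]
decoder = mlp([d, d, 1])

def contextual_embedding(feats):
    nodes = np.vstack([feats, cls_vector])      # cls node is appended last
    send, recv = fully_connected_edges(len(nodes))
    edges = np.zeros((len(send), d))            # assumed initial edge state
    for edge_mlp, node_mlp in layers:
        nodes, edges = interaction_layer(nodes, edges, send, recv,
                                         edge_mlp, node_mlp)
    return nodes[-1]                            # final cls representation

feats = columnar_embedding([0.5, -1.2], [2, 0], cont_params, cat_tables)
context = contextual_embedding(feats)
prediction = apply_mlp(decoder, context)        # decoder MLP output
```

Because the graph is fully connected and messages are summed, the cls output in this sketch does not depend on the order in which the feature nodes are listed, which is one way to read the claim that the approach handles arbitrary feature ordering.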
