Table of Contents
Fetching ...

Bringing Graphs to the Table: Zero-shot Node Classification via Tabular Foundation Models

Adrian Hayler, Xingyue Huang, İsmail İlkan Ceylan, Michael Bronstein, Ben Finkelshtein

TL;DR

This work tackles the limited cross-graph generalization of graph foundation models by reframing node classification as a tabular learning problem and introducing TAG. TAG converts graphs to tables using node-level feature and structure encoders, applies multiple tabular foundation models to subsampled tables, and aggregates predictions via ensemble selection, achieving a $\approx 7\%$ average improvement over state-of-the-art GFMs and task-specific GNNs across $28$ real-world datasets. It demonstrates zero-shot transfer with TabPFN, analyzes the importance of encoder depth and subsampling, and shows that light fine-tuning of TabPFN can yield small gains; the approach scales to large feature spaces and arbitrary class counts due to table subsampling and ECOC. Overall, TAG reveals a promising, scalable pathway for generalizable graph learning that leverages tabular priors without graph-specific pretraining, with potential extensions to other graph tasks and larger, more diverse datasets.

Abstract

Graph foundation models (GFMs) have recently emerged as a promising paradigm for achieving broad generalization across various graph data. However, existing GFMs are often trained on datasets that may not fully reflect real-world graphs, limiting their generalization performance. In contrast, tabular foundation models (TFMs) not only excel at classical tabular prediction tasks but have also shown strong applicability in other domains such as time series forecasting, natural language processing, and computer vision. Motivated by this, we take an alternative view to the standard perspective of GFMs and reformulate node classification as a tabular problem. In this reformulation, each node is represented as a row with feature, structure, and label information as columns, enabling TFMs to directly perform zero-shot node classification via in-context learning. In this work, we introduce TAG, a tabular approach for graph learning that first converts a graph into a table via feature and structural encoders, applies multiple TFMs to diversely subsampled tables, and then aggregates their outputs through ensemble selection. Experiments on 28 real-world datasets demonstrate that TAG consistently improves upon task-specific GNNs and state-of-the-art GFMs, highlighting the potential of the tabular reformulation for scalable and generalizable graph learning.

Bringing Graphs to the Table: Zero-shot Node Classification via Tabular Foundation Models

TL;DR

This work tackles the limited cross-graph generalization of graph foundation models by reframing node classification as a tabular learning problem and introducing TAG. TAG converts graphs to tables using node-level feature and structure encoders, applies multiple tabular foundation models to subsampled tables, and aggregates predictions via ensemble selection, achieving a average improvement over state-of-the-art GFMs and task-specific GNNs across real-world datasets. It demonstrates zero-shot transfer with TabPFN, analyzes the importance of encoder depth and subsampling, and shows that light fine-tuning of TabPFN can yield small gains; the approach scales to large feature spaces and arbitrary class counts due to table subsampling and ECOC. Overall, TAG reveals a promising, scalable pathway for generalizable graph learning that leverages tabular priors without graph-specific pretraining, with potential extensions to other graph tasks and larger, more diverse datasets.

Abstract

Graph foundation models (GFMs) have recently emerged as a promising paradigm for achieving broad generalization across various graph data. However, existing GFMs are often trained on datasets that may not fully reflect real-world graphs, limiting their generalization performance. In contrast, tabular foundation models (TFMs) not only excel at classical tabular prediction tasks but have also shown strong applicability in other domains such as time series forecasting, natural language processing, and computer vision. Motivated by this, we take an alternative view to the standard perspective of GFMs and reformulate node classification as a tabular problem. In this reformulation, each node is represented as a row with feature, structure, and label information as columns, enabling TFMs to directly perform zero-shot node classification via in-context learning. In this work, we introduce TAG, a tabular approach for graph learning that first converts a graph into a table via feature and structural encoders, applies multiple TFMs to diversely subsampled tables, and then aggregates their outputs through ensemble selection. Experiments on 28 real-world datasets demonstrate that TAG consistently improves upon task-specific GNNs and state-of-the-art GFMs, highlighting the potential of the tabular reformulation for scalable and generalizable graph learning.

Paper Structure

This paper contains 20 sections, 9 equations, 5 figures, 7 tables.

Figures (5)

  • Figure 1: An illustration of TAG converting a graph into a tabular representation via feature and structural encoders.
  • Figure 2: Overall pipeline of TAG. Given a graph $G$ and a set of querying nodes $Q$, TAG first employs node-level encoding on $G$ and converts it into a table $T$. Then TAG constructs $\{b\}_{b=1}^B$ subsampled tables and applies TFM on them. Finally, TAG aggregates individual query scores $\hat{{\bm{Y}}}^{(b)}_Q$ via ensemble selection to produce the final prediction $\hat{{\bm{Y}}}_Q$.
  • Figure 2: Accuracy of pretrained vs. fine-tuned TAG on 20 unseen datasets.
  • Figure 3: Mean accuracy of TAG with varying feature encoder depth.
  • Figure 4: Average accuracy of TAG across 28 datasets vs. the number of subsampled tables.

Theorems & Definitions (2)

  • Remark 4.1
  • Remark 4.2