TREE-G: Decision Trees Contesting Graph Neural Networks

Maya Bechler-Speicher, Amir Globerson, Ran Gilad-Bachrach

TL;DR

TREE-G presents a graph-tailored decision tree framework that reevaluates split decisions using graph structure and a dynamic subset mechanism. By propagating features through walks and focusing on ancestor-derived vertex subsets, TREE-G achieves greater expressivity than standard DTs and often outperforms both graph kernels and several GNNs. The approach delivers strong empirical results across graph- and vertex-labeling tasks and provides interpretable explanations based on subset usage. The combination of permutation invariance, scalable training, and explainability suggests practical impact for graph analytics without relying on deep neural networks.

Abstract

When dealing with tabular data, models based on decision trees are a popular choice due to their high accuracy on this type of data, their ease of application, and their explainability. However, when it comes to graph-structured data, it is not clear how to apply them effectively, in a way that combines the topological information with the tabular data available on the vertices of the graph. To address this challenge, we introduce TREE-G. TREE-G modifies standard decision trees by introducing a novel split function that is specialized for graph data. Not only does this split function incorporate the node features and the topological information, but it also uses a novel pointer mechanism that allows split nodes to use information computed in previous splits. Therefore, the split function adapts to the predictive task and the graph at hand. We analyze the theoretical properties of TREE-G and demonstrate its benefits empirically on multiple graph and vertex prediction benchmarks. In these experiments, TREE-G consistently outperforms other tree-based models and often outperforms other graph-learning algorithms such as Graph Neural Networks (GNNs) and Graph Kernels, sometimes by large margins. Moreover, TREE-G models and their predictions can be explained and visualized.
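
To make the idea of a graph-aware split concrete, here is a minimal Python sketch. It is a simplified illustration and not the paper's exact split function: the name `walk_split`, its parameters, and the specific scoring rule (summing one feature over walks of length `d` that end inside a vertex subset) are assumptions chosen for readability.

```python
import numpy as np


def walk_split(adj, feats, k, d, subset, threshold):
    """Hypothetical, simplified vertex-level split (not the paper's exact rule).

    Each vertex i is scored by summing feature k over the endpoints of all
    length-d walks that start at i and end at a vertex in `subset`.
    The vertices of `subset` are then partitioned by the threshold.
    """
    n = adj.shape[0]
    mask = np.zeros(n)
    mask[sorted(subset)] = 1.0
    # (A^d)[i, j] counts the length-d walks from i to j, so the product below
    # sums the masked feature over the endpoints of those walks.
    scores = np.linalg.matrix_power(adj, d) @ (feats[:, k] * mask)
    plus = {i for i in subset if scores[i] > threshold}
    minus = set(subset) - plus
    return plus, minus


# Path graph 0 - 1 - 2 with a single feature per vertex.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])
feats = np.array([[1.0], [2.0], [3.0]])

# With d = 1 the scores are neighbor sums: [2, 4, 2].
plus, minus = walk_split(adj, feats, k=0, d=1, subset={0, 1, 2}, threshold=3.0)
```

In this toy case only the middle vertex exceeds the threshold, so the split separates it from the two endpoints. Both resulting subsets could then be reused by later splits, which is the role the abstract assigns to the pointer mechanism.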

Paper Structure

This paper contains 42 sections, 8 theorems, 9 equations, 11 figures, 4 tables, and 2 algorithms.

Key Result

Lemma 4.1

TREE-G is invariant to permutations for graph labeling and equivariant for vertex labeling.
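
Stated in symbols, under notation assumed here for illustration (it may differ from the paper's): let $T$ be a trained TREE-G model, $A$ the adjacency matrix of a graph on $n$ vertices, $X$ the matrix whose rows are the vertex features, and $P$ any $n \times n$ permutation matrix that relabels the vertices. The two properties in the lemma then read as follows.

$$
\text{Graph labeling (invariance):} \quad T(P A P^{\top},\, P X) = T(A,\, X)
$$

$$
\text{Vertex labeling (equivariance):} \quad T(P A P^{\top},\, P X) = P\, T(A,\, X)
$$

In the first case $T$ outputs a single label per graph, which does not change under relabeling. In the second case $T$ outputs one label per vertex, and the labels are reordered in the same way as the vertices.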

Figures (11)

  • Figure 1: One TREE-G tree in the ensemble of the Mutag experiment. Each node in the tree shows the split function and threshold used at that node. The dashed arrow in each node is the pointer $*$, which points to the ancestor split-node from which the subset is taken. The accompanying value $\rho \in \{+,-\}$ indicates which of the two subsets generated in that ancestor is used.
  • Figure 2: The same trained TREE-G instance is applied to two graphs of different sizes during inference. Each split-node uses either one of the subsets generated in its ancestor nodes or the set of all vertices $S$. The subset used in each split-node is uniquely defined by a pointer $*$ that points to the ancestor where the subset is generated, and by $\rho\in \{+,-\}$, which indicates which of the two subsets generated in that ancestor should be used. Each split-node generates two subsets by splitting the subset it uses. The same rules are applied to both graphs, but they yield different subsets in each graph.
  • Figure 3: Vertex-level explanations for two graphs from the Red Isolated Vertex problem (a) and two graphs from the Mutagenicity problem (b). The size of vertices corresponds to their importance according to the explanation mechanism.
  • Figure 4: $G_1$ (red) and $G_2$ (grey) are two graphs on 4 vertices that are positioned on the plane at $\{\pm 1\} \times \{\pm 1\}$. Each vertex has two features corresponding to its coordinates. In the proof of Lemma 4.3 we show that these graphs are indistinguishable by TREE-G without subsets, but are separable when subsets are used.
  • Figure 5: Two $4$-regular non-isomorphic graphs. The two graphs differ, for example, in their number of cycles of length $3$. The figure is taken from idgnn.
  • ...and 6 more figures
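
The pointer mechanism described in the captions of Figures 1 and 2 can be sketched in a few lines of Python. This is an illustrative assumption about how the lookup might be organized, not the authors' implementation: the function `resolve_subset`, the list-of-pairs representation of ancestor splits, and the use of `None` to mean "all vertices" are hypothetical.

```python
def resolve_subset(all_vertices, ancestor_splits, pointer, rho):
    """Return the vertex subset that a split-node operates on.

    ancestor_splits: list of (plus_subset, minus_subset) pairs, one per
        ancestor split-node, ordered from the root downward (hypothetical layout).
    pointer: index of the ancestor whose output is reused, or None to use
        the set of all vertices.
    rho: '+' or '-', selecting one of the two subsets of that ancestor.
    """
    if pointer is None:
        return set(all_vertices)
    plus, minus = ancestor_splits[pointer]
    return set(plus) if rho == "+" else set(minus)


# The same (pointer, rho) rule applied to two graphs of different sizes.
# The ancestor subsets differ because they were computed on different graphs.
small = resolve_subset(range(3), [({1}, {0, 2})], pointer=0, rho="-")
large = resolve_subset(range(5), [({1, 3}, {0, 2, 4})], pointer=0, rho="-")

# A root split-node has no ancestors, so it uses every vertex.
root_small = resolve_subset(range(3), [], pointer=None, rho="+")
```

Storing only the pair (pointer, $\rho$) in each split-node, rather than explicit vertex indices, is what would let one trained tree be applied to graphs of any size, as Figure 2 illustrates.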

Theorems & Definitions (8)

  • Lemma 4.1
  • Lemma 4.2
  • Lemma 4.3
  • Lemma 4.4
  • Lemma
  • Lemma
  • Lemma
  • Lemma