Performance Heterogeneity in Graph Neural Networks: Lessons for Architecture Design and Preprocessing

Lukas Fesser, Melanie Weber

TL;DR

The paper investigates performance heterogeneity in graph-level learning across both message-passing and transformer-based GNNs, revealing that graph topology alone cannot explain per-graph differences. It introduces heterogeneity profiles and leverages the Tree Mover's Distance to connect class-distance ratios with heterogeneity, guiding practical design choices. The authors propose selective rewiring to align graph spectra and a spectral-depth heuristic based on the Fiedler value ($\lambda_2$), demonstrating improved per-graph performance on benchmarks. These insights offer concrete guidelines for preprocessing and architecture selection in heterogeneous graph datasets, with potential to automate model choices in GNN pipelines.
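The Fiedler value $\lambda_2$ mentioned above is the second-smallest eigenvalue of a graph's Laplacian. The following is a minimal sketch of how it could be computed, assuming an undirected graph given as a dense adjacency matrix and the unnormalized Laplacian $L = D - A$; this summary does not say which Laplacian normalization the authors use or how $\lambda_2$ is mapped to a layer count, so the function name `fiedler_value` and these choices are illustrative assumptions only.

```python
import numpy as np


def fiedler_value(adjacency: np.ndarray) -> float:
    """Return the second-smallest eigenvalue of L = D - A.

    Assumption: `adjacency` is a symmetric 0/1 (or weighted) matrix of an
    undirected graph. The unnormalized Laplacian is an illustrative choice.
    """
    a = np.asarray(adjacency, dtype=float)
    laplacian = np.diag(a.sum(axis=1)) - a
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    eigenvalues = np.linalg.eigvalsh(laplacian)
    return float(eigenvalues[1])


# A path on 4 nodes is weakly connected; the complete graph K4 is not.
path = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
complete = np.ones((4, 4)) - np.eye(4)

print(fiedler_value(path))      # approximately 2 - sqrt(2)
print(fiedler_value(complete))  # approximately 4
```

A small $\lambda_2$ indicates a bottleneck in the graph, while a large value indicates a well-connected graph, which is why it is a natural per-graph quantity to consult when choosing depth.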

Abstract

Graph Neural Networks have emerged as the most popular architecture for graph-level learning, including graph classification and regression tasks, which frequently arise in areas such as biochemistry and drug discovery. Achieving good performance in practice requires careful model design. Due to gaps in our understanding of the relationship between model and data characteristics, this often requires manual architecture and hyperparameter tuning. This is particularly pronounced in graph-level tasks, due to much higher variation in the input data than in node-level tasks. To work towards closing these gaps, we begin with a systematic analysis of individual performance in graph-level tasks. Our results establish significant performance heterogeneity in both message-passing and transformer-based architectures. We then investigate the interplay of model and data characteristics as drivers of the observed heterogeneity. Our results suggest that graph topology alone cannot explain heterogeneity. Using the Tree Mover's Distance, which jointly evaluates topological and feature information, we establish a link between class-distance ratios and performance heterogeneity in graph classification. These insights motivate model and data preprocessing choices that account for heterogeneity between graphs. We propose a selective rewiring approach, which only targets graphs whose individual performance benefits from rewiring. We further show that the optimal network depth depends on the graph's spectrum, which motivates a heuristic for choosing the number of GNN layers. Our experiments demonstrate the utility of both design choices in practice.

Paper Structure

This paper contains 41 sections, 1 theorem, 12 equations, 17 figures, 6 tables.

Key Result

Theorem 3.1

Given an $L$-layer GNN $h: \mathcal{X} \to \mathbb{R}$ and two graphs $G, G' \in \mathcal{D}$, we have $\| h(G) - h(G') \| \leq \prod_{l=1}^{L+1} K_{\phi}^{(l)} \cdot \text{TMD}_w^{L+1}(G, G')$, where $w(l) = \epsilon \cdot \frac{P_{L+1}^{l-1}}{P_{L+1}^{l}}$ for all $l \leq L$ and $P_L^l$ is the $l$

Figures (17)

  • Figure 1: Comparison of heterogeneity profiles for message-passing (GCN) and transformer-based (GraphGPS) architectures.
  • Figure 2: Two graphs from the Mutag dataset, left with the correct functional group (label = 1), and right with the wrong one (label = 0). The graphs have a similar topological structure (e.g., presence of a 3-fork), but subtle differences in the associated node features.
  • Figure 3: Class-distance ratios for graphs in Mutag and Enzymes.
  • Figure 4: Training performance comparison on the Mutag dataset using GCN (a MPGNN, left) and GraphGPS (a GT, right).
  • Figure 5: Comparison of BORF and FoSR methods applied to GCN on the Enzymes and Proteins datasets, sorted by FoSR changes from best to worst.
  • ...and 12 more figures

Theorems & Definitions (7)

  • Definition 1: Class-distance ratio
  • Theorem 3.1: chuang2022tree, Theorem 8
  • Definition 2: Computation Trees (chuang2022tree, Def. 1)
  • Definition 3: Blank Tree (chuang2022tree, Def. 2)
  • Definition 4: Blank Tree Augmentation (chuang2022tree, Def. 3)
  • Definition 5: Tree Distance (chuang2022tree, Def. 4)
  • Definition 6: TMD (chuang2022tree, Def. 5)
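Definition 1 names the class-distance ratio, but this listing does not reproduce its formula. The sketch below therefore rests on an assumed reading: for each graph, the distance to its nearest graph of the same class divided by the distance to its nearest graph of a different class. The pairwise distance matrix stands in for precomputed Tree Mover's Distances; its values and the function name `class_distance_ratio` are hypothetical.

```python
import math


def class_distance_ratio(index, distances, labels):
    """Nearest same-class distance over nearest other-class distance.

    Assumption: this ratio form is an illustrative reading of Definition 1,
    not a formula quoted from the paper. `distances` is a symmetric matrix
    of precomputed pairwise graph distances (e.g. TMD values).
    """
    same = [
        distances[index][j]
        for j in range(len(labels))
        if j != index and labels[j] == labels[index]
    ]
    other = [
        distances[index][j]
        for j in range(len(labels))
        if labels[j] != labels[index]
    ]
    if not same or not other:
        raise ValueError("need at least one same-class and one other-class graph")
    nearest_other = min(other)
    if nearest_other == 0:
        return math.inf
    return min(same) / nearest_other


# Hypothetical distances between four graphs, two per class.
distances = [
    [0, 1, 4, 5],
    [1, 0, 3, 6],
    [4, 3, 0, 2],
    [5, 6, 2, 0],
]
labels = [0, 0, 1, 1]

ratios = [class_distance_ratio(i, distances, labels) for i in range(len(labels))]
print(ratios)
```

Under this reading, a ratio well below 1 means a graph sits closer to its own class than to any other, and a ratio near or above 1 marks graphs that are plausibly harder to classify.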