Understanding Generalization in Node and Link Prediction

Antonis Vasileiou; Timo Stoll; Christopher Morris

TL;DR

This work tackles the challenge of understanding generalization for node and link prediction with graph neural networks under non-i.i.d. data. It introduces generalized MPNNs (gMPNNs) and unrolling distances that faithfully capture the computation on graphs, and derives robustness-based generalization bounds for both inductive and transductive settings, explicitly accounting for graph structure and sample dependencies. The theory is complemented by experiments showing that the unrolling distance correlates with MPNN outputs, that training across many graphs improves generalization, and that the derived bounds reflect observed generalization gaps. Overall, the framework provides a principled, architecture-inclusive lens for analyzing graph-dependent generalization and offers pathways to extend these insights beyond graph-structured data.

Abstract

Using message-passing graph neural networks (MPNNs) for node and link prediction is crucial in various scientific and industrial domains, which has led to the development of diverse MPNN architectures. Although these architectures work well in practical settings, their ability to generalize beyond the training set remains poorly understood. While some studies have explored MPNNs' generalization in graph-level prediction tasks, much less attention has been given to node- and link-level predictions. Existing works often rely on unrealistic i.i.d. assumptions, overlooking possible correlations between nodes or links, and assume fixed aggregation and impractical loss functions while neglecting the influence of graph structure. In this work, we introduce a unified framework to analyze the generalization properties of MPNNs in inductive and transductive node and link prediction settings, incorporating diverse architectural parameters and loss functions and quantifying the influence of graph structure. Additionally, our proposed generalization framework can be applied beyond graphs to any classification task under the inductive or transductive setting. Our empirical study supports our theoretical insights, deepening our understanding of MPNNs' generalization capabilities in these tasks.

Figures (6)

Theorems & Definitions (44)