Survey on Generalization Theory for Graph Neural Networks

Antonis Vasileiou, Stefanie Jegelka, Ron Levie, Christopher Morris

TL;DR

This survey addresses the gap in theory for how and when MPNNs generalize beyond training graphs. It synthesizes bounds from multiple formalisms—VC-dimension, Rademacher complexity, covering numbers, stability, PAC-Bayes, graphon theory, and transductive/OOD analyses—to provide a unified view of MPNN generalization across graph- and node-level tasks. Key insights include the connection between $1$-WL expressivity and VC bounds, the tighter nature of data-dependent Rademacher bounds for MPNNs, and graphon- and diffusion-matrix-based perspectives that yield distribution-aware guarantees. The work highlights open problems and future directions, such as leveraging graph-specific structure to tighten bounds, extending beyond $1$-WL-expressivity, and developing practical, informative bounds for real-world graph distributions with OOD and size-transfer considerations.

Abstract

Message-passing graph neural networks (MPNNs) have emerged as the leading approach for machine learning on graphs, attracting significant attention in recent years. While a large set of works explored the expressivity of MPNNs, i.e., their ability to separate graphs and approximate functions over them, comparatively less attention has been directed toward investigating their generalization abilities, i.e., making meaningful predictions beyond the training data. Here, we systematically review the existing literature on the generalization abilities of MPNNs. We analyze the strengths and limitations of various studies in these domains, providing insights into their methodologies and findings. Furthermore, we identify potential avenues for future research, aiming to deepen our understanding of the generalization abilities of MPNNs.

Paper Structure

This paper contains 68 sections, 27 theorems, 100 equations, 1 figure, 1 table.

Key Result

Theorem 1

Let $\mathcal{H}$ be a hypothesis class of MPNNs over the set of graphs $\mathcal{X}$, with $VC_{\mathcal{X}}(\mathcal{H})=d<\infty$. Then, for all $\delta \in (0,1)$, with probability at least $1-\delta$, the following holds for all MPNNs $f\in\mathcal{H}$,
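
The displayed inequality of Theorem 1 is not reproduced in this summary. For orientation, a common textbook form of the VC generalization bound for binary classification with the 0-1 loss is sketched below. The symbols $N$ (number of i.i.d. training graphs), $R(f)$ (expected risk), and $\widehat{R}_S(f)$ (empirical risk on the training sample $S$) are assumptions introduced here, and the constants and exact form in the survey's statement may differ.

```latex
R(f) \;\le\; \widehat{R}_S(f)
  + \sqrt{\frac{2d\,\log\!\left(\frac{eN}{d}\right)}{N}}
  + \sqrt{\frac{\log\!\left(\frac{1}{\delta}\right)}{2N}}
```

In this form the gap between expected and empirical risk shrinks roughly like $\sqrt{d\log(N)/N}$, so the bound is informative only when the number of training graphs $N$ is large relative to the VC dimension $d$.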

Figures (1)

  • Figure 1: Commutative diagram illustrating the graph compact embedding process. In this diagram, the message-passing phase is represented as a function that updates node features, producing a graph with the same structure but updated node features.

Theorems & Definitions (34)

  • Theorem 1: (Vap+1964; Vap+1998), adapted to MPNNs
  • Theorem 2
  • Proposition 3
  • Proposition 4
  • Theorem 5
  • Theorem 6
  • Theorem 7: DBLP:conf/ac/BousquetBL03, Theorem 5
  • Lemma 8: MohriRostamizadehTalwalkar18
  • Proposition 9
  • Theorem 10
  • ...and 24 more