Generalization Bounds for Message Passing Networks on Mixture of Graphons

Sohir Maskey, Gitta Kutyniok, Ron Levie

TL;DR

This work derives non-vacuous uniform generalization bounds for Message Passing Neural Networks (MPNNs) when inputs are graphs/signals sampled from a mixture of graphons with Bernoulli edges, extending prior results to simple, sparse, and noisy graphs. The analysis introduces Random Graph-Signal Models (RGSMs) and continuous MPNNs (cMPNNs) to connect discrete graph computations with graphon limits, proving that the MPNN output on finite graphs converges to the graphon limit as the graph size grows. A main result shows that the generalization error decays with the average graph size under a sparsity constraint, implying effective generalization even when model complexity exceeds the training size, provided graphs are large enough. The authors validate the theory with experiments on synthetic RGSMs (Erdős–Rényi and SBM-like graphons), showing significantly tighter bounds than existing PAC-Bayes or Rademacher bounds and demonstrating practical relevance for large-scale graph learning. The work broadens theoretical understanding of GNN generalization by incorporating realistic sparsity and noise, and highlights future directions for other aggregation schemes and graph-modeling ingredients.

Abstract

We study the generalization capabilities of Message Passing Neural Networks (MPNNs), a prevalent class of Graph Neural Networks (GNNs). We derive generalization bounds specifically for MPNNs with normalized sum aggregation and mean aggregation. Our analysis is based on a data generation model incorporating a finite set of template graphons. Each graph within this framework is generated by sampling from one of the graphons with a certain degree of perturbation. In particular, we extend previous MPNN generalization results to a more realistic setting, which includes the following modifications: 1) we analyze simple random graphs with Bernoulli-distributed edges instead of weighted graphs; 2) we sample both graphs and graph signals from perturbed graphons instead of clean graphons; and 3) we analyze sparse graphs instead of dense graphs. In this more realistic and challenging scenario, we provide a generalization bound that decreases as the average number of nodes in the graphs increases. Our results imply that MPNNs with higher complexity than the size of the training set can still generalize effectively, as long as the graphs are sufficiently large.
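
To make the data model in the abstract concrete, the following is a minimal sketch of sampling one simple graph with Bernoulli edges from a graphon and applying a single mean-aggregation step. It is illustrative only and not the authors' code. The names `sample_sparse_graph`, `sbm_graphon`, and `mean_aggregate`, the block probabilities, the test signal, and the edge-probability scaling $N^{-\alpha} W(x_i, x_j)$ are all assumptions made for this example. The perturbation (noise) of graphons and signals described in the abstract is omitted.

```python
import numpy as np


def sbm_graphon(x, y, p_in=0.8, p_out=0.2):
    """Hypothetical two-block SBM-like graphon W(x, y) on [0, 1]^2."""
    same_block = (x < 0.5) == (y < 0.5)
    return np.where(same_block, p_in, p_out)


def sample_sparse_graph(graphon, n, alpha, rng):
    """Sample a simple undirected graph with n nodes from a graphon.

    Assumption: latent positions x_i are uniform on [0, 1] and each edge
    {i, j} is Bernoulli with probability n**(-alpha) * W(x_i, x_j).
    """
    x = rng.uniform(0.0, 1.0, size=n)
    probs = n ** (-alpha) * graphon(x[:, None], x[None, :])
    # Draw only the strict upper triangle, then mirror it: no self-loops,
    # and the adjacency matrix is symmetric with entries in {0, 1}.
    upper = np.triu(rng.uniform(size=(n, n)) < probs, k=1)
    adj = (upper | upper.T).astype(float)
    return x, adj


def mean_aggregate(adj, signal):
    """Average the signal over each node's neighbors (isolated nodes get 0)."""
    deg = adj.sum(axis=1, keepdims=True)
    return (adj @ signal) / np.maximum(deg, 1.0)


rng = np.random.default_rng(0)
x, adj = sample_sparse_graph(sbm_graphon, n=500, alpha=0.1, rng=rng)
signal = np.cos(2 * np.pi * x)[:, None]  # hypothetical node signal f(x_i)
out = mean_aggregate(adj, signal)
```

Setting `alpha=0` recovers the dense case, while larger `alpha` thins the graph as `n` grows, which is the regime the sparsity levels in the experiments appear to vary.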

Paper Structure

This paper contains 35 sections, 31 theorems, 186 equations, and 2 figures.

Key Result

Theorem 2.1

There exist constants $C, C' > 0$ such that [...], where $C$ and $C'$ are specified in the proof of the generalization bound in the Appendix.

Figures (2)

  • Figure 1: Comparison of Generalization Bounds for GraphSAGE with mean aggregation: Our Theoretical Analysis vs. PAC-Bayesian (Liao et al., 2021) and Rademacher Complexity (Garg et al., 2020) for Binary Classification Using Erdős–Rényi and SBM Graphs. Each subplot corresponds to a different sparsity level $\alpha \in \{0, 0.1, 0.2, 0.3\}$ of the underlying RGSM. For each subplot, we test six different training conditions: $T=1$ with weight decay (WD), $T=1$ without weight decay (w/o WD), $T=2$ with WD, $T=2$ w/o WD, $T=3$ with WD, and $T=3$ w/o WD.
  • Figure 2: Comparison of Generalization Bounds for GraphSAGE with normalized sum aggregation. See the caption of Figure 1 for more details.

Theorems & Definitions (59)

  • Definition 2.1
  • Definition 2.2
  • Definition 2.3
  • Definition 2.4
  • Definition 2.5
  • Theorem 2.1
  • Proposition 3.1
  • Corollary 3.1
  • proof
  • Remark 3.1
  • ...and 49 more