Generalization Error of Graph Neural Networks in the Mean-field Regime

Gholamali Aminian; Yixuan He; Gesine Reinert; Łukasz Szpruch; Samuel N. Cohen

TL;DR

The work tackles the challenge of understanding generalization in over-parameterized graph neural networks for graph classification. It adopts a mean-field framework and a KL-regularized empirical risk to derive an analytically tractable Gibbs measure for network parameters, delivering two complementary analyses: a functional-derivative bound and a Rademacher-complexity bound, both yielding an $O\left(\frac{1}{n}\right)$ rate in the mean-field limit. The results are specialized to over-parameterized one-hidden-layer GCNs and MPGNNs, with explicit bounds that depend on graph properties and readout/aggregation choices, and show how graph filters influence generalization. Empirical validation on synthetic and real-world graphs supports the theory, offering concrete guidance on network width and readout design while highlighting promising directions for extending the framework to deeper or hypergraph architectures.
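The KL-regularized objective and its Gibbs-measure solution mentioned above can be written schematically as follows. This is a sketch in the standard form used in the mean-field literature, not a transcription of the paper's equations: the inverse-temperature parameter $\alpha$, the prior $\pi$, the risk functional $\mathrm{R}$, and the functional derivative $\frac{\delta \mathrm{R}}{\delta m}$ are notation assumed here for illustration, while $\mu_n$ and $\mathcal{P}(\mathcal{W})$ follow the notation of the key result.

```latex
% Assumed notation: m is a probability measure over parameters in P(W),
% mu_n is the empirical measure of the n graph samples, pi is a prior on W.
\mathcal{V}^{\alpha}(m,\mu_n)
  \;=\; \mathrm{R}(m,\mu_n) \;+\; \frac{1}{\alpha}\,\mathrm{KL}(m \,\|\, \pi),
\qquad m \in \mathcal{P}(\mathcal{W}).

% A minimizer, when it exists, satisfies a Gibbs-type fixed-point equation:
\mathrm{m}^{\alpha}(\mu_n)(\mathrm{d}w)
  \;\propto\; \pi(\mathrm{d}w)\,
  \exp\!\Big(-\alpha\,\frac{\delta \mathrm{R}}{\delta m}\big(\mathrm{m}^{\alpha}(\mu_n),\mu_n,w\big)\Big).
```

Under this reading, the regularizer keeps the learned parameter measure close to the prior, and the explicit exponential form is what makes the generalization error analytically tractable.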

Abstract

This work provides a theoretical framework for assessing the generalization error of graph neural networks in the over-parameterized regime, where the number of parameters exceeds the number of data points. We explore two widely used types of graph neural networks: graph convolutional neural networks and message passing graph neural networks. Prior to this study, existing bounds on the generalization error in the over-parameterized regime were uninformative, limiting our understanding of over-parameterized network performance. Our novel approach involves deriving upper bounds within the mean-field regime for evaluating the generalization error of these graph neural networks. We establish upper bounds with a convergence rate of $O(1/n)$, where $n$ is the number of graph samples. These upper bounds offer a theoretical assurance of the networks' performance on unseen data in the challenging over-parameterized regime and overall contribute to our understanding of their performance.
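To make the mean-field setting concrete, here is a minimal NumPy sketch of a one-hidden-layer GCN whose output is an average over $h$ hidden units, followed by a mean or sum readout over nodes. The details are assumptions chosen for illustration and may differ from the paper's exact model: the symmetric-normalized filter with self-loops, the `tanh` activation, and the names `normalized_adjacency` and `mean_field_gcn` are all hypothetical.

```python
import numpy as np


def normalized_adjacency(A):
    """Assumed graph filter: D^{-1/2} (A + I) D^{-1/2}, a common GCN choice."""
    A_tilde = A + np.eye(A.shape[0])          # add self-loops, so degrees are > 0
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def mean_field_gcn(A, X, W1, w2, readout="mean"):
    """One-hidden-layer GCN with 1/h output scaling (mean-field style).

    A: (N, N) adjacency, X: (N, k) node features,
    W1: (h, k) hidden weights, w2: (h,) output weights.
    """
    G = normalized_adjacency(A)
    H = np.tanh(G @ X @ W1.T)                 # (N, h): bounded unit activations
    node_out = H @ w2 / W1.shape[0]           # (N,): average over the h units
    # Readout aggregates node outputs into one graph-level prediction.
    return node_out.mean() if readout == "mean" else node_out.sum()


# Usage on a 3-node path graph with random features and h = 4 hidden units.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
X = rng.normal(size=(3, 2))
W1 = rng.normal(size=(4, 2))
w2 = rng.normal(size=4)
y = mean_field_gcn(A, X, W1, w2)

# Duplicating every hidden unit leaves the output unchanged: the prediction
# depends only on the empirical distribution of the unit parameters.
y_doubled = mean_field_gcn(A, X, np.vstack([W1, W1]), np.concatenate([w2, w2]))
```

Because of the $1/h$ scaling, the network is a function of the empirical measure over unit parameters rather than of the individual weights, which is the property the mean-field analysis relies on as $h$ grows.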

Paper Structure (38 sections, 25 theorems, 102 equations, 2 figures, 15 tables)

Introduction
Notations:
Our Model
Preliminaries
Graph data samples and learning algorithm:
Graph filters:
Problem Formulation
Loss function:
Related Works
The Generalization Error of KL-Regularized Empirical Risk Minimization
Generalization Error via Functional Derivative
Generalization Error via Rademacher Complexity
Over-Parameterized One-Hidden-Layer GCN
Over-Parameterized One-Hidden-Layer MPGNN
Comparison to Previous Works
...and 23 more sections

Key Result

Proposition 4.6

Let the assumptions on the loss function, on the bounded unit function, and on the readout function hold. Let $\mathrm{m}(\mu_n)\in\mathcal{P}(\mathcal{W})$. Then the generalization error of a generic GNN model admits an upper bound (the displayed inequality is not preserved in this summary) in which $N_{\max}$ is the maximum number of nodes among all graph samples.

Figures (2)

Figure 1: Absolute empirical generalization error ($\times 10^5$) for different widths $h$ of the hidden layer. We employ a mean-readout function and a supervised ratio of $\beta_\text{sup}=0.7$ for GCN and MPGNN. Values are averaged over ten runs. Error bars indicate one standard deviation.
Figure 2: Units of GCN and MPGNN

Theorems & Definitions (48)

Definition 2.1
Proposition 4.6
Remark 4.7
Proposition 4.8
Theorem 4.9: Generalization error and generic GNNs
Remark 4.10: Readout-function comparison
Remark 4.11: Comparison with Aminian et al. (2023)
Remark 4.12: Comparison with Aminian et al. (2021)
Proposition 4.13: Upper bound on the symmetrized KL divergence
Theorem 4.14: Generalization error upper bound via Rademacher complexity
...and 38 more
