Generalization Error of Graph Neural Networks in the Mean-field Regime

Gholamali Aminian, Yixuan He, Gesine Reinert, Łukasz Szpruch, Samuel N. Cohen

TL;DR

The work tackles the challenge of understanding generalization in over-parameterized graph neural networks for graph classification. It adopts a mean-field framework and a KL-regularized empirical risk to derive an analytically tractable Gibbs measure for network parameters, delivering two complementary analyses: a functional-derivative bound and a Rademacher-complexity bound, both yielding an $O\left(\frac{1}{n}\right)$ rate in the mean-field limit. The results are specialized to over-parameterized one-hidden-layer GCNs and MPGNNs, with explicit bounds that depend on graph properties and readout/aggregation choices, and show how graph filters influence generalization. Empirical validation on synthetic and real-world graphs supports the theory, offering concrete guidance on network width and readout design while highlighting promising directions for extending the framework to deeper or hypergraph architectures.
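As a rough sketch of what "KL-regularized empirical risk" and "Gibbs measure" typically mean in mean-field analyses, the following LaTeX block writes out the standard form. The prior $\pi$, the inverse temperature $\alpha$, and the risk symbol $\mathrm{R}$ are notation assumed here for illustration; they are not taken from this summary and may differ from the paper's own symbols.

```latex
% Assumed notation (not from the summary above):
%   \pi      : a prior probability measure on the parameter space \mathcal{W}
%   \alpha   : an inverse-temperature (regularization) parameter, \alpha > 0
%   \mathrm{R}(m, \mu_n) : empirical risk of the parameter measure m on the empirical data measure \mu_n
\mathrm{m}^{\alpha}(\mu_n) \in \arg\min_{m \in \mathcal{P}(\mathcal{W})}
  \Big\{ \mathrm{R}(m, \mu_n) + \tfrac{1}{\alpha}\, \mathrm{KL}(m \,\|\, \pi) \Big\},
\qquad
\mathrm{m}^{\alpha}(\mu_n)(\mathrm{d}w) \;\propto\;
  \pi(\mathrm{d}w)\,
  \exp\!\Big( -\alpha\, \frac{\delta \mathrm{R}}{\delta m}\big(\mathrm{m}^{\alpha}(\mu_n), \mu_n, w\big) \Big).
```

Under these assumptions the minimizer is characterized by a fixed-point equation: the density with respect to $\pi$ is an exponential tilt by the functional derivative of the risk, which is what makes a functional-derivative analysis of the generalization error tractable.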

Abstract

This work provides a theoretical framework for assessing the generalization error of graph neural networks in the over-parameterized regime, where the number of parameters exceeds the number of data points. We explore two widely used types of graph neural networks: graph convolutional neural networks and message passing graph neural networks. Prior to this study, existing bounds on the generalization error in the over-parameterized regime were uninformative, limiting our understanding of over-parameterized network performance. Our novel approach involves deriving upper bounds within the mean-field regime for evaluating the generalization error of these graph neural networks. We establish upper bounds with a convergence rate of $O(1/n)$, where $n$ is the number of graph samples. These upper bounds offer a theoretical assurance of the networks' performance on unseen data in the challenging over-parameterized regime and overall contribute to our understanding of their performance.

Paper Structure

This paper contains 38 sections, 25 theorems, 102 equations, 2 figures, 15 tables.

Key Result

Proposition 4.6

Let the assumptions on the loss function, the bounded unit function, and the readout function hold. Let $\mathrm{m}(\mu_n)\in\mathcal{P}(\mathcal{W})$. Then, for the generalization error of a generic GNN model, where $N_{\max}$ is the maximum number of nodes among all graph samples.

Figures (2)

  • Figure 1: Absolute empirical generalization error ($\times 10^5$) for different widths $h$ of the hidden layer. We employ a mean-readout function and a supervised ratio of $\beta_\text{sup}=0.7$ for GCN and MPGNN. Values are averaged over ten runs. Error bars indicate one standard deviation.
  • Figure 2: Units of GCN and MPGNN
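
To make the roles of the hidden-layer width $h$ and the readout function in the figures above concrete, here is a minimal Python sketch of a one-hidden-layer GCN whose output is an average over $h$ hidden units (the mean-field scaling). The symmetric normalized adjacency filter, the `tanh` activation, and the function names `normalized_adjacency` and `mean_field_gcn` are assumptions made for illustration; they are not taken from the paper.

```python
import math


def normalized_adjacency(adj):
    """Assumed graph filter: D^{-1/2} (A + I) D^{-1/2} with self-loops added."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]  # every degree is >= 1 thanks to self-loops
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def mean_field_gcn(adj, feats, units, readout="mean"):
    """One-hidden-layer GCN output for a single graph.

    adj:     n x n adjacency matrix (list of lists).
    feats:   n x d node feature matrix (list of lists).
    units:   list of h hidden units, each a pair (a_k, w_k) with scalar a_k
             and weight vector w_k of length d.
    readout: "mean" averages node outputs, anything else sums them.
    """
    filt = normalized_adjacency(adj)
    n, d = len(adj), len(feats[0])
    # Aggregate neighbour features: row i of (filter @ feats).
    agg = [[sum(filt[i][j] * feats[j][k] for j in range(n)) for k in range(d)] for i in range(n)]
    node_out = []
    for i in range(n):
        total = sum(
            a * math.tanh(sum(wk * xk for wk, xk in zip(w, agg[i])))  # assumed activation
            for a, w in units
        )
        node_out.append(total / len(units))  # 1/h scaling: average over hidden units
    pooled = sum(node_out)
    return pooled / n if readout == "mean" else pooled


# Usage on a hypothetical 3-node path graph with two hidden units.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
units = [(0.5, [0.1, -0.2]), (-1.0, [0.3, 0.4])]
print(mean_field_gcn(adj, feats, units, readout="mean"))
print(mean_field_gcn(adj, feats, units, readout="sum"))
```

Because the output averages over units, duplicating every unit (doubling $h$) leaves the prediction unchanged; the network depends on the units only through their empirical distribution, which is the object the mean-field analysis works with. In this sketch the sum readout equals the mean readout multiplied by the number of nodes, which suggests why bounds for a sum readout may scale with $N_{\max}$.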

Theorems & Definitions (48)

  • Definition 2.1
  • Proposition 4.6
  • Remark 4.7
  • Proposition 4.8
  • Theorem 4.9: Generalization error and generic GNNs
  • Remark 4.10: Readout-function comparison
  • Remark 4.11: Comparison with aminian2023mean
  • Remark 4.12: Comparison with aminian2021exact
  • Proposition 4.13: Upper bound on the symmetrized KL divergence
  • Theorem 4.14: Generalization error upper bound via Rademacher complexity
  • ...and 38 more