Convergence of Message Passing Graph Neural Networks with Generic Aggregation On Large Random Graphs
Matthieu Cordonnier, Nicolas Keriven, Nicolas Tremblay, Samuel Vaiter
TL;DR
The paper studies the convergence of multilayer message passing graph neural networks (MPGNNs) with generic aggregation on large random graphs to their continuous counterparts (cMPGNNs). It defines the continuous counterpart on a latent space equipped with a connectivity kernel and analyzes when the discrete MPGNN outputs concentrate around the continuous limit as the number of nodes $n$ grows, providing non-asymptotic, high-probability bounds. The key contributions are a McDiarmid-based bound for Lipschitz-type aggregations and a separate, dimension-dependent bound for max aggregation, along with a detailed verification for several common aggregation schemes (mean, degree-normalized, attention, generalized mean, and max). The results extend prior convergence analyses, which covered SGNNs and degree-normalized mean MPGNNs, to a broad class of aggregations, which informs the theoretical understanding of how GNNs behave on large graphs. Experimental illustrations corroborate the predicted rates, show how convergence scales with latent-space dimension for mean versus max aggregation, and suggest implications for theory-guided design of GNN architectures on large networks.
Abstract
We study the convergence of message passing graph neural networks on random graph models to their continuous counterpart as the number of nodes tends to infinity. Until now, this convergence was only known for architectures with aggregation functions in the form of normalized means, or, equivalently, of an application of classical operators like the adjacency matrix or the graph Laplacian. We extend such results to a large class of aggregation functions that encompasses all classically used message passing graph neural networks, such as attention-based message passing, max convolutional message passing, (degree-normalized) convolutional message passing, or moment-based aggregation message passing. Under mild assumptions, we give non-asymptotic bounds with high probability to quantify this convergence. Our main result is based on the McDiarmid inequality. Interestingly, this result does not apply to the case where the aggregation is a coordinate-wise maximum. We treat this case separately and obtain a different convergence rate.
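To make the notion of convergence concrete, the following is a minimal numerical sketch of a single mean-aggregation step on a sampled random graph, compared with its continuous counterpart. It is an illustration under stated assumptions, not the paper's construction or experiments: the one-dimensional latent space $[0,1]$ with uniform sampling, the kernel $W(x,y)=\exp(-|x-y|)$, the signal $f(x)=\cos(2\pi x)$, and all function names are hypothetical choices made for this example.

```python
import numpy as np


def kernel(x, y):
    # Assumed connectivity kernel W(x, y) = exp(-|x - y|), with values in (0, 1].
    return np.exp(-np.abs(x[:, None] - y[None, :]))


def signal(x):
    # Assumed node signal f(x) = cos(2*pi*x) on the latent space [0, 1].
    return np.cos(2.0 * np.pi * x)


def sample_graph(n, rng):
    # Latent positions x_i ~ U[0, 1]; each edge {i, j} is drawn independently
    # with probability W(x_i, x_j). No self-loops, symmetric adjacency.
    x = rng.uniform(0.0, 1.0, size=n)
    upper = np.triu(rng.uniform(size=(n, n)) < kernel(x, x), k=1)
    adj = (upper | upper.T).astype(float)
    return x, adj


def discrete_mean_aggregation(adj, z):
    # Average of the neighbours' values; isolated nodes get 0.
    deg = adj.sum(axis=1)
    return adj @ z / np.maximum(deg, 1.0)


def continuous_mean_aggregation(x, grid_size=4000):
    # Midpoint-rule approximation of
    #   int W(x, y) f(y) dy / int W(x, y) dy   over y in [0, 1].
    y = (np.arange(grid_size) + 0.5) / grid_size
    w = kernel(x, y)
    return (w @ signal(y)) / w.sum(axis=1)


def max_error(n, seed=0):
    # Largest gap over all nodes between the discrete and continuous outputs.
    rng = np.random.default_rng(seed)
    x, adj = sample_graph(n, rng)
    discrete = discrete_mean_aggregation(adj, signal(x))
    return float(np.max(np.abs(discrete - continuous_mean_aggregation(x))))


if __name__ == "__main__":
    for n in (100, 400, 1600):
        print(n, max_error(n))
```

Running the script prints the maximal node-wise deviation for a few graph sizes; under these assumptions one would expect the deviation to shrink as $n$ grows. The sketch covers only one layer of mean aggregation; the max aggregation discussed in the abstract is treated separately in the paper and is not reproduced here.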
