VC dimension of Graph Neural Networks with Pfaffian activation functions
Giuseppe Alessio D'Inverno, Monica Bianchini, Franco Scarselli
TL;DR
This work addresses the generalization capabilities of Graph Neural Networks (GNNs) with Pfaffian activation functions by deriving VC-dimension bounds that depend on architectural factors (depth $L$, hidden size $d$, total parameter count $\bar{p}$) and on the number of 1-WL colors. The authors provide a unified bound framework showing that the VC dimension scales polynomially with the key parameters, notably through terms like $O(\bar{p}^2 H^2)$ with $H = LNd(\ell_{\text{comb}}+\ell_{\text{agg}})+\ell_{\text{read}}$, and they add color-based refinements that tighten the bounds via the number of WL colors, $C_1$. They validate the theory with preliminary experiments on binary graph-classification datasets using activations such as $\arctan$ and $\tanh$, observing training–test gap behavior consistent with the derived bounds and the color-based predictions. Overall, the paper advances the understanding of how Pfaffian activations influence GNN generalization and suggests avenues for future work, including lower bounds and extensions to Graph Transformers and related graph models.
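To get a feel for how the dominant term quoted in the TL;DR grows, it can be evaluated numerically. The sketch below is only an illustration: it ignores the constants hidden by the $O(\cdot)$ notation, and because this summary does not define $N$ or the $\ell$ quantities, the hypothetical function `growth_term` simply treats every symbol as a plain positive number.

```python
def growth_term(p_bar, L, N, d, ell_comb, ell_agg, ell_read):
    """Evaluate H and p_bar^2 * H^2 exactly as written in the TL;DR.

    All arguments are treated as plain positive numbers (an assumption:
    the summary does not define N or the ell_* quantities).
    Constants hidden by the O(.) notation are ignored.
    """
    H = L * N * d * (ell_comb + ell_agg) + ell_read
    return H, (p_bar ** 2) * (H ** 2)


# Illustrative values only; they are not taken from the paper.
H_small, term_small = growth_term(p_bar=10, L=2, N=3, d=4,
                                  ell_comb=1, ell_agg=1, ell_read=1)
H_deep, term_deep = growth_term(p_bar=10, L=4, N=3, d=4,
                                ell_comb=1, ell_agg=1, ell_read=1)
print(H_small, term_small)
print(H_deep, term_deep)
```

With every other quantity held fixed, doubling $L$ roughly doubles $H$ and so roughly quadruples the term. In a real network $\bar{p}$ would typically also change with depth, which this sketch does not model.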
Abstract
Graph Neural Networks (GNNs) have emerged in recent years as a powerful tool to learn tasks across a wide range of graph domains in a data-driven fashion. Based on a message passing mechanism, GNNs have gained increasing popularity due to their intuitive formulation, closely linked with the Weisfeiler-Lehman (WL) test for graph isomorphism, to which they have been proven equivalent. From a theoretical point of view, GNNs have been shown to be universal approximators, and their generalization capability (namely, bounds on the Vapnik-Chervonenkis (VC) dimension) has recently been investigated for GNNs with piecewise polynomial activation functions. The aim of our work is to extend this analysis of the VC dimension of GNNs to other commonly used activation functions, such as sigmoid and hyperbolic tangent, using the framework of Pfaffian function theory. Bounds are provided with respect to architecture parameters (depth, number of neurons, input size) as well as with respect to the number of colors resulting from the 1-WL test applied to the graph domain. The theoretical analysis is supported by a preliminary experimental study.
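The number of 1-WL colors mentioned above comes from color refinement: each node repeatedly receives a new color determined by its current color and the multiset of its neighbors' colors, until the partition of nodes stops changing. The following is a minimal sketch of that standard procedure, not the authors' implementation. The function name `wl_colors` and the adjacency-dict input format are assumptions made for illustration.

```python
def wl_colors(adjacency, labels=None):
    """1-WL color refinement on an undirected graph.

    adjacency: dict mapping each node to an iterable of its neighbors
               (hypothetical input format chosen for this sketch).
    labels:    optional dict of initial node labels (all of one sortable
               type); defaults to a uniform initial color.
    Returns a dict node -> color for the stable coloring.
    """
    nodes = sorted(adjacency)
    colors = {v: (labels[v] if labels else 0) for v in nodes}
    for _ in range(len(nodes)):  # the partition stabilizes within n rounds
        # Signature = own color plus the sorted multiset of neighbor colors.
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adjacency[v])))
            for v in nodes
        }
        # Compress signatures to small integer color ids.
        palette = {s: i for i, s in enumerate(sorted(set(signatures.values())))}
        new_colors = {v: palette[signatures[v]] for v in nodes}
        # Each round can only split color classes, so an unchanged class
        # count means the partition is stable.
        if len(set(new_colors.values())) == len(set(colors.values())):
            break
        colors = new_colors
    return colors


# A path 0-1-2-3: the two endpoints share one color, the inner nodes another.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
# A 6-cycle: every node looks the same to 1-WL.
cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}

num_path_colors = len(set(wl_colors(path).values()))
num_cycle_colors = len(set(wl_colors(cycle).values()))
print(num_path_colors, num_cycle_colors)
```

To count colors over a whole dataset, as a color-based bound would require, one option is to run the same refinement on the disjoint union of the graphs so that color ids are comparable across graphs.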
