VC dimension of Graph Neural Networks with Pfaffian activation functions

Giuseppe Alessio D'Inverno, Monica Bianchini, Franco Scarselli

TL;DR

This work addresses the generalization capabilities of Graph Neural Networks when using Pfaffian activation functions by deriving VC-dimension bounds that depend on architectural factors (depth $L$, hidden size $d$, total parameter count $\bar{p}$) and the number of 1–WL colors. The authors provide a unified bound framework showing the VC dimension scales polynomially with key parameters, notably through terms like $O(\bar{p}^2 H^2)$ with $H = LNd(\ell_{\mathsf{comb}}+\ell_{\mathsf{agg}})+\ell_{\mathsf{read}}$, and they extend color-based refinements to tighten bounds via the WL colors, $C_1$. They validate the theory with preliminary experiments on binary graph-classification datasets using activations such as $\arctan$ and $\tanh$, observing training–test gap behavior consistent with the derived bounds and color-based predictions. Overall, the paper advances understanding of how Pfaffian activations influence GNN generalization and suggests avenues for future work, including lower bounds and extensions to Graph Transformers and related graph models.
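
To get a feel for how the quoted term $\bar{p}^2 H^2$ with $H = LNd(\ell_{\mathsf{comb}}+\ell_{\mathsf{agg}})+\ell_{\mathsf{read}}$ grows, the following minimal sketch evaluates it for a made-up configuration. The function name and all numeric values are hypothetical, $N$ is assumed here to be the number of nodes (the summary does not define it), and constants and lower-order terms are dropped, so the output illustrates the order of growth only and is not a VC-dimension bound.

```python
def pfaffian_bound_term(L, N, d, ell_comb, ell_agg, ell_read, p_bar):
    """Return (H, p_bar**2 * H**2) for the dominant term quoted in the summary.

    Constants and lower-order terms are omitted on purpose: the result
    only shows how the term scales with the architecture parameters.
    """
    H = L * N * d * (ell_comb + ell_agg) + ell_read
    return H, p_bar ** 2 * H ** 2


# Hypothetical configuration: 3 layers, 10 nodes (assumed meaning of N),
# hidden size 16, chain length 1 per component, 500 parameters.
H, term = pfaffian_bound_term(
    L=3, N=10, d=16, ell_comb=1, ell_agg=1, ell_read=1, p_bar=500
)

# Doubling the depth roughly doubles H, hence roughly quadruples the term.
H_deep, term_deep = pfaffian_bound_term(
    L=6, N=10, d=16, ell_comb=1, ell_agg=1, ell_read=1, p_bar=500
)

print(H, term)
print(H_deep, term_deep / term)
```

In this sketch the parameter count is held fixed while the depth changes; in a real network $\bar{p}$ would typically grow with $L$ and $d$ as well, which makes the dependence steeper.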

Abstract

Graph Neural Networks (GNNs) have emerged in recent years as a powerful tool to learn tasks across a wide range of graph domains in a data-driven fashion; based on a message passing mechanism, GNNs have gained increasing popularity due to their intuitive formulation, closely linked with the Weisfeiler-Lehman (WL) test for graph isomorphism, to which they have proven equivalent. From a theoretical point of view, GNNs have been shown to be universal approximators, and their generalization capability (namely, bounds on the Vapnik-Chervonenkis (VC) dimension) has recently been investigated for GNNs with piecewise polynomial activation functions. The aim of our work is to extend this analysis on the VC dimension of GNNs to other commonly used activation functions, such as sigmoid and hyperbolic tangent, using the framework of Pfaffian function theory. Bounds are provided with respect to architecture parameters (depth, number of neurons, input size) as well as with respect to the number of colors resulting from the 1-WL test applied on the graph domain. The theoretical analysis is supported by a preliminary experimental study.
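
Since the bounds also depend on the number of colors produced by the 1-WL test, a minimal sketch of 1-WL color refinement on a single graph may help make that quantity concrete. The function name `wl_colors` and the toy graph are hypothetical, and the way the paper counts colors over a whole graph domain may differ from this single-graph illustration.

```python
def wl_colors(adj, labels):
    """Run 1-WL color refinement and return the stable coloring.

    adj:    dict mapping each node to a list of its neighbours.
    labels: dict mapping each node to an initial integer label.
    """
    # Compress the initial labels into consecutive integer colors.
    palette = {lab: i for i, lab in enumerate(sorted(set(labels.values())))}
    colors = {v: palette[labels[v]] for v in adj}

    # Each round can only split color classes, so at most len(adj) rounds.
    for _ in range(len(adj)):
        # Signature = own color plus the sorted multiset of neighbour colors.
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
        palette = {s: i for i, s in enumerate(sorted(set(signatures.values())))}
        new_colors = {v: palette[signatures[v]] for v in adj}
        # No new color appeared, so the partition is stable.
        if len(set(new_colors.values())) == len(set(colors.values())):
            return new_colors
        colors = new_colors
    return colors


# Toy example: the path graph 0 - 1 - 2 - 3 with uniform initial labels.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
path_colors = wl_colors(path, {v: 0 for v in path})
num_colors = len(set(path_colors.values()))
print(path_colors, num_colors)
```

On the path graph the two endpoints end up with one color and the two inner nodes with another, so the refinement yields two colors even though the graph has four nodes.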

Paper Structure (26 sections, 8 theorems, 32 equations, 7 figures, 3 tables)

This paper contains 26 sections, 8 theorems, 32 equations, 7 figures, 3 tables.

Key Result

Theorem 1

Let us consider the GNN model described by Eq. def:gnn_upd. If $\mathsf{COMBINE}^{(t)}$, $\mathsf{AGGREGATE}^{(t)}$ and $\mathsf{READOUT}$ are Pfaffian functions with format $(\alpha_{\mathsf{comb}}, \beta_{\mathsf{comb}}, \ell_{\mathsf{comb}})$, $(\alpha_{\mathsf{agg}}, \beta_{\mathsf{agg}}, \ell_{\mathsf{agg}})$ [...] where $B \leq 2^{\frac{\bar{\ell}(\bar{\ell}-1)}{2}+1}(\bar{\alpha} + 2\bar{\beta} - 1)^{\bar{p}-1}$ [...]
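
The Pfaffian assumption in the theorem can be illustrated on the activations named in the abstract. A function belongs to a Pfaffian chain when its derivative is a polynomial in the input and the chain functions themselves. For $\tanh$ and the sigmoid $\sigma$, the standard identities $\tanh' = 1 - \tanh^2$ and $\sigma' = \sigma(1 - \sigma)$ give degree-2 polynomials over a chain of length 1. The sketch below checks both identities numerically; the helper names are hypothetical, and reading the format triple as (chain degree, function degree, chain length) is the usual convention and an assumption about the paper's notation.

```python
import math


def numerical_derivative(f, x, h=1e-6):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def max_chain_residual(f, poly, points):
    """Largest |f'(x) - poly(f(x))| over the sample points.

    A residual near zero is consistent with f' being the polynomial
    `poly` evaluated at f, which is the Pfaffian chain condition for a
    chain of length 1.
    """
    return max(abs(numerical_derivative(f, x) - poly(f(x))) for x in points)


points = [i / 10 for i in range(-30, 31)]

# tanh' = 1 - tanh^2 (polynomial of degree 2 in the chain function)
tanh_residual = max_chain_residual(math.tanh, lambda y: 1 - y * y, points)

# sigmoid' = sigmoid * (1 - sigmoid) (also degree 2)
sigmoid_residual = max_chain_residual(sigmoid, lambda y: y * (1 - y), points)

print(tanh_residual, sigmoid_residual)
```

Both residuals should be at the level of finite-difference error. By contrast, a piecewise polynomial activation such as ReLU is handled by the earlier analyses mentioned in the abstract rather than by this chain condition.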

Figures (7)

  • Figure 1: Results on the task E1 for GNNs with activation function $\mathsf{arctan}$.
  • Figure 2: Results on the task E1 for GNNs with activation function $\mathsf{tanh}$.
  • Figure 3: Results on the task E2 for GNNs with activation function $\mathsf{tanh}$.
  • Figure 4: Results on the task E1 for GNNs with activation function $\mathsf{atan}$ over the dataset PROTEINS.
  • Figure 5: Results on the task E1 for GNNs with activation function $\mathsf{atan}$ over the dataset PTC-MR.
  • ...and 2 more figures

Theorems & Definitions (11)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Theorem 6: gabrielov2004complexity
  • Lemma 7
  • proof
  • proof
  • Lemma 8
  • ...and 1 more