On the Convergence and Size Transferability of Continuous-depth Graph Neural Networks

Mingsong Yan, Charles Kulick, Sui Tang

TL;DR

This work analyzes continuous-depth graph neural networks, also called Graph Neural Differential Equations (GNDEs), by introducing Graphon-NDEs as their infinite-node limit. It proves well-posedness of Graphon-NDEs and trajectory-wise convergence of GNDE solutions to Graphon-NDE solutions as the underlying graphs converge to a graphon. It derives explicit convergence rates for weighted graphs sampled from smooth graphons and for unweighted graphs sampled from discontinuous graphons, and it establishes size-transferability bounds that justify deploying GNDEs trained on moderate-sized graphs on larger, structurally similar graphs. The results connect graphon theory with dynamical systems to provide uniform-in-time trajectory convergence and two-scale convergence for discretized GNDEs, with numerical experiments validating the theory on synthetic graphons and real-world node classification tasks. The findings support scalable, transfer-friendly GNDEs in AI-for-Science contexts and offer principled guidance for model design and solver choice on large graphs. The work also discusses practical computational costs and limitations, and outlines directions for extending the framework to other GNN architectures and stochastic graphon models.

Abstract

Continuous-depth graph neural networks, also known as Graph Neural Differential Equations (GNDEs), combine the structural inductive bias of Graph Neural Networks (GNNs) with the continuous-depth architecture of Neural ODEs, offering a scalable and principled framework for modeling dynamics on graphs. In this paper, we present a rigorous convergence analysis of GNDEs with time-varying parameters in the infinite-node limit, providing theoretical insights into their size transferability. To this end, we introduce Graphon Neural Differential Equations (Graphon-NDEs) as the infinite-node limit of GNDEs and establish their well-posedness. Leveraging tools from graphon theory and dynamical systems, we prove the trajectory-wise convergence of GNDE solutions to Graphon-NDE solutions. Moreover, we derive explicit convergence rates under two deterministic graph sampling regimes: (1) weighted graphs sampled from smooth graphons, and (2) unweighted graphs sampled from $\{0,1\}$-valued (discontinuous) graphons. We further establish size transferability bounds, providing theoretical justification for the practical strategy of transferring GNDE models trained on moderate-sized graphs to larger, structurally similar graphs without retraining. Numerical experiments using synthetic and real data support our theoretical findings.
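The infinite-node limit described in the abstract can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the paper's model: the graphon `graphon`, the scalar time-varying parameter `theta`, the dynamics $\dot{x} = \tanh(\theta(t)\, W x / n)$, the forward Euler solver, and all function names are hypothetical choices made for illustration. It samples weighted graphs deterministically from a Lipschitz graphon, solves the assumed GNDE on each, and compares the induced step functions against a large-graph proxy for the limit.

```python
import numpy as np

def graphon(u, v):
    # Hypothetical Lipschitz graphon chosen for illustration (not taken from the paper).
    return 1.0 - np.abs(u - v)

def sample_weighted_graph(n):
    # Deterministic sampling: node i sits at the midpoint of the i-th cell of [0, 1].
    u = (np.arange(n) + 0.5) / n
    return u, graphon(u[:, None], u[None, :])

def theta(t):
    # Hypothetical time-varying scalar parameter.
    return 1.0 + 0.5 * np.sin(2.0 * np.pi * t)

def solve_gnde(n, t_end=1.0, steps=200):
    # Forward Euler for the assumed dynamics dx/dt = tanh(theta(t) * (W x) / n).
    u, W = sample_weighted_graph(n)
    x = np.sin(2.0 * np.pi * u)  # initial node features sampled from a smooth function
    dt = t_end / steps
    for k in range(steps):
        x = x + dt * np.tanh(theta(k * dt) * (W @ x) / n)
    return x

def as_step_function(x, grid):
    # Evaluate the piecewise-constant function induced by the node values x on `grid`.
    n = x.shape[0]
    idx = np.minimum((grid * n).astype(int), n - 1)
    return x[idx]

def error_vs_reference(n, n_ref=800, m=1600):
    # Discrete L2 distance between the size-n solution and a large-graph proxy for the limit.
    grid = (np.arange(m) + 0.5) / m
    diff = as_step_function(solve_gnde(n), grid) - as_step_function(solve_gnde(n_ref), grid)
    return float(np.sqrt(np.mean(diff ** 2)))

if __name__ == "__main__":
    for n in (50, 100, 200):
        print(n, error_vs_reference(n))
```

Because the same parameter function `theta` is reused unchanged across graph sizes, the shrinking distance to the large-graph reference is also a small-scale picture of what size transferability means here: the model does not need to be refit when the graph grows.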

Paper Structure

This paper contains 53 sections, 12 theorems, 69 equations, 8 figures, 6 tables.

Key Result

Lemma 1

Let $\{\mathcal{G}_n\}$ be a sequence of graphs with adjacency matrices $\{{\bm{W}}_{\mathcal{G}_n}\}$. Suppose that $\{\mathcal{G}_n\}$ converges to a graphon ${\bm{\mathsfit{W}}}$ in the sense of homomorphism density. Then, there exists a sequence $\{\pi_n\}$ of permutations such that $\lim_{n\to\infty} \ldots$

Figures (8)

  • Figure 1: Infinite-node Limits of GNDEs
  • Figure 2: Hölder Tent (left), Tent (center-left), HSBM (center-right), and Hexaflake (right) graphon visualizations.
  • Figure 3: Convergence rates of GNDE solutions. Mean relative errors between GNDE and Graphon-NDE solutions on graphs sampled from four graphons: (1) the Tent graphon (Lipschitz), matching the $\mathcal{O}(1/n)$ rate in Theorem 5; (2) the HSBM graphon (box-counting dimension 1); (3) the Hexaflake graphon (fractal boundary with box-counting dimension 1.77); and (4) the Hölder Tent graphon, which is Hölder-$\tfrac{1}{2}$ and exhibits a rate near $\mathcal{O}(1/n^{0.5})$, as expected from Theorem 5. The HSBM graphon yields faster convergence than the Hexaflake graphon, consistent with the trend indicated in Theorem 6. Convergence plots with error bars are given in the appendix on additional graphons.
  • Figure 4: Node classification experiment results, with two plots for each dataset. Left: Subgraph test accuracy (STA) and full graph test accuracy (FTA). Right: Transfer error (TE) and graphon error (GE).
  • Figure 5: Oscillatory Lipschitz (left), Checkerboard (center), and Sierpinski (right) graphon visualizations.
  • ...and 3 more figures
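The rates quoted in the Figure 3 caption, such as $\mathcal{O}(1/n)$ and $\mathcal{O}(1/n^{0.5})$, are typically read off a log-log plot of error against graph size. The following sketch shows one common way to estimate such an exponent by least squares. The helper `fit_rate` is a hypothetical name, and the error values are synthetic placeholders generated from exact power laws, not measurements from the paper.

```python
import numpy as np

def fit_rate(sizes, errors):
    # Least-squares slope of log(error) against log(n).
    # A fitted slope of -p suggests an empirical rate of O(1/n^p).
    slope, _intercept = np.polyfit(np.log(sizes), np.log(errors), 1)
    return -slope

sizes = np.array([100, 200, 400, 800, 1600], dtype=float)

# Synthetic errors for illustration only; these are not results from the paper.
lipschitz_like = 3.0 / sizes             # exact 1/n decay
holder_half_like = 3.0 / np.sqrt(sizes)  # exact 1/sqrt(n) decay

print(fit_rate(sizes, lipschitz_like))
print(fit_rate(sizes, holder_half_like))
```

With measured errors the fitted exponent will only approximate the theoretical one, since constants and pre-asymptotic effects distort the slope at small $n$.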

Theorems & Definitions (21)

  • Lemma 1
  • Definition 2
  • Theorem 3: Well-posedness (proof in the appendix on well-posedness)
  • Theorem 4: Trajectory-wise convergence (proof in the appendix on convergence of Graphon-NDEs)
  • Theorem 5: Rates for weighted graphs (proof in the appendix on convergence rates)
  • Theorem 6: Rates for unweighted graphs (proof in the appendix on convergence rates)
  • Lemma 7
  • proof
  • Proposition 8
  • proof
  • ...and 11 more