Future Directions in the Theory of Graph Machine Learning

Christopher Morris, Fabrizio Frasca, Nadav Dym, Haggai Maron, İsmail İlkan Ceylan, Ron Levie, Derek Lim, Michael Bronstein, Martin Grohe, Stefanie Jegelka

TL;DR

The paper argues for a balanced theory of graph machine learning that goes beyond coarse combinatorial expressivity (e.g., $1$-WL) to incorporate geometry, generalization, and optimization, all aligned with practical applications. It outlines a comprehensive program of challenges across four pillars: expressive power (II.1–II.6), generalization (III.1–III.4), optimization dynamics (IV.1–IV.5), and practice alignment (V.1–V.5), advocating geometry-based expressivity, uniform bounds, graph-class awareness, and domain-adaptive architectures. A core idea is to develop a metric-based relationship between graph space and GNN feature space (e.g., a bi-Lipschitz correspondence between $d_G$ and $d_F$) to enable finer analyses and universal approximation statements, while also studying data augmentation, extrapolation, and optimization under realistic conditions. The authors propose concrete action items, including a Theo-practical Dojo, a library of theoretically guided implementations, domain-adapted architectures, and principled integration of LLMs, to translate theory into practice and accelerate impact in domains such as molecular design and combinatorial optimization.
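
As a hedged illustration of the bi-Lipschitz correspondence mentioned above: writing $f$ for the map from graphs to GNN features and $c$, $C$ for positive constants (notation assumed here for illustration, not taken from the summary), such a condition would take the standard form

$$
c \, d_G(G, H) \;\le\; d_F\bigl(f(G), f(H)\bigr) \;\le\; C \, d_G(G, H) \quad \text{for all graphs } G, H, \qquad 0 < c \le C < \infty.
$$

The upper bound expresses stability: graphs that are close in $d_G$ receive close features. The lower bound expresses separation: features differ at least in proportion to the distance between graphs, which refines the binary distinguishes-or-not view behind $1$-WL-style expressivity results.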

Abstract

Machine learning on graphs, especially using graph neural networks (GNNs), has seen a surge in interest due to the wide availability of graph data across a broad spectrum of disciplines, from life to social and engineering sciences. Despite their practical success, our theoretical understanding of the properties of GNNs remains highly incomplete. Recent theoretical advancements primarily focus on elucidating the coarse-grained expressive power of GNNs, predominantly employing combinatorial techniques. However, these studies do not perfectly align with practice, particularly in understanding the generalization behavior of GNNs when trained with stochastic first-order optimization techniques. In this position paper, we argue that the graph machine learning community needs to shift its attention to developing a balanced theory of graph machine learning, focusing on a more thorough understanding of the interplay of expressive power, generalization, and optimization.

Paper Structure

This paper contains 30 sections, 2 figures.

Figures (2)

  • Figure 1: The four challenges within graph machine learning (fine-grained expressivity, generalization, optimization, and applications) and their interactions. The green boxes, namely architectural choices (hyperparameters and other design choices such as normalization layers), model parameters, and graph classes (different types of graphs), represent aspects of all four challenges.
  • Figure 2: Proposal for a better alignment of theoretical and practical research within the graph machine learning community. We propose the tight interaction and iterative refinement of mathematical models and architectural choices via rigorous experimental evaluations supported by state-of-the-art baseline implementations, benchmarks, evaluation pipelines, and visual exploration tools.