Foundations and Frontiers of Graph Learning Theory
Yu Huang, Min Zhou, Menglin Yang, Zhen Wang, Muhan Zhang, Jie Wang, Hong Xie, Hao Wang, Defu Lian, Enhong Chen
TL;DR
This paper surveys the theoretical foundations of graph learning, focusing on three core pillars (expressive power, generalization, and optimization) and on the difficulty of capturing long-range interactions caused by over-smoothing and over-squashing. It synthesizes WL-based expressivity, higher-order and subgraph approaches, invariance/equivariance, and connections to combinatorial problems, while outlining generalization bounds (VC dimension, Rademacher complexity, PAC-Bayes, stability, GNTK) and optimization dynamics (NTK regime, initialization, normalization, sampling). The work also discusses practical strategies to mitigate deep-GNN pathologies (skip connections, ODE-based models, graph rewiring) and outlines open questions linking theory to real-world graph tasks. Overall, it provides a structured overview of graph learning theory with clear directions toward more powerful, generalizable, and scalable graph models, including graph transformers and geometry-aware architectures.
Abstract
Recent advances in graph learning have transformed how data with complex structure are understood and analyzed. Notably, Graph Neural Networks (GNNs), i.e., neural network architectures designed for learning graph representations, have become a popular paradigm. Because these models are often characterized by intuition-driven design or highly intricate components, placing them within a theoretical analysis framework distills their core concepts, clarifies the key principles that drive their functionality, and guides further development. Given this surge in interest, this article provides a comprehensive summary of the theoretical foundations and breakthroughs concerning the approximation and learning behaviors intrinsic to prevalent graph learning models. Covering fundamental aspects such as expressive power, generalization, and optimization, as well as unique phenomena such as over-smoothing and over-squashing, it examines the theoretical foundations and frontiers driving the evolution of graph learning. In addition, it presents several open challenges and initiates discussion of possible solutions.
