Which Algorithms Can Graph Neural Networks Learn?
Solveig Wittig, Antonis Vasileiou, Robert R. Nerem, Timo Stoll, Floris Geerts, Yusu Wang, Christopher Morris
TL;DR
This work provides a principled theory for when graph neural networks can learn and generalize discrete graph algorithms. By linking algorithmic invariants to Lipschitz properties, covering numbers, and differentiable regularization, the authors characterize which algorithms can be learned from finite data and extrapolated to arbitrarily large graphs, including single-source shortest paths (SSSP), minimum spanning trees (MST), and dynamic programming (DP) problems such as the $0$-$1$ knapsack problem. They show both positive results for learnable classes (via normalized sum, mean, and min/max aggregations) and impossibility results for standard MPNNs on certain tasks, and they propose more expressive architectures that overcome these limits. A key advance is a differentiable regularization approach that tightens the Bellman-Ford extrapolation analysis and yields explicit, small training sets with provable size generalization guarantees. The empirical study largely supports the theory, showing practical gains in size generalization and the benefit of the proposed regularization when learning graph algorithms from limited data.
Abstract
In recent years, there has been growing interest in understanding neural architectures' ability to learn to execute discrete algorithms, a line of work often referred to as neural algorithmic reasoning. The goal is to integrate algorithmic reasoning capabilities into larger neural pipelines. Many such architectures are based on (message-passing) graph neural networks (MPNNs), owing to their permutation equivariance and ability to deal with sparsity and variable-sized inputs. However, existing work is either largely empirical, lacking formal guarantees, or focused solely on expressivity, leaving open the question of when and how such architectures generalize beyond a finite training set. In this work, we propose a general theoretical framework that characterizes sufficient conditions under which MPNNs can learn an algorithm from a training set of small instances and provably approximate its behavior on inputs of arbitrary size. Our framework applies to a broad class of algorithms, including single-source shortest paths, minimum spanning trees, and general dynamic programming problems, such as the $0$-$1$ knapsack problem. In addition, we establish impossibility results for a wide range of algorithmic tasks, showing that standard MPNNs cannot learn them, and we derive more expressive MPNN-like architectures that overcome these limitations. Finally, we refine our analysis for the Bellman-Ford algorithm, yielding a substantially smaller required training set and significantly extending the recent work of Nerem et al. [2025] by allowing for a differentiable regularization loss. Empirical results largely support our theoretical findings.
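
To make the link between message passing and the Bellman-Ford algorithm concrete, the following is a minimal illustrative sketch, not the architecture or training procedure studied in this work. It writes the classical Bellman-Ford update as rounds of synchronous min-aggregation over incoming edges, which is the kind of computation a min-aggregation MPNN layer would need to imitate. The function name `bellman_ford_message_passing` and the edge-list format are assumptions made for this example, and the graph is assumed to contain no negative cycles.

```python
import math


def bellman_ford_message_passing(num_nodes, edges, source):
    """Single-source shortest paths via rounds of min-aggregation.

    edges: list of directed (u, v, w) triples, meaning an edge u -> v of weight w.
    Returns a list of distances from `source`; unreachable nodes stay at infinity.
    """
    # Node "features": current distance estimates.
    dist = [math.inf] * num_nodes
    dist[source] = 0.0

    # At most num_nodes - 1 rounds are needed when there are no negative cycles.
    for _ in range(num_nodes - 1):
        new_dist = list(dist)
        for u, v, w in edges:
            # Message from u to v is dist[u] + w; v aggregates messages with min.
            new_dist[v] = min(new_dist[v], dist[u] + w)
        if new_dist == dist:
            break  # Fixed point reached: further rounds change nothing.
        dist = new_dist
    return dist


example_edges = [(0, 1, 4.0), (0, 2, 1.0), (2, 1, 2.0), (1, 3, 1.0)]
distances = bellman_ford_message_passing(4, example_edges, source=0)
print(distances)
```

Each round reads only the previous round's estimates, so after k rounds every node holds the length of its shortest path using at most k edges. This mirrors how k message-passing layers give each node access to its k-hop neighborhood.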
