Counting Substructures with Higher-Order Graph Neural Networks: Possibility and Impossibility Results
Behrooz Tahmasebi, Derek Lim, Stefanie Jegelka
TL;DR
This work addresses the gap between standard message passing neural networks (MPNNs), which have limited expressive power, and higher-order GNNs, which are more expressive but computationally costly, by introducing Recursive Neighborhood Pooling (RNP-GNN). RNP-GNN recursively pools encodings of derived subgraphs, guided by covering sequences, to count subgraphs of size $k$ while exploiting graph sparsity to reduce computational cost. The authors prove that RNP-GNNs can count any specified set of substructures, establish a universal approximation result for local graph functions, and derive an information-theoretic lower bound as well as time-complexity lower bounds based on the Exponential Time Hypothesis (ETH) that put the approach in context. Experiments on counting induced triangles and non-induced $3$-stars, and on a satisfiability task, show performance competitive with or better than the baselines, which suggests that sparsity-aware recursive pooling is useful in practice.
Abstract
While message passing Graph Neural Networks (GNNs) have become increasingly popular architectures for learning with graphs, recent works have revealed important shortcomings in their expressive power. In response, several higher-order GNNs have been proposed that substantially increase the expressive power, albeit at a large computational cost. Motivated by this gap, we explore alternative strategies and lower bounds. In particular, we analyze a new recursive pooling technique for local neighborhoods that allows different tradeoffs between computational cost and expressive power. First, we prove that this model can count subgraphs of size $k$, and thereby overcomes a known limitation of low-order GNNs. Second, we show how recursive pooling can exploit sparsity to reduce the computational complexity compared to existing higher-order GNNs. More generally, we provide a (near) matching information-theoretic lower bound for counting subgraphs with graph representations that pool over representations of derived (sub-)graphs. We also discuss lower bounds on time complexity.
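
To make the counting tasks concrete, the following is a minimal Python sketch of what "counting triangles" and "counting non-induced $3$-stars" mean on a small graph. It is not the RNP-GNN architecture and involves no learning: `count_triangles_by_local_pooling` is a hand-coded combinatorial analogue of pooling over local neighborhoods, and all function names and the example graph are hypothetical, chosen only for illustration.

```python
from itertools import combinations
from math import comb


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set of neighbors}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def count_triangles_by_local_pooling(adj):
    """Count triangles by pooling over local neighborhoods.

    Outer step: for each node v, restrict attention to its 1-hop
    neighborhood with v itself removed.
    Inner step: each neighbor u reports its degree inside that
    neighborhood; summing and halving gives the number of edges there,
    which equals the number of triangles containing v.
    """
    total = 0
    for v, nbrs in adj.items():
        inner = sum(len(adj[u] & nbrs) for u in nbrs)
        total += inner // 2  # each edge is reported by both endpoints
    return total // 3  # each triangle is seen from its three nodes


def count_triangles_brute_force(adj):
    """Reference count: test every node triple (cubic in the node count)."""
    return sum(
        1
        for a, b, c in combinations(sorted(adj), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )


def count_noninduced_3_stars(adj):
    """A non-induced 3-star is a center plus any 3 of its neighbors;
    edges among the leaves are allowed."""
    return sum(comb(len(nbrs), 3) for nbrs in adj.values())


# Hypothetical example: a triangle {0, 1, 2} with two extra leaves on node 2.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4)]
adj = build_adjacency(edges)
print(count_triangles_by_local_pooling(adj))  # 1
print(count_triangles_brute_force(adj))       # 1
print(count_noninduced_3_stars(adj))          # 4
```

The local version only inspects each node's neighborhood, so its work is governed by node degrees rather than by the number of all node triples. This gives an informal intuition for how neighborhood-based pooling can benefit from sparsity; it makes no claim about the specific complexity bounds proved in the paper.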
