Homomorphism Counts for Graph Neural Networks: All About That Basis
Emily Jin, Michael Bronstein, İsmail İlkan Ceylan, Matthias Lanzinger
TL;DR
The paper addresses the expressivity gap in message-passing GNNs, whose distinguishing power is bounded by the $1$-WL test, by introducing the homomorphism basis $\mathbb{B}_\Gamma$ of a graph motif parameter $\Gamma$. It proves that injecting the homomorphism counts of all graphs in the basis yields strictly more expressive models than injecting the subgraph count or the homomorphism count of the target pattern alone, without increasing asymptotic computational cost relative to those approaches. The authors develop a practical two-step workflow: compute the basis and its coefficients once, then count $\mathsf{Hom}(F, G)$ for each basis graph $F$, with node-level and graph-level variants. They validate the approach on ZINC, QM9, COLLAB, and BREC and report performance gains. The work offers a principled, scalable route to targeted expressivity in GNNs and, by linking graph motif parameters to homomorphism counts, extends their use beyond simple pattern counting.
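To make the two-step workflow concrete, here is a minimal sketch, not the authors' implementation. It uses the standard identity that the number of subgraph copies of the 3-vertex path $P_3$ in $G$ equals $\tfrac{1}{2}\mathsf{Hom}(P_3, G) - \tfrac{1}{2}\mathsf{Hom}(K_2, G)$, so the basis for this parameter is $\{P_3, K_2\}$. The helper names (`to_adj`, `hom_count`, `direct_p3_count`) and the small host graph are illustrative assumptions, and the brute-force counter is only meant for tiny patterns and graphs.

```python
from itertools import product
from math import comb


def to_adj(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def hom_count(pattern_edges, pattern_n, graph_adj):
    """Brute-force Hom(F, G): count maps V(F) -> V(G) that send every edge of F to an edge of G.

    The pattern F has vertices 0..pattern_n-1. Runtime is |V(G)|**pattern_n,
    so this is a didactic sketch rather than a scalable algorithm.
    """
    nodes = list(graph_adj)
    count = 0
    for image in product(nodes, repeat=pattern_n):
        if all(image[v] in graph_adj[image[u]] for u, v in pattern_edges):
            count += 1
    return count


def direct_p3_count(graph_adj):
    """Reference count of P3 subgraphs: choose 2 neighbours around each centre vertex."""
    return sum(comb(len(nbrs), 2) for nbrs in graph_adj.values())


# Hypothetical host graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2.
G = to_adj([(0, 1), (1, 2), (0, 2), (2, 3)])

# Step 1 (done once per pattern): basis graphs and their coefficients.
K2 = ([(0, 1)], 2)          # single edge
P3 = ([(0, 1), (1, 2)], 3)  # path on three vertices, centre is vertex 1
coefficients = {"P3": 0.5, "K2": -0.5}

# Step 2 (per input graph): homomorphism counts of every basis graph.
hom_k2 = hom_count(*K2, G)
hom_p3 = hom_count(*P3, G)

# Recover the subgraph count as the linear combination over the basis.
sub_p3 = coefficients["P3"] * hom_p3 + coefficients["K2"] * hom_k2
print(hom_k2, hom_p3, sub_p3, direct_p3_count(G))
```

The individual values `hom_p3` and `hom_k2` carry more information than their combination `sub_p3`, which is the intuition behind feeding every basis count to the model rather than only the final pattern count.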
Abstract
A large body of work has investigated the properties of graph neural networks and identified several limitations, particularly pertaining to their expressive power. Their inability to count certain patterns (e.g., cycles) in a graph lies at the heart of such limitations, since many functions to be learned rely on the ability to count such patterns. Two prominent paradigms aim to address this limitation by enriching the graph features with subgraph or homomorphism pattern counts. In this work, we show that both of these approaches are sub-optimal in a certain sense and argue for a more fine-grained approach, which incorporates the homomorphism counts of all structures in the "basis" of the target pattern. This yields strictly more expressive architectures without incurring any additional overhead in terms of computational complexity compared to existing approaches. We prove a series of theoretical results on node-level and graph-level motif parameters and empirically validate them on standard benchmark datasets.
