Connectivity-Guided Sparsification of 2-FWL GNNs: Preserving Full Expressivity with Improved Efficiency
Rongqin Chen, Fan Mo, Pak Lon Ip, Shenghui Zhang, Dan Wu, Ye Li, Leong Hou U
TL;DR
This work addresses the tension between expressivity and efficiency in higher-order GNNs by focusing on 2-FWL models and introducing Co-Sparsify, a connectivity-guided sparsification that preserves full 2-FWL expressivity. By restricting 3-node interactions to biconnected components and 2-node interactions to connected components, both identified with Tarjan's algorithm and block-cut decompositions, the approach needs only $\mathcal{O}(n+m)$ preprocessing for a graph with $n$ nodes and $m$ edges and yields substantial per-layer efficiency gains without sampling or approximation. The authors prove that the sparsified models are as expressive as 2-FWL under injective aggregation and report empirical gains on substructure counting and real-world benchmarks (e.g., ZINC, QM9), achieving state-of-the-art results with reduced resources. They also discuss limitations, notably on long-range tasks, and propose extensions such as distance-aware sparsification and adaptive receptive fields to balance generalization and expressivity in scalable GNNs.
Abstract
Higher-order Graph Neural Networks (HOGNNs) based on the 2-FWL test achieve superior expressivity by modeling 2- and 3-node interactions, but at $\mathcal{O}(n^3)$ computational cost. Existing efficiency methods typically reduce this cost at the expense of expressivity. We propose \textbf{Co-Sparsify}, a connectivity-aware sparsification framework that eliminates \emph{provably redundant} computations while preserving full 2-FWL expressive power. Our key insight is that 3-node interactions are expressively necessary only within \emph{biconnected components}, i.e., maximal subgraphs in which every pair of nodes lies on a common cycle. Outside these components, structural relationships can be fully captured via 2-node message passing or global readout, rendering higher-order modeling unnecessary. Co-Sparsify restricts 2-node message passing to connected components and 3-node interactions to biconnected components, removing computation without approximation or sampling. We prove that Co-Sparsified GNNs are as expressive as the 2-FWL test. Empirically, on PPGN, Co-Sparsify matches or exceeds accuracy on synthetic substructure counting tasks and achieves state-of-the-art performance on real-world benchmarks (ZINC, QM9). This study demonstrates that high expressivity and scalability are not mutually exclusive: principled, topology-guided sparsification enables powerful, efficient GNNs with theoretical guarantees.
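The restriction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`biconnected_components`, `connected_components`, `retained_interactions`) are hypothetical, the graph is assumed to be simple and undirected with nodes labeled 0..n-1, and the retained sets are enumerated explicitly only to make the counts visible on a toy graph. The block decomposition uses a standard iterative Hopcroft-Tarjan DFS, which runs in time linear in the number of nodes and edges.

```python
from collections import defaultdict

def _adjacency(edges):
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj

def connected_components(n, edges):
    """Connected components (as node sets) via an iterative graph search."""
    adj, seen, comps = _adjacency(edges), set(), []
    for s in range(n):
        if s in seen:
            continue
        seen.add(s)
        comp, frontier = {s}, [s]
        while frontier:
            u = frontier.pop()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    comp.add(v)
                    frontier.append(v)
        comps.append(comp)
    return comps

def biconnected_components(n, edges):
    """Biconnected components (blocks) via iterative Hopcroft-Tarjan DFS.
    Assumes a simple graph (no self-loops or parallel edges)."""
    adj, disc, low = _adjacency(edges), {}, {}
    blocks, edge_stack, timer = [], [], 0
    for root in range(n):
        if root in disc:
            continue
        disc[root] = low[root] = timer
        timer += 1
        stack = [(root, -1, iter(adj[root]))]
        while stack:
            u, parent, neighbors = stack[-1]
            descended = False
            for v in neighbors:
                if v == parent:
                    continue
                if v not in disc:              # tree edge: descend into v
                    edge_stack.append((u, v))
                    disc[v] = low[v] = timer
                    timer += 1
                    stack.append((v, u, iter(adj[v])))
                    descended = True
                    break
                if disc[v] < disc[u]:          # back edge to an ancestor
                    edge_stack.append((u, v))
                    low[u] = min(low[u], disc[v])
            if descended:
                continue
            stack.pop()
            if stack:
                p = stack[-1][0]
                low[p] = min(low[p], low[u])
                if low[u] >= disc[p]:          # p separates u's subtree: emit a block
                    block = set()
                    while True:
                        a, b = edge_stack.pop()
                        block.update((a, b))
                        if (a, b) == (p, u):
                            break
                    blocks.append(block)
    return blocks

def retained_interactions(n, edges):
    """Illustrative only: ordered pairs inside a connected component and
    ordered triples inside a single biconnected component."""
    pairs = {(u, v) for c in connected_components(n, edges) for u in c for v in c}
    triples = {(u, w, v) for b in biconnected_components(n, edges)
               for u in b for w in b for v in b}
    return pairs, triples

# Toy graph: a triangle 0-1-2, a bridge 2-3, and an isolated node 4.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
pairs, triples = retained_interactions(5, edges)
print(len(pairs), "of", 5 ** 2, "pairs;", len(triples), "of", 5 ** 3, "triples")
```

On this toy graph the blocks are {0, 1, 2} and {2, 3}, so most of the 125 ordered triples fall outside every block. A practical model would presumably store these index sets as sparse tensors rather than Python sets; that detail is an assumption here, not something the abstract specifies.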
