Boosting Graph Neural Network Expressivity with Learnable Lanczos Constraints
Niloofar Azizi, Nils Kriege, Horst Bischof
TL;DR
The paper tackles the limited expressivity of standard GNNs, which are bounded by $1$-WL and struggle with link prediction (LP) due to automorphic nodes. It proposes LLwLC, a Learnable Lanczos algorithm with Linear Constraints that injects linear constraints, derived from induced subgraphs, into the Laplacian eigenbasis to learn richer spectral features; it is implemented via a two-loop Lanczos process and a constrained least-squares (LS) solver. Two constraint policies are explored: Neumann eigenvalue constraints for link representations, and vertex-deleted subgraph constraints for distinguishing graphs with identical WL signatures. Stochastic constraint selection keeps computation practical. Empirically, LLwLCNet achieves state-of-the-art or competitive LP performance on benchmarks with far fewer parameters and less training data, and delivers substantial speedups, supporting both its theoretical claims on enhanced expressivity and its practical utility.
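To make the idea of a Lanczos process under linear constraints concrete, here is a minimal NumPy sketch. It is not the authors' learnable implementation: it is a plain projected Lanczos iteration in which every vector is kept in the subspace $\{x : C^\top x = 0\}$ by a least-squares projection. The function names `project` and `constrained_lanczos`, the path-graph Laplacian, and the all-ones constraint matrix `C` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def project(C, y):
    """Remove the component of y lying in range(C), so C.T @ result is ~0."""
    # Least-squares fit of y by the columns of C, then subtract that fit.
    z, *_ = np.linalg.lstsq(C, y, rcond=None)
    return y - C @ z

def constrained_lanczos(L, C, k, seed=0):
    """Ritz values of symmetric L restricted to {x : C.T @ x = 0} (sketch)."""
    rng = np.random.default_rng(seed)
    q = project(C, rng.standard_normal(L.shape[0]))
    Q = [q / np.linalg.norm(q)]          # Lanczos basis, all inside the constraint subspace
    alphas, betas = [], []
    for j in range(k):
        w = project(C, L @ Q[j])         # apply L, then project back onto the constraints
        alphas.append(Q[j] @ w)
        for v in Q:                      # full reorthogonalization for numerical stability
            w = w - (v @ w) * v
        beta = np.linalg.norm(w)
        if j == k - 1 or beta < 1e-10:   # stop after k steps or on an invariant subspace
            break
        betas.append(beta)
        Q.append(w / beta)
    # Tridiagonal matrix whose eigenvalues approximate the constrained spectrum.
    T = np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)
    return np.linalg.eigvalsh(T)

# Illustrative input (assumption): Laplacian of a 6-node path graph.
n = 6
A = np.diag(np.ones(n - 1), 1)
A = A + A.T
L = np.diag(A.sum(axis=1)) - A
C = np.ones((n, 1))                      # constraint: x orthogonal to the all-ones vector
ritz = constrained_lanczos(L, C, k=n - 1)
```

With this particular constraint, the all-ones vector is the trivial Laplacian eigenvector, so running `n - 1` steps should recover the nonzero Laplacian eigenvalues. The constraints described in the paper are instead derived from induced subgraphs.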
Abstract
Graph Neural Networks (GNNs) excel in handling graph-structured data but often underperform in link prediction tasks compared to classical methods, mainly due to the limitations of the commonly used message-passing principle. Notably, their ability to distinguish non-isomorphic graphs is limited by the 1-dimensional Weisfeiler-Lehman (1-WL) test. Our study presents a novel method to enhance the expressivity of GNNs by embedding induced subgraphs into the eigenbasis of the graph Laplacian matrix. We introduce a Learnable Lanczos algorithm with Linear Constraints (LLwLC) and propose two novel subgraph extraction strategies: encoding vertex-deleted subgraphs and applying Neumann eigenvalue constraints. For the former, we demonstrate the ability to distinguish graphs that are indistinguishable by 2-WL, while maintaining efficient time complexity. The latter focuses on link representations, enabling differentiation between $k$-regular graphs and handling of node automorphism, a vital aspect of link prediction tasks. Our approach results in an extremely lightweight architecture, reducing the need for extensive training datasets. Empirically, our method improves performance in challenging link prediction tasks across benchmark datasets, establishing its practical utility and supporting our theoretical findings. Notably, LLwLC achieves 20x and 10x speedups while requiring only 5% and 10% of the data from the PubMed and OGBL-Vessel datasets, respectively, compared with the state of the art.
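The 1-WL limitation and the difficulty with $k$-regular graphs can be seen in a small, self-contained example. The sketch below runs plain 1-WL color refinement on a hexagon and on two disjoint triangles, both 2-regular and non-isomorphic. The function name `wl_histogram` and the example graphs are illustrative choices, not taken from the paper.

```python
from collections import Counter

def wl_histogram(adj, rounds=3):
    """1-WL color refinement; adj maps each node to a list of its neighbors."""
    colors = {v: () for v in adj}        # every node starts with the same color
    for _ in range(rounds):
        # New color = own color plus the sorted multiset of neighbor colors.
        colors = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
    return Counter(colors.values())      # histogram of final colors

# Two non-isomorphic 2-regular graphs on six nodes.
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}
# A path on six nodes, for contrast: its end nodes have degree 1.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

same_regular = wl_histogram(hexagon) == wl_histogram(two_triangles)
same_path = wl_histogram(hexagon) == wl_histogram(path)
```

Because every node in both regular graphs sees the same neighborhood multiset in every round, refinement never separates them, so `same_regular` is true even though the graphs are not isomorphic. The path graph, by contrast, is separated in the first round.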
