Resolving Node Identifiability in Graph Neural Processes via Laplacian Spectral Encodings
Zimo Yan, Zheng Xie, Chang Liu, Yuan Wang
TL;DR
This work identifies a fundamental expressiveness gap in graph neural processes: standard WL-bounded encoders cannot distinguish symmetric nodes, leading to high Bayes risk. By introducing a sign-/basis-invariant Laplacian spectral positional encoding and an anchor-based diffusion trilateration scheme, the authors prove a sample-complexity separation, achieving constant-shot identifiability on random $r$-regular graphs. They validate the theory with a drug-drug interaction prediction task, showing substantial improvements in AUROC and F1 when using Laplacian PEs, and demonstrate faster convergence in transductive settings. The results bridge expressive power limitations with principled positional information, offering scalable, robust enhancements for probabilistic graph models in real-world applications. The work also outlines future directions for scalable spectral methods and extensions to broader graph types and dynamic networks.
Abstract
Message-passing graph neural networks are widely used for learning on graphs, yet their expressive power is limited by the one-dimensional Weisfeiler-Lehman test, so they can fail to distinguish structurally different nodes. We provide rigorous theory for a Laplacian positional encoding that is invariant to eigenvector sign flips and to basis rotations within eigenspaces. We prove that this encoding yields node identifiability from a constant number of observations and establishes a sample-complexity separation from architectures constrained by the Weisfeiler-Lehman test. The analysis combines a monotone link between shortest-path and diffusion distance, spectral trilateration with a constant set of anchors, and quantitative spectral injectivity with logarithmic embedding size. As an instantiation, pairing this encoding with a neural-process-style decoder yields significant gains on a drug-drug interaction task on chemical graphs, improving both the area under the ROC curve and the F1 score and demonstrating the practical benefits of resolving theoretical expressiveness limitations with principled positional information.
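The following is a minimal illustrative sketch of the two ideas named in the abstract, not the authors' implementation. It assumes a heat-kernel form of diffusion distance, a 6-node cycle as a stand-in for a regular graph, and two arbitrarily chosen anchors; the function names (`cycle_adjacency`, `wl_colors`, `anchor_encoding`) and the diffusion time `t` are hypothetical choices made for this example. It shows that one-dimensional Weisfeiler-Lehman color refinement assigns every node of the cycle the same color, while diffusion distances to the anchors give each node a distinct code that does not change under eigenvector sign flips.

```python
import numpy as np

def cycle_adjacency(n):
    """Adjacency matrix of the n-node cycle, a 2-regular graph."""
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
    return A

def wl_colors(A, rounds=3):
    """One-dimensional Weisfeiler-Lehman color refinement from uniform colors."""
    n = len(A)
    colors = [0] * n
    for _ in range(rounds):
        # Signature = own color plus the sorted multiset of neighbor colors.
        sigs = [
            (colors[i], tuple(sorted(colors[j] for j in range(n) if A[i, j])))
            for i in range(n)
        ]
        palette = {s: k for k, s in enumerate(sorted(set(sigs)))}
        colors = [palette[s] for s in sigs]
    return colors

def anchor_encoding(A, anchors, t=1.0, flip=None):
    """Diffusion distance from every node to each anchor (assumed heat-kernel form).

    `flip` is an optional vector of +1/-1 applied to the eigenvector columns,
    used only to check that the output ignores sign choices.
    """
    L = np.diag(A.sum(axis=1)) - A          # combinatorial Laplacian
    evals, evecs = np.linalg.eigh(L)
    if flip is not None:
        evecs = evecs * flip                # flip the sign of selected eigenvectors
    # Heat kernel H = U exp(-2t * Lambda) U^T depends only on L, so it is
    # unchanged by sign flips and by rotations inside an eigenspace.
    H = (evecs * np.exp(-2.0 * t * evals)) @ evecs.T
    diag = np.diag(H)
    d2 = diag[:, None] + diag[anchors][None, :] - 2.0 * H[:, anchors]
    return np.sqrt(np.clip(d2, 0.0, None))  # clip guards tiny negative round-off

A = cycle_adjacency(6)
print("WL colors:", wl_colors(A))
print("Anchor encoding:")
print(anchor_encoding(A, anchors=[0, 1]))
```

In this toy setting the two anchors play the role of reference points for trilateration: each node is identified by its pair of diffusion distances. Whether two anchors suffice on other graphs is not something this sketch establishes.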
