Large-Scale Spectral Graph Neural Networks via Laplacian Sparsification: Technical Report
Haipeng Ding, Zhewei Wei, Yuhang Ye
TL;DR
This work tackles the scalability barrier of spectral GNNs by introducing SGNN-LS, a Laplacian sparsification framework that approximates the propagation operator L_K=∑_{k=0}^K w_k L^k with a sparse graph while preserving end-to-end training. It provides rigorous guarantees: the sparsifier achieves spectral similarity within a tolerance ε with high probability and uses O(n log n / ε^2) edges, with the propagated signals differing by only O(ε) in the relevant loss. The method supports both static and learnable polynomial coefficients and introduces node-wise sampling for semi-supervised tasks, enabling mini-batch training on very large graphs. Empirical results on large-scale datasets such as Ogbn-papers100M and MAG-scholar-C demonstrate competitive accuracy and substantial memory and time efficiency, confirming the practical impact of the approach and its compatibility with existing scalable GNN strategies.
Abstract
Graph Neural Networks (GNNs) play a pivotal role in graph-based tasks for their proficiency in representation learning. Among the various GNN methods, spectral GNNs employing polynomial filters have shown promising performance on tasks involving both homophilous and heterophilous graph structures. However, the scalability of spectral GNNs on large graphs is limited because they learn the polynomial coefficients through multiple propagation steps in every forward pass. Existing works have attempted to scale up spectral GNNs by eliminating the linear layers on the input node features, a change that can disrupt end-to-end training, potentially impact performance, and become impractical with high-dimensional input features. To address these challenges, we propose "Spectral Graph Neural Networks with Laplacian Sparsification (SGNN-LS)", a novel graph spectral sparsification method to approximate the propagation patterns of spectral GNNs. We prove that our proposed method generates Laplacian sparsifiers that can approximate both fixed and learnable polynomial filters with theoretical guarantees. Our method allows the application of linear layers on the input node features, enabling end-to-end training as well as the handling of raw text features. We conduct an extensive experimental analysis on datasets spanning various graph scales and properties to demonstrate the superior efficiency and effectiveness of our method. The results show that our method outperforms the corresponding approximated base models, especially on the datasets Ogbn-papers100M (111M nodes, 1.6B edges) and MAG-scholar-C (2.8M features).
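To make the scalability issue concrete, the following minimal sketch evaluates a polynomial filter ∑_{k=0}^K w_k L^k X by repeated propagation, which is the computation a sparsifier is meant to approximate with a single sparse product. It is an illustration under stated assumptions, not the SGNN-LS implementation: the function names `normalized_laplacian` and `polynomial_filter` are hypothetical, L is assumed to be the symmetric normalized Laplacian, and dense NumPy arrays stand in for the sparse matrices a large graph would require.

```python
import numpy as np

def normalized_laplacian(adj):
    """Symmetric normalized Laplacian L = I - D^{-1/2} A D^{-1/2} (assumed form of L)."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5  # isolated nodes keep a zero scaling
    return np.eye(adj.shape[0]) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]

def polynomial_filter(lap, x, weights):
    """Return sum_k weights[k] * lap^k @ x using len(weights) - 1 propagations."""
    z = np.asarray(x, dtype=float)  # z holds lap^k @ x, starting at k = 0
    out = np.zeros_like(z)
    for k, w in enumerate(weights):
        if k > 0:
            z = lap @ z             # one more propagation step per polynomial order
        out += w * z
    return out

# Hypothetical toy input: a path graph on 4 nodes with 2 features per node.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
lap = normalized_laplacian(adj)
x = np.arange(8, dtype=float).reshape(4, 2)
weights = [0.5, 0.3, 0.2]           # w_0, w_1, w_2; these would be learned in a learnable filter
filtered = polynomial_filter(lap, x, weights)
```

The loop performs one propagation per polynomial order, and with learnable coefficients it has to be repeated in every forward pass. This is the cost that motivates replacing the polynomial of L with a single sparsified operator.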
