Learning on Large Graphs using Intersecting Communities
Ben Finkelshtein, İsmail İlkan Ceylan, Michael Bronstein, Ron Levie
TL;DR
This work tackles the memory bottleneck of graph neural networks on large graphs by approximating the input graph with an intersecting community graph (ICG), a low-rank structure with a fixed number of communities. It proves a constructive semi-regularity result: fitting a low-rank ICG by minimizing the Frobenius error guarantees a small cut-metric error, with a community count $K=O(\text{poly}(1/\epsilon))$ that is independent of graph size for dense graphs. Learning then proceeds in two stages: offline ICG fitting with gradient-based methods (using Subgraph SGD for scalability), followed by online learning on the ICG and the node signals with novel ICG-NN architectures that run in $O(N)$ time per layer, in contrast to the $O(E)$ of message-passing GNNs (MP-GNNs). Empirically, the approach delivers competitive or state-of-the-art performance on node classification and spatio-temporal tasks while offering substantial runtime and memory benefits, especially on very large graphs. A spatio-temporal extension also supports efficient learning on dynamic graphs, which makes the framework promising for real-world dense networks where traditional GNNs run into memory constraints.
Abstract
Message Passing Neural Networks (MPNNs) are a staple of graph machine learning. MPNNs iteratively update each node's representation in an input graph by aggregating messages from the node's neighbors, which necessitates a memory complexity of the order of the number of graph edges. This complexity might quickly become prohibitive for large graphs provided they are not very sparse. In this paper, we propose a novel approach to alleviate this problem by approximating the input graph as an intersecting community graph (ICG) -- a combination of intersecting cliques. The key insight is that the number of communities required to approximate a graph does not depend on the graph size. We develop a new constructive version of the Weak Graph Regularity Lemma to efficiently construct an approximating ICG for any input graph. We then devise an efficient graph learning algorithm operating directly on ICG in linear memory and time with respect to the number of nodes (rather than edges). This offers a new and fundamentally different pipeline for learning on very large non-sparse graphs, whose applicability is demonstrated empirically on node classification tasks and spatio-temporal data processing.
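To make the linear-cost claim concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes an ICG can be written in factored form as $C = Q\,\mathrm{diag}(r)\,Q^\top$, where $Q$ is an $N \times K$ community affiliation matrix and $r$ holds one weight per community. This notation and the helper name `icg_apply` are illustrative assumptions, not taken from the text above. Under that assumption, applying the ICG to node features needs only $O(NK)$ work per feature column and never materializes an $N \times N$ matrix.

```python
import numpy as np

def icg_apply(Q, r, X):
    """Compute (Q diag(r) Q^T) X without forming the N x N matrix.

    Q: (N, K) community affiliations (assumed notation)
    r: (K,)   one weight per community (assumed notation)
    X: (N, D) node features
    """
    pooled = Q.T @ X              # (K, D): sum features inside each community
    scaled = r[:, None] * pooled  # (K, D): weight each community
    return Q @ scaled             # (N, D): send community summaries back to nodes

rng = np.random.default_rng(0)
N, K, D = 200, 5, 3
# Hard, overlapping memberships: a node may belong to several communities.
Q = (rng.random((N, K)) < 0.3).astype(float)
r = rng.random(K)
X = rng.standard_normal((N, D))

# Reference computation with the dense N x N matrix, for comparison only.
dense = (Q * r) @ Q.T
assert np.allclose(dense @ X, icg_apply(Q, r, X))
```

The dense reference stores $N^2$ entries, while the factored form stores only $NK + K$, which is where the node-linear memory footprint comes from when $K$ does not grow with the graph.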
