Adaptive Graph Coarsening for Efficient GNN Training
Rostyslav Olshevskyi, Madeline Navarro, Santiago Segarra
TL;DR
This work addresses the scalability of graph neural networks by reducing the graph during training through adaptive, embedding-guided coarsening. It introduces Graph GK, a bilevel framework in which $K$-means clustering of node embeddings defines a reduction matrix ${\mathbf P}$ and hence a coarsened graph ${\mathcal G}'$ with ${|{\mathcal V}'|=K}$ supernodes, adjacency ${\mathbf A}'={\mathbf P}^{\top}{\mathbf A}{\mathbf P}$, and features ${\mathbf X}'={\mathbf C}^{-1}{\mathbf P}^{\top}{\mathbf X}$. A GNN with parameters ${\boldsymbol\Theta}$ is trained on the coarsened graph, and its outputs are propagated back to the original nodes. The method alternates between updating ${\boldsymbol\Theta}$ and re-solving the clustering-based reduction ${\mathbf P}$ (with periodic recoarsening), which yields task-adaptive clusters that work for both homophilic and heterophilic data. Empirically, GK achieves favorable accuracy–time tradeoffs on benchmarks and provides an interpretable mapping from original nodes to supernodes, suggesting a practical path to scalable, possibly continual, graph learning. Unlike approaches that rely on preprocessing or a fixed graph structure, the reduction is integrated directly into the learning loop and adapts to the downstream task.
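The coarsening formulas above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes that ${\mathbf P}$ is a one-hot (hard) assignment matrix and that ${\mathbf C}$ is the diagonal matrix of cluster sizes, so that ${\mathbf X}'$ holds per-cluster feature means; the summary does not define ${\mathbf C}$, so this reading is an assumption. The function name `coarsen` and the toy path graph are hypothetical.

```python
import numpy as np

def coarsen(A, X, labels, K):
    """Coarsen a graph given a hard cluster assignment.

    A: (N, N) adjacency, X: (N, F) features, labels: (N,) ints in [0, K).
    """
    N = A.shape[0]
    # P[i, k] = 1 if node i is assigned to supernode k (one-hot rows).
    P = np.zeros((N, K))
    P[np.arange(N), labels] = 1.0
    # Assumption: C = diag(cluster sizes), i.e. the diagonal of P^T P.
    sizes = P.sum(axis=0)
    if np.any(sizes == 0):
        raise ValueError("every cluster needs at least one node")
    A_c = P.T @ A @ P                  # A' = P^T A P
    X_c = (P.T @ X) / sizes[:, None]   # X' = C^{-1} P^T X (cluster means)
    return P, A_c, X_c

# Toy example: path graph 0-1-2-3 merged into two supernodes {0,1} and {2,3}.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.array([[1.0], [3.0], [5.0], [7.0]])
labels = np.array([0, 0, 1, 1])
P, A_c, X_c = coarsen(A, X, labels, K=2)
```

Under these assumptions, the diagonal of `A_c` counts intra-cluster edges (each undirected edge twice) and the off-diagonal entries count edges between supernodes.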
Abstract
We propose an adaptive graph coarsening method to jointly learn graph neural network (GNN) parameters and merge nodes via K-means clustering during training. As real-world graphs grow larger, processing them directly becomes increasingly challenging and sometimes infeasible. Tailoring algorithms to large-scale data may sacrifice performance, so we instead consider graph reduction to decrease the amount of data used during training. In particular, we propose a method to simultaneously train a GNN and coarsen its graph by partitioning nodes via K-means clustering based on their embeddings. Unlike past graph coarsening works, our approach allows us to merge nodes during training. Not only does this remove the need for coarsening as a preprocessing step, but our node clusters can also adapt to the learning task instead of relying solely on graph connectivity and features. Thus, our method is amenable to scenarios that are challenging for other methods, such as heterophilic data. We validate our approach on both homophilic and heterophilic node classification datasets. We further visualize relationships between node embeddings and their corresponding clusters to illustrate that our coarsened graph adapts to the learning task during training.
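As a rough illustration of partitioning nodes by their embeddings, the sketch below runs a plain Lloyd-style K-means on an embedding matrix and then copies each supernode's output back to its member nodes. It is a sketch under stated assumptions, not the authors' procedure: the helper names `kmeans_labels` and `lift`, the random initialization, the fixed iteration count, and lifting via multiplication by the one-hot assignment matrix are all hypothetical choices, and the embeddings and supernode outputs here are toy values rather than GNN outputs.

```python
import numpy as np

def kmeans_labels(H, K, iters=20, seed=0):
    """Plain Lloyd iterations on embeddings H of shape (N, D)."""
    rng = np.random.default_rng(seed)
    # Initialize centers at K distinct embeddings (fancy indexing copies).
    centers = H[rng.choice(len(H), size=K, replace=False)]
    for _ in range(iters):
        # Squared distances from every node to every center: shape (N, K).
        d = ((H[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for k in range(K):
            members = labels == k
            if np.any(members):  # keep the old center if a cluster empties
                centers[k] = H[members].mean(axis=0)
    return labels

def lift(labels, Z_coarse):
    """Give each original node the output row of its supernode (P @ Z')."""
    K = Z_coarse.shape[0]
    P = np.eye(K)[labels]  # one-hot assignment matrix, shape (N, K)
    return P @ Z_coarse

# Toy embeddings: two well-separated groups of two nodes each.
H = np.array([[0.0, 0.0], [0.1, 0.0], [10.0, 10.0], [10.1, 10.0]])
labels = kmeans_labels(H, K=2)

# Hypothetical supernode outputs (e.g. class scores), one row per supernode.
Z_coarse = np.array([[0.9, 0.1], [0.2, 0.8]])
Z_nodes = lift(labels, Z_coarse)
```

In a training loop of the kind the abstract describes, the embeddings would come from the GNN being trained and the clustering would be recomputed periodically, so cluster membership can change as the embeddings change.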
