Adaptive Graph Coarsening for Efficient GNN Training

Rostyslav Olshevskyi, Madeline Navarro, Santiago Segarra

TL;DR

This work tackles the scalability of graph neural networks by reducing the graph during training through adaptive, embedding-guided coarsening. It introduces GK, a bilevel framework in which $K$-means clustering of node embeddings defines a coarsened graph ${\mathcal G}'$ with ${|{\mathcal V}'|=K}$ supernodes, adjacency ${\mathbf A}'={\mathbf P}^{\top}{\mathbf A}{\mathbf P}$, and features ${\mathbf X}'={\mathbf C}^{-1}{\mathbf P}^{\top}{\mathbf X}$, while a GNN with parameters ${\boldsymbol\Theta}$ is trained on the coarsened graph and its outputs are propagated back to the original nodes. The method alternates between updating ${\boldsymbol\Theta}$ and re-solving for the clustering-based reduction ${\mathbf P}$ (with periodic recoarsening), so the clusters adapt to the task and work for both homophilic and heterophilic data. Empirically, GK achieves favorable accuracy–time tradeoffs on benchmarks and provides an interpretable mapping from original nodes to supernodes, suggesting a practical path to scalable, possibly continual, graph learning. Its main contribution is to integrate reduction directly into the learning loop rather than relying on preprocessing or fixed graph structure.
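The two coarsening formulas above can be illustrated with a small NumPy sketch. It assumes that ${\mathbf P}$ is a binary $n \times K$ membership matrix built from hard cluster labels and that ${\mathbf C}$ is the diagonal matrix of cluster sizes, so that ${\mathbf X}'$ holds per-cluster feature means; neither definition is spelled out in this summary, and the function name `coarsen` and the toy graph are hypothetical.

```python
import numpy as np

def coarsen(A, X, labels, K):
    """Return (P, A', X') with A' = P^T A P and X' = C^{-1} P^T X.

    Assumptions (not stated in the summary): P is a one-hot membership
    matrix and C = diag(cluster sizes), so X' averages member features.
    """
    n = A.shape[0]
    P = np.zeros((n, K))
    P[np.arange(n), labels] = 1.0      # row i has a single 1 in column labels[i]
    sizes = P.sum(axis=0)              # diagonal entries of C
    if np.any(sizes == 0):
        raise ValueError("every cluster needs at least one node")
    A_c = P.T @ A @ P                  # sums edge weights between clusters
    X_c = (P.T @ X) / sizes[:, None]   # C^{-1} P^T X without forming C
    return P, A_c, X_c

# Toy example: path graph 0-1-2-3 merged into clusters {0, 1} and {2, 3}.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.array([[1.0, 0.0],
              [3.0, 0.0],
              [0.0, 2.0],
              [0.0, 4.0]])
labels = np.array([0, 0, 1, 1])

P, A_c, X_c = coarsen(A, X, labels, K=2)
print(A_c)  # [[2. 1.] [1. 2.]]
print(X_c)  # [[2. 0.] [0. 3.]]
```

Under these assumptions, the diagonal of `A_c` counts each intra-cluster edge twice (once per direction), which acts as a weighted self-loop on the supernode.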

Abstract

We propose an adaptive graph coarsening method to jointly learn graph neural network (GNN) parameters and merge nodes via K-means clustering during training. As real-world graphs grow larger, processing them directly becomes increasingly challenging and sometimes infeasible. Tailoring algorithms to large-scale data may sacrifice performance, so we instead consider graph reduction to decrease the amount of data used during training. In particular, we propose a method to simultaneously train a GNN and coarsen its graph by partitioning nodes via K-means clustering based on their embeddings. Unlike past graph coarsening works, our approach allows us to merge nodes during training. Not only does this remove the need for coarsening as a preprocessing step, but our node clusters can also adapt to the learning task instead of relying solely on graph connectivity and features. Thus, our method is amenable to scenarios that are challenging for other methods, such as heterophilic data. We validate our approach on both homophilic and heterophilic node classification datasets. We further visualize relationships between node embeddings and their corresponding clusters to illustrate that our coarsened graph adapts to the learning task during training.
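The clustering step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses a plain Lloyd's K-means on a matrix of node embeddings, and it assumes that supernode outputs are lifted back to the original nodes by copying each supernode's row to its members (equivalent to multiplying by a one-hot membership matrix). The names `kmeans_assign` and `lift`, the toy embeddings, and the supernode scores are all hypothetical.

```python
import numpy as np

def kmeans_assign(H, K, n_iter=20, seed=0):
    """Plain Lloyd's K-means on embeddings H (n x d); returns labels in [0, K)."""
    rng = np.random.default_rng(seed)
    # Initialize centers from K distinct embedding rows.
    centers = H[rng.choice(H.shape[0], size=K, replace=False)].copy()
    for _ in range(n_iter):
        # Squared distance from every node to every center, shape (n, K).
        d2 = ((H[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for k in range(K):
            members = H[labels == k]
            if len(members) > 0:       # keep the old center if a cluster empties
                centers[k] = members.mean(axis=0)
    return labels

def lift(Z_c, labels):
    """Copy each supernode's output row to its member nodes (assumed lifting rule)."""
    return Z_c[labels]

# Toy embeddings forming two well-separated groups of three nodes each.
H = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
labels = kmeans_assign(H, K=2)

# Hypothetical per-supernode class scores, e.g. from a GNN run on the coarse graph.
Z_c = np.array([[0.9, 0.1],
                [0.2, 0.8]])
Z = lift(Z_c, labels)                  # one row of scores per original node
```

In an alternating scheme of the kind the abstract describes, a step like `kmeans_assign` would be rerun periodically on the current embeddings so that the partition follows the model as it trains; how often to recoarsen is a design choice not fixed by this sketch.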

Paper Structure

This paper contains 8 sections, 5 equations, 1 figure, 1 table, 1 algorithm.

Figures (1)

  • Figure 1: (a)-(c) t-SNE visualization of node embeddings for the heterophilic Wisconsin dataset, obtained from a model trained on the full, original graph ($r=1$). (a) Nodes colored by true class labels. (b) Nodes colored by AConvMatch cluster assignment. (c) Nodes colored by GK cluster assignment. (d) Validation trajectory as a function of $T$ on the Citeseer dataset, for training on the full graph ($r=1$) and with GK ($r=0.1$).