Diffusing to the Top: Boost Graph Neural Networks with Minimal Hyperparameter Tuning
Lequan Lin, Dai Shi, Andi Han, Zhiyong Wang, Junbin Gao
TL;DR
This work tackles the costly hyperparameter tuning required to push GNNs to top performance. It proposes GNN-Diff, a graph-conditioned latent diffusion framework that learns to generate high-quality GNN parameters from checkpoints produced by a lightweight coarse search. The framework has three components: a parameter autoencoder, a graph autoencoder, and a graph-conditioned diffusion model. Across 166 experiments on four tasks and 20 datasets, GNN-Diff consistently boosts performance, improves stability on unseen data, and reduces tuning time relative to grid or random search, especially on large and long-range graphs. In practice, the approach reaches performance close to or better than grid search with substantially lower tuning effort, and it lays groundwork for extending diffusion-based parameter generation to broader graph tasks and architectures.
Abstract
Graph Neural Networks (GNNs) are proficient in graph representation learning and achieve promising performance on diverse tasks such as node classification and link prediction. Comprehensive hyperparameter tuning is usually essential for unlocking a GNN's top performance, especially on complicated tasks such as node classification on large graphs and long-range graphs. Such tuning typically incurs high computational and time costs and requires careful design of appropriate search spaces. This work introduces a graph-conditioned latent diffusion framework (GNN-Diff) that generates high-performing GNNs from the model checkpoints of sub-optimal hyperparameters selected by a lightweight coarse search. We validate our method through 166 experiments across four graph tasks: node classification on small, large, and long-range graphs, as well as link prediction. Our experiments involve 10 classic and state-of-the-art target models and 20 publicly available datasets. The results consistently demonstrate that GNN-Diff: (1) boosts the performance of GNNs with efficient hyperparameter tuning; and (2) exhibits high stability and generalizability on unseen data across multiple generation runs. The code is available at https://github.com/lequanlin/GNN-Diff.
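
To make the idea of a coarse search concrete, the sketch below evaluates only a small random subset of a full hyperparameter grid and keeps the best configuration found. It is an illustration under stated assumptions, not the procedure used in GNN-Diff: the search space, the budget, the function names, and the toy scoring function (a stand-in for training a GNN and measuring validation accuracy) are all hypothetical.

```python
import itertools
import random


def coarse_search(search_space, evaluate, budget, seed=0):
    """Score a random subset of the full grid and return the best config.

    Hypothetical helper: `evaluate` maps a config dict to a validation score
    (higher is better); `budget` caps how many configs are tried.
    """
    names = list(search_space)
    # Enumerate the full grid only to sample from it and to report its size.
    grid = [dict(zip(names, values))
            for values in itertools.product(*(search_space[n] for n in names))]
    rng = random.Random(seed)
    sampled = rng.sample(grid, min(budget, len(grid)))
    scored = [(evaluate(cfg), cfg) for cfg in sampled]
    best_score, best_cfg = max(scored, key=lambda pair: pair[0])
    return best_cfg, best_score, len(grid)


# Assumed search space, chosen only for illustration.
search_space = {
    "lr": [0.1, 0.01, 0.001],
    "hidden": [16, 64, 256],
    "dropout": [0.0, 0.5],
}


def toy_validation_score(cfg):
    # Stand-in for "train a GNN with cfg, return validation accuracy".
    return (-abs(cfg["lr"] - 0.01)
            - abs(cfg["hidden"] - 64) / 1000
            - abs(cfg["dropout"] - 0.5) / 10)


best_cfg, best_score, grid_size = coarse_search(
    search_space, toy_validation_score, budget=5)
print(f"tried 5 of {grid_size} configurations; best so far: {best_cfg}")
```

With this toy space, the coarse search trains 5 configurations where a full grid search would train 18. In the setting the abstract describes, checkpoints saved while training with the selected sub-optimal configuration would then serve as samples for the parameter-generation model.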
