Enhanced Soups for Graph Neural Networks
Joseph Zuber, Aishwarya Sarkar, Joseph Jennings, Ali Jannesari
TL;DR
This work tackles the scalability of applying model soups to Graph Neural Networks (GNNs) by introducing Learned Souping (LS), a gradient-descent-based method that learns layerwise interpolation weights to form a single, high-performing soup, and Partition Learned Souping (PLS), a memory-efficient, partition-based variant. LS forms $W_{\text{soup}}^l = \sum_{i=1}^N \alpha_i^l W_i^l$ and optimizes $\alpha_i^l$ via SGD with cosine annealing, while PLS partitions the graph into $K$ parts and builds subgraphs from $R$ partitions per epoch to constrain memory. Evaluations on Flickr, Reddit, ogbn-arxiv, and ogbn-products across GCN, GAT, and GraphSAGE show LS delivering up to $1.2\%$ accuracy gains and $2.1\times$ speedups, with PLS achieving up to $76\%$ memory reduction and $24.5\times$ speedups on large graphs without accuracy loss. Together, LS and PLS offer scalable, memory-efficient GNN souping that can extend the benefits of high-performing models to large-scale graphs and resource-constrained settings.
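To make the layerwise formula concrete, here is a minimal sketch of learned souping on a toy problem, not the authors' implementation. It assumes two ingredient models with two scalar "layers" each, a toy model $f(x) = w_2 w_1 x$, a mean-squared-error loss on held-out data, unnormalized $\alpha_i^l$ initialized uniformly, and hand-written gradients. All function names (`soup_layers`, `cosine_lr`, `toy_loss_and_grads`, `learn_soup`) are hypothetical.

```python
import math

def soup_layers(alphas, ingredients):
    """W_soup^l = sum_i alphas[l][i] * ingredients[i][l], computed per layer."""
    n_models = len(ingredients)
    n_layers = len(ingredients[0])
    return [
        [sum(alphas[l][i] * ingredients[i][l][j] for i in range(n_models))
         for j in range(len(ingredients[0][l]))]
        for l in range(n_layers)
    ]

def cosine_lr(base_lr, step, total_steps):
    """Cosine-annealed learning rate, decaying from base_lr toward 0."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * step / total_steps))

def toy_loss_and_grads(soup, xs, ys):
    """MSE of the toy model f(x) = w2 * w1 * x, with gradients w.r.t. w1 and w2."""
    w1, w2 = soup[0][0], soup[1][0]
    n = len(xs)
    loss = sum((w2 * w1 * x - y) ** 2 for x, y in zip(xs, ys)) / n
    common = sum(2.0 * (w2 * w1 * x - y) * x for x, y in zip(xs, ys)) / n
    return loss, [common * w2, common * w1]  # [dL/dw1, dL/dw2]

def learn_soup(ingredients, xs, ys, base_lr=0.01, steps=100):
    """Gradient descent on the layerwise alphas; ingredient weights stay frozen."""
    n_models = len(ingredients)
    n_layers = len(ingredients[0])
    # Assumption: start from a uniform average of the ingredients.
    alphas = [[1.0 / n_models] * n_models for _ in range(n_layers)]
    for step in range(steps):
        soup = soup_layers(alphas, ingredients)
        _, layer_grads = toy_loss_and_grads(soup, xs, ys)
        lr = cosine_lr(base_lr, step, steps)
        for l in range(n_layers):
            for i in range(n_models):
                # Chain rule for scalar layers: dL/dalpha_i^l = dL/dW^l * W_i^l
                alphas[l][i] -= lr * layer_grads[l] * ingredients[i][l][0]
    return alphas

# Two ingredients, each with two single-weight layers (toy values).
ingredients = [[[1.0], [1.0]], [[2.0], [2.0]]]
xs = [0.5, 1.0, 1.5]
ys = [2.0 * x for x in xs]  # target function y = 2x

uniform = [[0.5, 0.5], [0.5, 0.5]]
loss_before, _ = toy_loss_and_grads(soup_layers(uniform, ingredients), xs, ys)
alphas = learn_soup(ingredients, xs, ys)
loss_after, _ = toy_loss_and_grads(soup_layers(alphas, ingredients), xs, ys)
print("uniform soup loss:", loss_before)
print("learned soup loss:", loss_after)
```

Only the interpolation weights are trained here, so the number of learnable parameters is the number of ingredients times the number of layers, regardless of model size. In a real setting the gradients would come from automatic differentiation through the GNN rather than the closed-form expressions used in this toy.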
Abstract
Graph Neural Networks (GNNs) have demonstrated state-of-the-art performance in numerous scientific and high-performance computing (HPC) applications. Recent work suggests that "souping" (combining) individually trained GNNs into a single model can improve performance without increasing compute and memory costs during inference. However, existing souping algorithms are often slow and memory-intensive, which limits their scalability. We introduce Learned Souping for GNNs, a gradient-descent-based souping strategy that substantially reduces time and memory overhead compared to existing methods. Our approach is evaluated across multiple Open Graph Benchmark (OGB) datasets and GNN architectures, achieving up to a 1.2% accuracy improvement and a 2.1× speedup. Additionally, we propose Partition Learned Souping, a novel partition-based variant of learned souping that significantly reduces memory usage. On the ogbn-products dataset with GraphSAGE, Partition Learned Souping achieves a 24.5× speedup and a 76% memory reduction without compromising accuracy.
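The partition-based variant splits the graph into $K$ parts and, in each epoch, builds a subgraph from $R$ of them. The following sketch illustrates only that sampling step, under stated assumptions: the partitioning is a random split used as a stand-in (the source does not say which partitioner is used), the graph is a toy 12-node ring, and the helper names `partition_nodes` and `sample_subgraph` are hypothetical.

```python
import random

def partition_nodes(num_nodes, k, rng):
    """Split node ids into k disjoint parts (random split as a stand-in partitioner)."""
    nodes = list(range(num_nodes))
    rng.shuffle(nodes)
    return [nodes[p::k] for p in range(k)]

def sample_subgraph(edges, parts, r, rng):
    """Pick r partitions and return the subgraph induced by their nodes."""
    chosen = rng.sample(range(len(parts)), r)
    keep = set()
    for p in chosen:
        keep.update(parts[p])
    # Keep only edges whose endpoints both fall inside the selected partitions.
    sub_edges = [(u, v) for u, v in edges if u in keep and v in keep]
    return sorted(keep), sub_edges

rng = random.Random(0)
num_nodes = 12
edges = [(i, (i + 1) % num_nodes) for i in range(num_nodes)]  # toy ring graph
parts = partition_nodes(num_nodes, k=4, rng=rng)

for epoch in range(3):
    nodes, sub_edges = sample_subgraph(edges, parts, r=2, rng=rng)
    # A souping step would compute its loss on this subgraph only,
    # so memory scales with roughly R/K of the graph instead of all of it.
    print(epoch, len(nodes), len(sub_edges))
```

Because each epoch touches only $R$ of the $K$ partitions, the ratio $R/K$ is the knob that trades memory for how much of the graph each update sees. A random split cuts many edges, so a structure-aware partitioner would likely preserve more neighborhood information in practice.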
