Simple Graph Condensation
Zhenbang Xiao, Yu Wang, Shunyu Liu, Huiqiong Wang, Mingli Song, Tongya Zheng
TL;DR
This work tackles the high cost of training on large-scale graphs by introducing Simple Graph Condensation (SimGC), a parameter-light framework that condenses a large graph into a small, informative graph for training GNNs. The method uses a Simple Graph Convolution (SGC) model pre-trained on the original graph to guide condensation through three terms: layerwise representation alignment, output logit alignment, and a kernel-based feature-smoothness regularizer. It optimizes the combined loss $L = \alpha L_{rep} + \beta L_{lgt} + \gamma L_{smt}$ without introducing external parameters. Empirical results on seven datasets show that SimGC achieves competitive or superior accuracy while accelerating condensation by up to 10x. It also generalizes well across GNN architectures and supports neural architecture search and knowledge distillation on condensed graphs. The findings suggest SimGC offers a practical, scalable route to efficient GNN training on massive graphs without sacrificing performance, with potential extensions to heterogeneous graphs and hypergraphs in future work.
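The combined objective is a weighted sum of three scalar terms, which can be illustrated with a minimal sketch. The helper `mean_squared_gap`, the toy vectors, and the weight values below are hypothetical stand-ins chosen for illustration; they are not the paper's definitions of $L_{rep}$, $L_{lgt}$, or $L_{smt}$.

```python
def mean_squared_gap(a, b):
    """Illustrative stand-in for an alignment term: the mean squared
    difference between two equal-length summary vectors.
    This is an assumption, not the paper's definition."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def combined_loss(l_rep, l_lgt, l_smt, alpha=1.0, beta=1.0, gamma=1.0):
    """Weighted sum L = alpha * L_rep + beta * L_lgt + gamma * L_smt."""
    return alpha * l_rep + beta * l_lgt + gamma * l_smt


# Hypothetical summary statistics for the original and condensed graphs.
orig_rep, cond_rep = [0.5, 1.0, -0.5], [0.5, 0.0, 0.5]
orig_lgt, cond_lgt = [2.0, 0.0], [1.0, 1.0]

l_rep = mean_squared_gap(orig_rep, cond_rep)  # representation alignment (stand-in)
l_lgt = mean_squared_gap(orig_lgt, cond_lgt)  # logit alignment (stand-in)
l_smt = 0.25                                  # placeholder smoothness value

total = combined_loss(l_rep, l_lgt, l_smt, alpha=1.0, beta=0.5, gamma=2.0)
```

The weights $\alpha$, $\beta$, and $\gamma$ only rescale the contribution of each term, so setting one of them to zero removes that term from the objective.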
Abstract
The burdensome training costs on large-scale graphs have aroused significant interest in graph condensation, which involves tuning Graph Neural Networks (GNNs) on a small condensed graph for use on the large-scale original graph. Existing methods primarily focus on aligning key metrics between the condensed and original graphs, such as the gradients, output distributions, and training trajectories of GNNs, and they yield satisfactory performance on downstream tasks. However, these complex metrics require intricate external parameters and can disrupt the optimization of the condensed graph, making the condensation process highly demanding and unstable. Motivated by the recent success of simplified models across various domains, we propose a simplified approach to metric alignment in graph condensation that aims to reduce the unnecessary complexity inherited from intricate metrics. We introduce the Simple Graph Condensation (SimGC) framework, which aligns the condensed graph with the original graph from the input layer to the prediction layer, guided by a Simple Graph Convolution (SGC) model pre-trained on the original graph. Importantly, SimGC eliminates external parameters and retains only the target condensed graph during the condensation process. This straightforward yet effective strategy achieves a speedup of up to 10 times over existing graph condensation methods while performing on par with state-of-the-art baselines. Comprehensive experiments on seven benchmark datasets demonstrate the effectiveness of SimGC in prediction accuracy, condensation time, and generalization capability. Our code is available at https://github.com/BangHonor/SimGC.
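For readers unfamiliar with SGC, the following sketch shows its standard feature propagation: features are repeatedly multiplied by the symmetrically normalized adjacency matrix with self-loops, and a linear classifier is then applied to the result. The per-hop outputs returned here are the kind of intermediate representations that a layerwise alignment could compare, although the exact quantities SimGC aligns are not specified in this abstract. The function names and the toy path graph are assumptions made for illustration.

```python
import math


def normalized_adjacency(adj):
    """Return S = D^{-1/2} (A + I) D^{-1/2} for a dense adjacency matrix."""
    n = len(adj)
    # Add self-loops before computing degrees.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain dense matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def sgc_propagate(adj, features, num_hops):
    """Return [X, S X, S^2 X, ..., S^K X] with K = num_hops."""
    s = normalized_adjacency(adj)
    layers = [features]
    for _ in range(num_hops):
        layers.append(matmul(s, layers[-1]))
    return layers


# Toy example (hypothetical): a three-node path graph 0 - 1 - 2
# with a single feature that is nonzero only on node 0.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
features = [[1.0], [0.0], [0.0]]
layers = sgc_propagate(adj, features, num_hops=2)
```

Because the propagation has no trainable weights, it can be computed once up front, which is one reason SGC is a lightweight choice of guide model.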
