Graph Condensation: A Survey
Xinyi Gao, Junliang Yu, Tong Chen, Guanhua Ye, Wentao Zhang, Hongzhi Yin
TL;DR
Graph condensation (GC) seeks to compress a large original graph $\mathcal{T}$ with $N$ nodes into a small, representative condensed graph $\mathcal{S}$ with $|\mathcal{V}'|=N'\ll N$ nodes (compression ratio $r=N'/N$), so that GNNs trained on $\mathcal{S}$ closely match the performance of GNNs trained on $\mathcal{T}$. The survey organizes GC research into five categories aligned with evaluation criteria (effectiveness, generalization, efficiency, fairness, and robustness) and analyzes two core components, optimization strategies and condensed graph generation, alongside empirical comparisons across methods and datasets. It also surveys practical applications and open-source libraries, and discusses open challenges such as condensation for diverse graph types, task-agnostic condensation, secure and explainable GC, and evaluation methodologies. The work covers trajectory- and gradient-based optimization, kernel-based and distribution-matching approaches, and a range of generation strategies that together enable efficient, scalable, and adaptable graph learning with condensed data. Overall, GC offers a data-centric way to accelerate GNN training while preserving accuracy across tasks and models, with implications for hyperparameter search, continual/incremental learning, and heterogeneous graphs.
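As a hedged sketch of the objective described above, GC is often written as a bi-level optimization problem. The notation here is an assumption introduced for illustration and is not taken from the summary itself: $f_{\theta}$ denotes a GNN with parameters $\theta$, and $\mathcal{L}^{\mathcal{T}}$ and $\mathcal{L}^{\mathcal{S}}$ denote the task loss evaluated on the original graph $\mathcal{T}$ and the condensed graph $\mathcal{S}$, respectively.

$$
\min_{\mathcal{S}} \; \mathcal{L}^{\mathcal{T}}\big(f_{\theta^{\mathcal{S}}}\big)
\quad \text{s.t.} \quad
\theta^{\mathcal{S}} = \arg\min_{\theta} \; \mathcal{L}^{\mathcal{S}}\big(f_{\theta}\big),
\qquad |\mathcal{V}'| = N' = rN .
$$

The inner problem trains the GNN on $\mathcal{S}$, and the outer problem adjusts $\mathcal{S}$ so that the resulting model performs well on $\mathcal{T}$. To illustrate the size constraint with hypothetical numbers, an original graph with $N = 100{,}000$ nodes condensed at $r = 0.5\%$ would yield $N' = 500$ synthetic nodes.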
Abstract
The rapid growth of graph data poses significant challenges for storage, transmission, and particularly the training of graph neural networks (GNNs). To address these challenges, graph condensation (GC) has emerged as an innovative solution. GC focuses on synthesizing a compact yet highly representative graph, enabling GNNs trained on it to achieve performance comparable to that of GNNs trained on the original large graph. The notable efficacy of GC and its broad prospects have garnered significant attention and spurred extensive research. This survey provides an up-to-date and systematic overview of GC, organizing existing research into five categories aligned with critical GC evaluation criteria: effectiveness, generalization, efficiency, fairness, and robustness. To facilitate an in-depth understanding of GC, this paper examines the methods under each category and discusses two essential components of GC: optimization strategies and condensed graph generation. We also empirically compare and analyze representative GC methods with diverse optimization strategies based on the five proposed evaluation criteria. Finally, we explore the applications of GC in various fields, outline the related open-source libraries, and highlight present challenges and novel insights, with the aim of promoting future research. The related resources can be found at https://github.com/XYGaoG/Graph-Condensation-Papers.
