Graph Condensation: A Survey

Xinyi Gao, Junliang Yu, Tong Chen, Guanhua Ye, Wentao Zhang, Hongzhi Yin

TL;DR

Graph condensation (GC) seeks to compress a large graph $\mathcal{T}$ with $N$ nodes into a small, representative graph $\mathcal{S}$ with $|\mathcal{V}'|=N'\ll N$ nodes (compression ratio $r=N'/N$), so that GNNs trained on $\mathcal{S}$ closely match the performance of GNNs trained on the original graph $\mathcal{T}$. The survey organizes GC research into five criteria-based categories (effectiveness, generalization, efficiency, fairness, and robustness) and analyzes two core components, optimization strategies and condensed-graph generation, alongside empirical comparisons across methods and datasets. It also surveys practical applications and open-source libraries, and discusses open challenges such as condensation for diverse graphs, task-agnostic condensation, secure and explainable GC, and evaluation methodologies. The work highlights trajectory- and gradient-based optimization, kernel-based and distribution-matching approaches, and various generation strategies that together enable efficient, scalable, and adaptable graph learning with condensed data. Overall, GC offers a data-centric pathway to accelerate GNN training and inference while preserving accuracy across tasks and models, with implications for hyperparameter search, continual/incremental learning, and heterogeneous graphs.
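To make the compression ratio $r=N'/N$ concrete, here is a minimal sketch in Python. The helper names `condensed_size` and `compression_ratio` and the node counts are illustrative assumptions, not taken from the survey.

```python
import math


def condensed_size(num_nodes: int, ratio: float) -> int:
    """Return N' for a graph with N = num_nodes nodes and ratio r = N'/N.

    Rounds to the nearest integer and keeps at least one node.
    (Hypothetical helper for illustration only.)
    """
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    return max(1, round(ratio * num_nodes))


def compression_ratio(num_condensed: int, num_nodes: int) -> float:
    """Return r = N'/N."""
    if num_nodes <= 0 or not 0 < num_condensed <= num_nodes:
        raise ValueError("need 0 < N' <= N")
    return num_condensed / num_nodes


# Illustrative numbers: a 10,000-node graph condensed at r = 0.5%.
n_prime = condensed_size(10_000, 0.005)
r = compression_ratio(n_prime, 10_000)
```

With these illustrative numbers the condensed graph keeps 50 nodes, which is the sense in which $N'\ll N$.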

Abstract

The rapid growth of graph data poses significant challenges in storage, transmission, and particularly the training of graph neural networks (GNNs). To address these challenges, graph condensation (GC) has emerged as an innovative solution. GC focuses on synthesizing a compact yet highly representative graph, enabling GNNs trained on it to achieve performance comparable to those trained on the original large graph. The notable efficacy of GC and its broad prospects have garnered significant attention and spurred extensive research. This survey paper provides an up-to-date and systematic overview of GC, organizing existing research into five categories aligned with critical GC evaluation criteria: effectiveness, generalization, efficiency, fairness, and robustness. To facilitate an in-depth and comprehensive understanding of GC, this paper examines various methods under each category and thoroughly discusses two essential components within GC: optimization strategies and condensed graph generation. We also empirically compare and analyze representative GC methods with diverse optimization strategies based on the five proposed GC evaluation criteria. Finally, we explore the applications of GC in various fields, outline the related open-source libraries, and highlight the present challenges and novel insights, with the aim of promoting advancements in future research. The related resources can be found at https://github.com/XYGaoG/Graph-Condensation-Papers.

Paper Structure

This paper contains 45 sections, 12 equations, 10 figures, 7 tables, and 1 algorithm.

Figures (10)

  • Figure 1: An overview of graph condensation. Graph condensation aims to generate small informative graphs such that the models trained on these graphs have similar downstream task performance to those trained on the original graphs. GC methods can be categorised into five classes aligned with critical GC evaluation criteria.
  • Figure 2: Graph condensation procedure. The original graph and condensed graph are encoded by the relay model $f_{\theta}$ and the condensed graph is optimized according to $\mathcal{L}_{cond}$.
  • Figure 3: The taxonomy of graph condensation.
  • Figure 4: The research focuses of graph condensation literature.
  • Figure 5: Effective graph condensation (GC). The condensed graph enables GNNs to be trained with performance comparable to GNNs trained on the original graph.
  • ...and 5 more figures
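
The procedure in the Figure 2 caption (both graphs pass through a relay model $f_{\theta}$, and the condensed graph is scored by $\mathcal{L}_{cond}$) can be sketched as follows. This is a minimal NumPy sketch under stated assumptions, not the survey's reference implementation. It assumes a one-layer linear relay model and takes $\mathcal{L}_{cond}$ to be a cosine distance between gradients, which is one form of gradient matching. All function names and the toy data are hypothetical. A real method would also update the condensed graph by descending on this loss; here it is only evaluated once.

```python
import numpy as np


def normalize_adj(adj):
    """Symmetrically normalize A + I (the usual GCN propagation matrix)."""
    a_hat = adj + np.eye(adj.shape[0])
    deg_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]


def relay_gradient(adj, feats, labels, weight):
    """Gradient of the mean cross-entropy loss w.r.t. the weight of an
    assumed relay model f_theta(A, X) = softmax(A_hat X W)."""
    h = normalize_adj(adj) @ feats                 # propagated features
    logits = h @ weight
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    onehot = np.eye(weight.shape[1])[labels]
    return h.T @ (probs - onehot) / len(labels)


def condensation_loss(grad_orig, grad_cond):
    """Assumed L_cond: 1 - cosine similarity of the flattened gradients."""
    g1, g2 = grad_orig.ravel(), grad_cond.ravel()
    cos = g1 @ g2 / (np.linalg.norm(g1) * np.linalg.norm(g2) + 1e-12)
    return 1.0 - cos


# Toy data (hypothetical): 20 original nodes, 4 condensed nodes, 2 classes.
rng = np.random.default_rng(0)
n, n_c, d, c = 20, 4, 5, 2
upper = np.triu((rng.random((n, n)) < 0.2).astype(float), 1)
adj = upper + upper.T                              # undirected, no self-loops
feats = rng.normal(size=(n, d))
labels = rng.integers(0, c, size=n)

adj_c = np.zeros((n_c, n_c))                       # edge-free condensed graph
feats_c = rng.normal(size=(n_c, d))
labels_c = np.array([0, 0, 1, 1])

weight = rng.normal(size=(d, c))                   # shared relay parameters
g_orig = relay_gradient(adj, feats, labels, weight)
g_cond = relay_gradient(adj_c, feats_c, labels_c, weight)
loss = condensation_loss(g_orig, g_cond)
```

The same relay weights are used for both graphs so that the two gradients are comparable. The loss approaches 0 when the condensed graph induces the same gradient direction as the original graph.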