A Survey on Graph Condensation

Hongjia Xu, Liangliang Zhang, Yao Ma, Sheng Zhou, Zhuonan Zheng, Bu Jiajun

TL;DR

This survey defines Graph Condensation (GC) as the process of shrinking a graph to a smaller yet informative representation that preserves downstream task performance. It introduces a formal framework with a three-way objective taxonomy (graph-guided, model-guided, hybrid) and two formulations (modification and synthetic), and it systematically analyzes datasets, evaluation metrics, and applications. The work discusses key methods, typical formulations, and practical tradeoffs between preserving graph structure and maintaining model performance, while outlining limitations such as the performance gap between formulations and efficiency considerations. It concludes with future directions focused on interpretability, extending GC to complex graph types, understanding objective correlations, and developing comprehensive tradeoff frameworks for real-world deployment.

Abstract

Analytics on large-scale graphs pose significant challenges to computational efficiency and resource requirements. Recently, graph condensation (GC) has emerged as a solution to the challenges arising from the escalating volume of graph data. The motivation of GC is to reduce large graphs to smaller ones while preserving the information essential for downstream tasks. To clarify GC and distinguish it from related topics, we present a formal definition of GC and establish a taxonomy that systematically categorizes existing methods into three types based on their objectives. We also classify the formulations used to generate condensed graphs into two categories: modifying the original graph or synthesizing a completely new one. Moreover, our survey includes a comprehensive analysis of datasets and evaluation metrics in this field. Finally, we conclude by addressing challenges and limitations, outlining future directions, and offering concise guidelines to inspire future research in this field.

Paper Structure

This paper contains 43 sections, 8 equations, 1 figure, and 1 table.

Figures (1)

  • Figure 1: Overview of GC. GNNs stands for any graph machine learning model, with architectures such as GCN, GAT, etc.