A Comprehensive Survey on Graph Reduction: Sparsification, Coarsening, and Condensation

Mohammad Hashemi, Shengbo Gong, Juntong Ni, Wenqi Fan, B. Aditya Prakash, Wei Jin

TL;DR

This survey formalizes graph reduction as producing a smaller graph $G'=(\mathcal{V}',\mathcal{E}')$ from $G=(\mathcal{V},\mathcal{E})$ (with features and Laplacians) while preserving essential information, and organizes methods into three families: sparsification, coarsening, and condensation. It introduces a unified framework and a hierarchical taxonomy to compare algorithms, reviews representative techniques across properties and task-preservation objectives, and discusses applications in NAS, continual learning, visualization, privacy, and data augmentation. The paper highlights key condensation approaches based on gradient matching, kernel ridge regression, and distribution matching, and contrasts them with traditional sparsification and coarsening, while addressing fairness, scalability, and generalization across architectures. It also identifies critical open challenges, including comprehensive evaluation, scalability of condensation, interpretability of condensation processes, distribution-shift robustness, and extending methods to diverse graph types. Overall, the work provides a foundational reference that connects graph reduction with practical GNN workflows and dataset distillation, guiding future research toward scalable, interpretable, and broadly applicable graph reduction techniques.

Abstract

Many real-world datasets can be naturally represented as graphs, spanning a wide range of domains. However, the increasing complexity and size of graph datasets present significant challenges for analysis and computation. In response, graph reduction, or graph summarization, has gained prominence for simplifying large graphs while preserving essential properties. In this survey, we aim to provide a comprehensive understanding of graph reduction methods, including graph sparsification, graph coarsening, and graph condensation. Specifically, we establish a unified definition for these methods and introduce a hierarchical taxonomy to categorize the challenges they address. Our survey then systematically reviews the technical details of these methods and emphasizes their practical applications across diverse scenarios. Furthermore, we outline critical research directions to ensure the continued effectiveness of graph reduction techniques, as well as provide a comprehensive paper list at https://github.com/Emory-Melody/awesome-graph-reduction. We hope this survey will bridge literature gaps and propel the advancement of this promising field.

Paper Structure

This paper contains 32 sections, 16 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: A general framework of graph reduction. Graph reduction aims to find a reduced (smaller) graph dataset that can preserve certain information of the original graph dataset.
  • Figure 2: Illustration of key differences among three strategies of graph reduction: Graph sparsification selects significant nodes and edges while discarding others, graph coarsening groups and aggregates similar nodes and edges to construct a smaller graph, and graph condensation learns a synthetic graph from scratch.
  • Figure 3: Taxonomy of existing graph reduction methods.
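
The contrast drawn in the Figure 2 caption can be made concrete with a toy example. The following is a minimal sketch, not any method from the survey. It assumes a tiny weighted graph stored as a dictionary, a hypothetical "keep the heaviest edges" rule for sparsification (edges only; node selection is omitted), and a hand-written node-to-supernode assignment for coarsening. The names `EDGES`, `sparsify`, and `coarsen` are illustrative. Condensation is omitted because it learns a synthetic graph through optimization and does not reduce to a few lines.

```python
from collections import defaultdict

# Toy weighted undirected graph: (u, v) -> weight. Values are illustrative only.
EDGES = {
    (0, 1): 1.0,
    (0, 2): 0.2,
    (1, 2): 0.9,
    (2, 3): 0.1,
    (3, 4): 0.8,
    (3, 5): 0.7,
    (4, 5): 1.0,
}


def sparsify(edges, keep):
    """Keep the `keep` heaviest edges; every kept edge comes from the original graph."""
    ranked = sorted(edges.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:keep])


def coarsen(edges, assignment):
    """Merge nodes into supernodes and sum the weights of edges between them.

    `assignment` maps each original node to a supernode id. Edges that fall
    inside a single supernode are dropped in this sketch.
    """
    merged = defaultdict(float)
    for (u, v), w in edges.items():
        cu, cv = assignment[u], assignment[v]
        if cu != cv:
            merged[(min(cu, cv), max(cu, cv))] += w
    return dict(merged)


# Sparsification: a subset of the original edges survives.
sparse = sparsify(EDGES, keep=4)

# Coarsening: nodes {0, 1, 2} and {3, 4, 5} collapse into two supernodes.
coarse = coarsen(EDGES, {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1})
```

Here `sparse` contains only edges that already existed in `EDGES`, whereas `coarse` contains a single new edge between two supernodes whose weight aggregates the original cross-group edge. That is the distinction the caption draws between selecting and aggregating.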