Table of Contents
Fetching ...

GC-Bench: An Open and Unified Benchmark for Graph Condensation

Qingyun Sun, Ziying Chen, Beining Yang, Cheng Ji, Xingcheng Fu, Sheng Zhou, Hao Peng, Jianxin Li, Philip S. Yu

TL;DR

A comprehensive Graph Condensation Benchmark (GC-Bench) to analyze the performance of graph condensation in different scenarios systematically and comprehensively evaluates 12 state-of-the-art graph condensation algorithms in node-level and graph-level tasks and analyze their performance in 12 diverse graph datasets.

Abstract

Graph condensation (GC) has recently garnered considerable attention due to its ability to reduce large-scale graph datasets while preserving their essential properties. The core concept of GC is to create a smaller, more manageable graph that retains the characteristics of the original graph. Despite the proliferation of graph condensation methods developed in recent years, there is no comprehensive evaluation and in-depth analysis, which creates a great obstacle to understanding the progress in this field. To fill this gap, we develop a comprehensive Graph Condensation Benchmark (GC-Bench) to analyze the performance of graph condensation in different scenarios systematically. Specifically, GC-Bench systematically investigates the characteristics of graph condensation in terms of the following dimensions: effectiveness, transferability, and complexity. We comprehensively evaluate 12 state-of-the-art graph condensation algorithms in node-level and graph-level tasks and analyze their performance in 12 diverse graph datasets. Further, we have developed an easy-to-use library for training and evaluating different GC methods to facilitate reproducible research. The GC-Bench library is available at https://github.com/RingBDStack/GC-Bench.

GC-Bench: An Open and Unified Benchmark for Graph Condensation

TL;DR

A comprehensive Graph Condensation Benchmark (GC-Bench) to analyze the performance of graph condensation in different scenarios systematically and comprehensively evaluates 12 state-of-the-art graph condensation algorithms in node-level and graph-level tasks and analyze their performance in 12 diverse graph datasets.

Abstract

Graph condensation (GC) has recently garnered considerable attention due to its ability to reduce large-scale graph datasets while preserving their essential properties. The core concept of GC is to create a smaller, more manageable graph that retains the characteristics of the original graph. Despite the proliferation of graph condensation methods developed in recent years, there is no comprehensive evaluation and in-depth analysis, which creates a great obstacle to understanding the progress in this field. To fill this gap, we develop a comprehensive Graph Condensation Benchmark (GC-Bench) to analyze the performance of graph condensation in different scenarios systematically. Specifically, GC-Bench systematically investigates the characteristics of graph condensation in terms of the following dimensions: effectiveness, transferability, and complexity. We comprehensively evaluate 12 state-of-the-art graph condensation algorithms in node-level and graph-level tasks and analyze their performance in 12 diverse graph datasets. Further, we have developed an easy-to-use library for training and evaluating different GC methods to facilitate reproducible research. The GC-Bench library is available at https://github.com/RingBDStack/GC-Bench.
Paper Structure (45 sections, 2 equations, 9 figures, 22 tables)

This paper contains 45 sections, 2 equations, 9 figures, 22 tables.

Figures (9)

  • Figure 1: The GC methods can be broadly divided into two categories: The first category depends on the backbone model, refining the condensed graph by aligning it with the backbone's gradients (a), trajectories (b), and output distributions (c) trained on both original and condensed graphs. The second category, independent of the backbone, optimizes the graph by matching its distribution with that of the original graph data (d) or by identifying frequently co-occurring computation trees (e).
  • Figure 2: Cross-task performance on Citeseer. For all downstream tasks, the models are trained solely using data of graphs condensed by node classification. For anomaly detection (c, d), structural and contextual anomalies anomalies are injected into both the condensed graph and the original graph.
  • Figure 3: Cross-architecture performance. Using SGC and Graph Transformer (GTrans) to condense Cora with a 2.6% ratio, we then test the accuracy on various downstream architectures (a, b). Furthermore, we evaluate the influence of gradient matching steps on GCond (c).
  • Figure 4: The impact of initialization under different condensation ratios (a, b) and the impact across different datasets Cora(a, b) and ogbn-arxiv(c).
  • Figure 5: Time and memory consumption of different methods on ogbn-arxiv (0.50%).
  • ...and 4 more figures