Graph Condensation for Graph Neural Networks

Wei Jin, Lingxiao Zhao, Shichang Zhang, Yozen Liu, Jiliang Tang, Neil Shah

TL;DR

This paper introduces graph condensation for graph neural networks (GNNs), which aims to shrink a large graph into a small synthetic graph while keeping the performance of GNNs trained on it comparable. It proposes GCond, a gradient-matching framework that learns a condensed graph by coupling node features with a learned structure, so that training on the condensed graph mimics the original training trajectory. Across transductive and inductive datasets, GCond stays close to the original test performance at very high reduction rates, transfers across diverse GNN architectures, and is applicable to neural architecture search (NAS). The work offers a scalable path to efficient GNN training and NAS on large-scale graphs, with code released for reproducibility.

Abstract

Given the prevalence of large-scale graphs in real-world applications, the storage and time for training neural models have raised increasing concerns. To alleviate these concerns, we propose and study the problem of graph condensation for graph neural networks (GNNs). Specifically, we aim to condense the large, original graph into a small, synthetic and highly informative graph, such that GNNs trained on the small graph and on the large graph have comparable performance. We approach the condensation problem by imitating the GNN training trajectory on the original graph through the optimization of a gradient matching loss, and we design a strategy to condense node features and structural information simultaneously. Extensive experiments demonstrate the effectiveness of the proposed framework in condensing different graph datasets into informative smaller graphs. In particular, we recover 95.3% of the original test accuracy on Reddit, 99.8% on Flickr and 99.0% on Citeseer while reducing the graph size by more than 99.9%, and the condensed graphs can be used to train various GNN architectures. Code is released at https://github.com/ChandlerBang/GCond.
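As a rough illustration of the gradient-matching idea in the abstract, the sketch below compares the gradient a model receives on an original graph with the gradient it receives on a small synthetic graph. It is not the released GCond code. The one-layer model softmax(Â X W), the column-wise cosine distance, the random toy data, and all function names are assumptions made for this example only.

```python
import numpy as np

def normalize_adj(adj):
    """Symmetrically normalize A + I (a common GCN-style propagation matrix)."""
    adj = adj + np.eye(adj.shape[0])
    deg_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]

def softmax(z):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def grad_wrt_weights(adj, x, y_onehot, w):
    """Gradient of the mean cross-entropy w.r.t. W for softmax(A_hat @ X @ W)."""
    h = normalize_adj(adj) @ x
    probs = softmax(h @ w)
    return h.T @ (probs - y_onehot) / x.shape[0]

def matching_distance(g_real, g_syn, eps=1e-12):
    """Assumed distance: sum over weight columns of (1 - cosine similarity)."""
    num = (g_real * g_syn).sum(axis=0)
    den = np.linalg.norm(g_real, axis=0) * np.linalg.norm(g_syn, axis=0) + eps
    return float((1.0 - num / den).sum())

# Toy data (hypothetical): a 50-node "original" graph and a 5-node synthetic one.
rng = np.random.default_rng(0)
n, n_syn, d, c = 50, 5, 8, 3

adj = (rng.random((n, n)) < 0.1).astype(float)
adj = np.triu(adj, 1)
adj = adj + adj.T                      # undirected, no self-loops
x = rng.normal(size=(n, d))
y = np.eye(c)[rng.integers(0, c, size=n)]

adj_syn = np.zeros((n_syn, n_syn))     # start with no synthetic edges
x_syn = rng.normal(size=(n_syn, d))
y_syn = np.eye(c)[rng.integers(0, c, size=n_syn)]

w = rng.normal(size=(d, c))            # shared model weights for both graphs

g_real = grad_wrt_weights(adj, x, y, w)
g_syn = grad_wrt_weights(adj_syn, x_syn, y_syn, w)
loss = matching_distance(g_real, g_syn)
print(loss)
```

In a condensation loop, a quantity like `loss` would be minimized with respect to the synthetic features and structure (`x_syn`, `adj_syn`) at successive training steps, so that a model trained on the small graph follows a trajectory similar to one trained on the original.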

Paper Structure

This paper contains 26 sections, 8 equations, 4 figures, 15 tables, 1 algorithm.

Figures (4)

  • Figure 1: We study the graph condensation problem, which seeks to learn a small, synthetic graph, features and labels $\{{\bf A'}, {\bf X'},{\bf Y'}\}$ from a large, original dataset $\{{\bf A}, {\bf X},{\bf Y}\}$, which can be used to train GNN models that generalize comparably to the original. Shown: An illustration of our proposed GCond graph condensation approach's empirical performance, which exhibits 95.3% of original graph test performance with 99.9% data reduction.
  • Figure 2: Condensed graphs sometimes exhibit structure mimicking the original (a, b, d). Other times (c, e), learned features absorb graph properties and create less explicit graph reliance.
  • Figure 3: Test accuracy and sparsity under different values of $\delta$.
  • Figure 4: The t-SNE plots of node features in condensed graphs.