Two Trades is not Baffled: Condensing Graph via Crafting Rational Gradient Matching

Tianle Zhang, Yuchen Zhang, Kun Wang, Kai Wang, Beining Yang, Kaipeng Zhang, Wenqi Shao, Ping Liu, Joey Tianyi Zhou, Yang You

TL;DR

CTRL is a novel graph condensation method that offers an optimized starting point closer to the original dataset's feature distribution and a more refined strategy for gradient matching. It can effectively neutralize the impact of accumulated errors on the performance of condensed graphs.

Abstract

Training on large-scale graphs has achieved remarkable results in graph representation learning, but its cost and storage have raised growing concerns. As one of the most promising directions, graph condensation methods address these issues by employing gradient matching, aiming to condense the full graph into a more concise yet information-rich synthetic set. Though encouraging, these strategies primarily emphasize matching directions of the gradients, which leads to deviations in the training trajectories. Such deviations are further magnified by the differences between the condensation and evaluation phases, culminating in accumulated errors, which detrimentally affect the performance of the condensed graphs. In light of this, we propose a novel graph condensation method named CrafTing RationaL trajectory (CTRL), which offers an optimized starting point closer to the original dataset's feature distribution and a more refined strategy for gradient matching. Theoretically, CTRL can effectively neutralize the impact of accumulated errors on the performance of condensed graphs. We provide extensive experiments on various graph datasets and downstream tasks to support the effectiveness of CTRL. Code is released at https://github.com/NUS-HPC-AI-Lab/CTRL.
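
To make the abstract's point about direction-only matching concrete, the following minimal Python sketch compares two gradient discrepancy measures on a toy pair of gradient vectors. It is an illustration under stated assumptions, not the released CTRL implementation: the function names, the toy vectors, and the mixing weight `beta` are all hypothetical, and blending cosine and Euclidean distance is only one plausible way to account for gradient magnitude as well as direction.

```python
import math


def cosine_distance(g_real, g_syn):
    """Direction-only discrepancy: 0 when the two gradients are parallel."""
    dot = sum(a * b for a, b in zip(g_real, g_syn))
    norm_real = math.sqrt(sum(a * a for a in g_real))
    norm_syn = math.sqrt(sum(b * b for b in g_syn))
    return 1.0 - dot / (norm_real * norm_syn)


def euclidean_distance(g_real, g_syn):
    """Discrepancy that is also sensitive to the gradient magnitude."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(g_real, g_syn)))


def combined_matching_loss(g_real, g_syn, beta=0.5):
    """Hypothetical blend of both measures; beta is an assumed weight."""
    return (beta * cosine_distance(g_real, g_syn)
            + (1.0 - beta) * euclidean_distance(g_real, g_syn))


# Toy gradients (assumed values): same direction, but the synthetic
# gradient is ten times smaller than the real one.
g_real = [3.0, 4.0]
g_syn = [0.3, 0.4]

print("cosine distance:   ", cosine_distance(g_real, g_syn))
print("euclidean distance:", euclidean_distance(g_real, g_syn))
print("combined loss:     ", combined_matching_loss(g_real, g_syn))
```

In this toy case the cosine distance is essentially zero even though the two gradients differ tenfold in size, so a direction-only objective would treat them as matched. The Euclidean term still reports the gap, which is the kind of deviation the abstract says can accumulate along the training trajectory.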

Paper Structure (23 sections, 13 equations, 8 figures, 15 tables)

This paper contains 23 sections, 13 equations, 8 figures, 15 tables.

Figures (8)

  • Figure 1: (a) and (b) illustrate the gradient difference during the optimization of the condensed graphs using CTRL and the basic gradient matching methods, measured with three gradient discrepancy metrics; "step" refers to the number of times the gradient matching loss is calculated. (c) shows the performance of the condensed graphs during the optimization process for various methods.
  • Figure 2: Sampling example
  • Figure 3: Illustration of gradient matching.
  • Figure 4: L2 norm of the matching error.
  • Figure 5: (a) and (b) show the improvement by employing the initialization method of CTRL. (c) indicates the sensitivity analysis experiments conducted on Cora, Citeseer, and Ogbn-arxiv, with condensation ratios of 1.3%, 0.9%, and 0.25%, respectively.
  • ...and 3 more figures

Theorems & Definitions (2)

  • proof : Proof of Theorem 2.4
  • proof : Proof of Corollary 2.5