Navigating Complexity: Toward Lossless Graph Condensation via Expanding Window Matching
Yuchen Zhang, Tianle Zhang, Kai Wang, Ziyao Guo, Yuxuan Liang, Xavier Bresson, Wei Jin, Yang You
TL;DR
This paper addresses the challenge of lossless graph condensation. It identifies biases in prior trajectory-based supervision and introduces GEOM, which combines expert trajectories trained with curriculum learning and expanding window matching to transfer diverse supervision signals into a condensed graph. A knowledge embedding extractor draws further information from the expert trajectories, and theoretical analysis justifies the design. Extensive experiments across five datasets and multiple GNN architectures show GEOM achieving lossless condensation at or below 5% and strong cross-architecture generalization. The work substantially reduces the computational burden of training GNNs on large graphs and offers a practical path toward scalable graph learning, though it relies on pre-computed expert trajectories, which leaves efficiency improvements to future work.
Abstract
Graph condensation aims to reduce the size of a large-scale graph dataset by synthesizing a compact counterpart without sacrificing the performance of Graph Neural Networks (GNNs) trained on it, which offers a way to reduce the computational cost of training GNNs. Nevertheless, existing methods often fall short of accurately replicating the original graph for certain datasets, thereby failing to achieve the objective of lossless condensation. To understand this phenomenon, we investigate the potential reasons and reveal that the previous state-of-the-art trajectory matching method provides biased and restricted supervision signals from the original graph when optimizing the condensed one. This significantly limits both the scale and efficacy of the condensed graph. In this paper, we make the first attempt toward lossless graph condensation by incorporating the previously neglected supervision signals. Specifically, we employ a curriculum learning strategy to train expert trajectories with more diverse supervision signals from the original graph, and then effectively transfer the information into the condensed graph with expanding window matching. Moreover, we design a loss function to further extract knowledge from the expert trajectories. Theoretical analysis justifies the design of our method, and extensive experiments verify its superiority across different datasets. Code is released at https://github.com/NUS-HPC-AI-Lab/GEOM.
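
As a rough illustration of the expanding window idea, the sketch below samples the starting epoch of an expert trajectory segment from a window whose upper bound grows as condensation proceeds, and scores a student's parameters with a normalized squared-distance loss of the kind commonly used in trajectory matching. The function names, the linear expansion schedule, and the exact loss form are assumptions made for illustration; they are not taken from the released GEOM code.

```python
import random


def window_upper_bound(step, initial_upper, max_upper, expand_every):
    """Largest expert epoch that may be sampled at this condensation step.

    Assumption: the window grows by one epoch every `expand_every` steps
    and is capped at `max_upper`.
    """
    return min(max_upper, initial_upper + step // expand_every)


def sample_start_epoch(step, initial_upper, max_upper, expand_every, rng):
    """Sample a starting epoch uniformly from the current window [0, upper]."""
    upper = window_upper_bound(step, initial_upper, max_upper, expand_every)
    return rng.randint(0, upper)  # randint includes both endpoints


def matching_loss(student, expert_start, expert_target):
    """Squared distance from the student to the expert target, normalized by
    the squared distance the expert itself travelled over the segment."""
    num = sum((s - t) ** 2 for s, t in zip(student, expert_target))
    den = sum((a - t) ** 2 for a, t in zip(expert_start, expert_target))
    return num / den


if __name__ == "__main__":
    rng = random.Random(0)
    for step in (0, 100, 1000):
        upper = window_upper_bound(step, 5, 20, 10)
        start = sample_start_epoch(step, 5, 20, 10, rng)
        print(f"step={step} window=[0, {upper}] sampled start epoch={start}")
```

Early in condensation the window covers only the first few expert epochs, where the supervision is easiest to match; later steps can also draw on later segments of the trajectory, which is one way to read "more diverse supervision signals" in the abstract.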
