Spectral Greedy Coresets for Graph Neural Networks

Mucong Ding, Yinhan He, Jundong Li, Furong Huang

TL;DR

This paper tackles training efficiency for Graph Neural Networks on large-scale graphs by introducing Spectral Greedy Graph Coresets (SGGC), which select ego-graphs around center nodes in the graph spectral domain to approximate full-graph training loss. SGGC uses a two-stage greedy framework: a coarse spreading step via GIGA to approximate the spectral embedding and a refinement step via submodular maximization (CRAIG) to diversify topology; diffusion ego-graphs and PCA compression enable scalable coreset construction without pre-training. The authors provide a theoretical bound on the node-classification loss under a bounded-spectral-variance assumption and demonstrate strong empirical performance on ten datasets, including very large graphs, often outperforming model-based coresets and graph condensation while running faster and generalizing across GNN architectures. This work enables scalable, architecture-agnostic data condensation for GNNs, reducing training time and memory with minimal loss in accuracy, though it assumes smooth spectral embeddings and has $O(c n_t n)$ time complexity in the coreset construction.
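
The refinement stage described above relies on greedy submodular maximization. The following is a minimal sketch of that idea only, assuming a facility-location objective over generic embedding vectors (the kind of objective CRAIG is usually described with). The function name `greedy_facility_location`, the distance-based similarity, and the toy data are illustrative assumptions, not the authors' implementation, and the GIGA-based coarse stage is not shown.

```python
import numpy as np

def greedy_facility_location(embeddings, budget):
    """Greedily pick `budget` rows that maximize a facility-location objective.

    Hypothetical helper for illustration; not the SGGC code.
    """
    # Pairwise Euclidean distances, turned into non-negative similarities.
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.linalg.norm(diffs, axis=-1)
    sim = dist.max() - dist

    n = sim.shape[0]
    chosen = np.zeros(n, dtype=bool)
    selected = []
    best_cover = np.zeros(n)  # best similarity of each point to the selected set

    for _ in range(min(budget, n)):
        # Marginal gain of candidate j: sum_i max(sim[i, j] - best_cover[i], 0).
        gains = np.maximum(sim - best_cover[:, None], 0.0).sum(axis=0)
        gains[chosen] = -np.inf  # never pick the same center twice
        j = int(np.argmax(gains))
        selected.append(j)
        chosen[j] = True
        best_cover = np.maximum(best_cover, sim[:, j])
    return selected

# Toy data: two well-separated clusters of three points each.
points = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                   [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]])
centers = greedy_facility_location(points, budget=2)
```

With a budget of two, the greedy rule picks one center from each cluster, which is the diversifying behavior the refinement step is meant to provide.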

Abstract

The ubiquity of large-scale graphs in node-classification tasks significantly hinders the real-world applications of Graph Neural Networks (GNNs). Node sampling, graph coarsening, and dataset condensation are effective strategies for enhancing data efficiency. However, owing to the interdependence of graph nodes, coreset selection, which selects subsets of the data examples, has not been successfully applied to speed up GNN training on large graphs, warranting special treatment. This paper studies graph coresets for GNNs and avoids the interdependence issue by selecting ego-graphs (i.e., neighborhood subgraphs around a node) based on their spectral embeddings. We decompose the coreset selection problem for GNNs into two phases: a coarse selection of widely spread ego graphs and a refined selection to diversify their topologies. We design a greedy algorithm that approximately optimizes both objectives. Our spectral greedy graph coreset (SGGC) scales to graphs with millions of nodes, obviates the need for model pre-training, and applies to low-homophily graphs. Extensive experiments on ten datasets demonstrate that SGGC outperforms other coreset methods by a wide margin, generalizes well across GNN architectures, and is much faster than graph condensation.
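
To make the notion of an ego-graph concrete, here is a small sketch that extracts the neighborhood subgraph within a fixed number of hops of a center node using breadth-first search. The adjacency-dictionary format and the name `ego_graph` are assumptions for illustration; the paper's diffusion ego-graphs are a different, scalable variant that this sketch does not reproduce.

```python
from collections import deque

def ego_graph(adj, center, hops):
    """Return the nodes and induced undirected edges within `hops` of `center`.

    `adj` maps each node to a list of its neighbors (hypothetical input format).
    """
    depth = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        if depth[u] == hops:
            continue  # do not expand beyond the hop limit
        for v in adj[u]:
            if v not in depth:
                depth[v] = depth[u] + 1
                queue.append(v)
    nodes = set(depth)
    # Keep each undirected edge once, as an ordered pair (u, v) with u < v.
    edges = {(u, v) for u in nodes for v in adj[u] if v in nodes and u < v}
    return nodes, edges

# Path graph 0-1-2-3-4; the 1-hop ego-graph around node 2 is the path 1-2-3.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
nodes, edges = ego_graph(adj, center=2, hops=1)
```

Because each selected ego-graph carries its own neighborhood, a model can be trained on it without reaching back into the rest of the graph, which is how selecting ego-graphs sidesteps the node-interdependence issue.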

Paper Structure

This paper contains 23 sections, 7 theorems, 18 equations, 6 figures, 17 tables, 1 algorithm.

Key Result

Theorem 1

Under all assumptions of Proposition 5 (Smoothness of Spectral Embeddings), we have $\|\sum_{i\in [n_{t}]}w^\mathtt{a}_i\cdot \widetilde{Z}_i - \widetilde{Z}\|_{F}\leq M\cdot \|P\mathbf{w}^\mathtt{a}-\frac{1}{n}\mathbbm{1}\|$ for some constant $M>0$.

Figures (6)

  • Figure 1: Overview of spectral greedy graph coresets (SGGC) for efficient GNN training. SGGC processes a large graph to iteratively select ego-graphs. The assembled coreset graph facilitates fast GNN training while maintaining test performance on the original graph.
  • Figure 2: Relative standard deviation of spectral embeddings on ego-graphs $\boldsymbol{Z}_i$ across all the nodes vs. the ego-graph size $p$; see the bounded-spectral-variance assumption.
  • Figure 3: Conceptual diagram showing the theoretical analysis formulating the spectral greedy graph coresets (SGGC).
  • Figure 4: Spectral response of 2-layer GCNs on Cora. The spectral response corresponding to eigenvalue $\lambda_i$ is defined as $\|[U^{\mkern-1.5mu\mathsf{T}} f_\theta(A,X)]_{i,:}\|/\|[U^{\mkern-1.5mu\mathsf{T}} X]_{i,:}\|$.
  • Figure 5: Test accuracy versus the selected data size of selecting nodes and diffusion ego-graphs with/without PCA-based compression of node attributes.
  • ...and 1 more figure
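
The spectral response defined in the Figure 4 caption can be computed directly from its formula. The sketch below is an illustration under stated assumptions: it uses a toy 6-node cycle graph with random features and random weights rather than Cora, takes $U$ to be the eigenvectors of the GCN propagation matrix $D^{-1/2}(A+I)D^{-1/2}$, and uses hypothetical helper names (`normalized_adjacency`, `two_layer_gcn`, `spectral_response`). Which matrix the paper eigendecomposes is not stated in this excerpt.

```python
import numpy as np

def normalized_adjacency(A):
    """GCN propagation matrix D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def two_layer_gcn(S, X, W1, W2):
    """Two propagation steps with a ReLU in between (random weights here)."""
    return S @ np.maximum(S @ X @ W1, 0.0) @ W2

def spectral_response(U, X, out):
    """Row-norm ratio ||[U^T out]_i|| / ||[U^T X]_i|| for each eigenvector."""
    num = np.linalg.norm(U.T @ out, axis=1)
    den = np.linalg.norm(U.T @ X, axis=1)
    return num / den

rng = np.random.default_rng(0)

# Toy 6-node cycle graph (an assumption for illustration, not Cora).
n = 6
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0

S = normalized_adjacency(A)
eigvals, U = np.linalg.eigh(S)  # S is symmetric, so eigh applies

X = rng.normal(size=(n, 4))
W1 = rng.normal(size=(4, 8))
W2 = rng.normal(size=(8, 3))

response = spectral_response(U, X, two_layer_gcn(S, X, W1, W2))

# Sanity check: a purely linear filter S @ X has response |eigenvalue|.
linear_response = spectral_response(U, X, S @ X)
```

For the linear filter, row $i$ of $U^{\mathsf{T}} S X$ equals the $i$-th eigenvalue times row $i$ of $U^{\mathsf{T}} X$, so the response reduces to the absolute eigenvalue. This gives a quick way to check the computation before applying it to a nonlinear model.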

Theorems & Definitions (13)

  • Theorem 1: Upper-bound on the Error Approximating Node-wise Average
  • proof
  • Theorem 2: Error-Bound on Node Classification Loss
  • proof
  • Lemma 3: Smoothness of the Spectral Representation of Ego-graph's Input Features
  • proof
  • Lemma 4: Lipschitzness of GCN in Spectral Domain
  • proof
  • Proposition 5: Smoothness of Spectral Embeddings
  • Theorem 6: Upper-bound on the Error Approximating Node-wise Average
  • ...and 3 more