SpanGNN: Towards Memory-Efficient Graph Neural Networks via Spanning Subgraph Training

Xizhi Gu; Hongzheng Li; Shihong Gao; Xinyan Zhang; Lei Chen; Yingxia Shao

TL;DR

SpanGNN tackles the memory bottleneck of full-graph GNN training by training over a sequence of spanning subgraphs and incrementally adding edges under an upper memory bound $\alpha_{up}$. It introduces fast quality-aware edge selection with variance-minimized and gradient noise-reduced sampling, plus a two-step sampling scheme to scale to large graphs, aligning training with curriculum-learning principles. Empirical results on large datasets show substantial peak-memory reductions (often $>40\%$) with accuracy close to full-graph training and competitive performance versus mini-batch methods. This approach enables scalable, high-accuracy GNN training on very large graphs where traditional full-graph or mini-batch methods struggle.

Abstract

Graph Neural Networks (GNNs) have superior capability in learning from graph data. Full-graph GNN training generally achieves high accuracy; however, it suffers from large peak memory usage and runs into the Out-of-Memory problem when handling large graphs. A popular solution to this memory problem is mini-batch GNN training, but mini-batch training increases the training variance and sacrifices model accuracy. In this paper, we propose a new memory-efficient GNN training method based on spanning subgraphs, called SpanGNN. SpanGNN trains GNN models over a sequence of spanning subgraphs, which are constructed starting from an empty structure. To avoid excessive peak memory consumption, SpanGNN selects a set of edges from the original graph to incrementally update the spanning subgraph between epochs. To preserve model accuracy, we introduce two edge sampling strategies (variance-reduced and noise-reduced) that help SpanGNN select high-quality edges for GNN learning. We conduct experiments with SpanGNN on widely used datasets, demonstrating SpanGNN's advantages in model performance and low peak memory usage.
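As a rough illustration of the training schedule the abstract describes, the following Python sketch grows an edge set from empty and adds a sampled batch of unseen edges before each epoch. It is not the authors' implementation: the function name `grow_spanning_subgraph`, the `max_edge_ratio` cap (standing in for the memory bound), the even per-epoch split of the edge budget, and the optional `weights` (standing in for variance-reduced or noise-reduced edge scores) are all assumptions made for illustration.

```python
import random


def grow_spanning_subgraph(edges, num_epochs, max_edge_ratio, weights=None, seed=0):
    """Yield the edge list used at each epoch (hypothetical helper).

    All nodes are assumed to be kept; only the edge set grows. The edge
    set starts empty and never exceeds max_edge_ratio * len(edges).
    `weights`, if given, must be positive per-edge sampling scores.
    """
    rng = random.Random(seed)
    budget = int(max_edge_ratio * len(edges))      # assumed proxy for the memory bound
    per_epoch = -(-budget // num_epochs)           # ceiling division
    remaining = list(range(len(edges)))            # indices of edges not yet selected
    selected = []                                  # indices of edges in the subgraph

    for _ in range(num_epochs):
        room = budget - len(selected)
        k = min(per_epoch, room, len(remaining))
        for _ in range(k):
            # Weighted sampling without replacement, one edge at a time.
            w = [weights[i] for i in remaining] if weights else None
            pos = rng.choices(range(len(remaining)), weights=w, k=1)[0]
            selected.append(remaining.pop(pos))
        # A real training loop would run one GNN epoch on this subgraph here.
        yield [edges[i] for i in selected]
```

Iterating over `grow_spanning_subgraph(edges, num_epochs=4, max_edge_ratio=0.5)` yields one edge list per epoch; each list contains the previous one, so the subgraph only ever gains edges until the budget is reached. Passing uniform weights (or none) reduces this to plain random edge sampling.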

Paper Structure (25 sections, 2 theorems, 18 equations, 6 figures, 2 tables, 2 algorithms)

Introduction
Preliminary
Graph Neural Networks
Spanning Subgraph GNN Training
SpanGNN: Memory-Efficient Full-graph GNN Learning
Fast Quality-aware Edge Selection
Variance-minimized Sampling Strategy
Gradient Noise-reduced Sampling Strategy
Two-step Edge Sampling Method
Connection to Curriculum Learning
Experimental studies
Experimental Setups
Performance of SpanGNN
Comparison of model accuracy.
Comparison of peak memory usage.
...and 10 more sections

Key Result

Theorem

Upper bound of the expected gradient noise. Given that the squared Frobenius norms $\left\| P\right\|^{2}_{F}$, $\left\| H^{(l)}\right\|^{2}_{F}$, $\left\| \delta^{(l)}\right\|^{2}_{F}$ are bounded by constants $B$, $C$, $D$ and the L2-norm $\left\| H^{(l)}W^{(l)} \right\|$ is bounded by constant $

Figures (6)

Figure 1: The framework of SpanGNN.
Figure 2: The performance of the training methods on GCN (top) and SAGE (bottom) with various edge ratios.
Figure 3: Peak memory usage on GCN (top) and SAGE (bottom).
Figure 4: Ablation studies on GCN (top) and SAGE (bottom).
Figure 5: The average time cost of generating spanning subgraphs.
...and 1 more figures

Theorems & Definitions (4)

Theorem
Proof
Theorem
Proof
