GraphSnapShot: Caching Local Structure for Fast Graph Learning
Dong Liu, Roger Waleffe, Meng Jiang, Shivaram Venkataraman
TL;DR
GraphSnapShot tackles the challenge of training on large, dynamic graphs by maintaining a centralized cache of local $k$-hop subgraphs and employing a hybrid static-dynamic sampling strategy. The framework combines static preprocessing with dynamic updates, using a multi-level cache hierarchy and a mix of fully cache refresh (FCR), on-the-fly (OTF), and shared caching modes to minimize disk I/O while preserving topological accuracy. Empirical results on ogbn benchmarks show substantial reductions in training time and GPU memory usage compared to baselines, underlining the method's scalability and practicality for dynamic graph learning. Overall, GraphSnapShot offers a disk-cache-memory approach that balances sampling quality against computational cost, enabling efficient learning on evolving graphs in real-world applications.
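To illustrate the hybrid static-dynamic idea, below is a minimal pure-Python sketch of a $k$-hop snapshot cache: seed nodes are pre-sampled once (static phase), lookups are served from the cache, and a fraction of entries is periodically re-sampled (dynamic phase), loosely mirroring the OTF partial-refresh mode; setting the refresh fraction to 1.0 degenerates to a full refresh (FCR-like). This is not the authors' implementation, and all class, method, and parameter names here are hypothetical.

```python
import random

class KHopSnapshotCache:
    """Illustrative hybrid static-dynamic cache of sampled k-hop neighborhoods."""

    def __init__(self, adj, k=2, fanout=5, refresh_frac=0.1, seed=0):
        self.adj = adj                # node -> list of neighbor nodes
        self.k = k                    # number of hops to cache
        self.fanout = fanout          # neighbors sampled per node per hop
        self.refresh_frac = refresh_frac
        self.rng = random.Random(seed)
        self.cache = {}               # node -> sampled k-hop edge list

    def _sample_khop(self, node):
        """Sample a fanout-bounded k-hop neighborhood around `node`."""
        edges, frontier = [], [node]
        for _ in range(self.k):
            nxt = []
            for u in frontier:
                nbrs = self.adj.get(u, [])
                picks = self.rng.sample(nbrs, min(self.fanout, len(nbrs)))
                edges.extend((u, v) for v in picks)
                nxt.extend(picks)
            frontier = nxt
        return edges

    def prefill(self, seeds):
        """Static preprocessing: snapshot each seed's local structure once."""
        for s in seeds:
            self.cache[s] = self._sample_khop(s)

    def get(self, node):
        """Serve from the cache; sample on a miss (on-the-fly)."""
        if node not in self.cache:
            self.cache[node] = self._sample_khop(node)
        return self.cache[node]

    def refresh(self):
        """Dynamic update: re-sample a random fraction of cached entries."""
        if not self.cache:
            return
        n = max(1, int(self.refresh_frac * len(self.cache)))
        for s in self.rng.sample(list(self.cache), n):
            self.cache[s] = self._sample_khop(s)
```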
Abstract
In our recent research, we have developed a framework called GraphSnapShot, which has proven to be a useful tool for accelerating graph learning. GraphSnapShot is a framework for fast caching, storage, retrieval, and computation in graph learning. It can quickly store and update the local topology of a graph, allowing us to track patterns in the structure of graph networks, much like taking snapshots of the graph. In experiments, GraphSnapShot demonstrates its efficiency: it achieves up to 30% training acceleration and 73% memory reduction for lossless graph ML training compared to current baselines such as DGL. This technique is particularly useful for large dynamic graph learning tasks, such as social media analysis and recommendation systems, that must process complex relationships between entities. The code for GraphSnapShot is publicly available at https://github.com/NoakLiu/GraphSnapShot.
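To make the store-and-update workflow concrete, a hypothetical training loop built on the sketch above might prefill the cache once, serve neighborhoods from it each epoch, and refresh a fraction of entries between epochs; the graph, seed nodes, and hyperparameters below are made up for illustration.

```python
# Toy adjacency list standing in for a large on-disk graph.
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}

cache = KHopSnapshotCache(adj, k=2, fanout=2, refresh_frac=0.5)
cache.prefill(seeds=[0, 1])           # static phase: snapshot seed topology

for epoch in range(3):
    # Mostly cache hits; node 2 is sampled on the fly on first access.
    blocks = [cache.get(s) for s in (0, 1, 2)]
    cache.refresh()                   # dynamic phase: keep snapshots fresh
```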
