GNNBENCH: Fair and Productive Benchmarking for Single-GPU GNN System
Yidong Gong, Pradeep Kumar
TL;DR
GNNBench addresses the lack of a standardized benchmark for single-GPU GNN systems with a framework-agnostic benchmarking platform that provides stable System APIs and zero-copy tensor exchange via a Producer-Only DLPack protocol. It supports plug-and-play integration of diverse GNN kernels, generates integration code automatically through a DSL, and evaluates multiple existing systems to reveal accuracy pitfalls, framework overhead, and memory behavior. The experiments show that many prior systems have accuracy and performance issues that benchmarking with GNNBench either mitigates or clarifies, that framework overhead can dominate runtime on small datasets, and that mid-size datasets reveal true kernel performance. The work demonstrates the platform’s practicality and versatility across PyTorch and TensorFlow, offering a fair baseline for future GNN innovations and guiding decisions about kernel fusion versus native implementations.
Abstract
We hypothesize that the absence of a standardized benchmark has allowed several fundamental pitfalls in GNN System design and evaluation to go overlooked by the community. In this work, we propose GNNBench, a plug-and-play benchmarking platform focused on system innovation. GNNBench presents a new protocol for exchanging tensor data that is otherwise captive to a deep learning framework, supports custom classes in System APIs, and allows the same system module to be integrated automatically into many deep learning frameworks, such as PyTorch and TensorFlow. To demonstrate the importance of such a benchmarking platform, we integrated several GNN systems. Our results show that integration with GNNBench helped us identify several measurement issues that deserve attention from the community.
