GDR-HGNN: A Heterogeneous Graph Neural Networks Accelerator Frontend with Graph Decoupling and Recoupling
Runzhen Xue, Mingyu Yan, Dengke Han, Yihan Teng, Zhimin Tang, Xiaochun Ye, Dongrui Fan
TL;DR
This work addresses buffer thrashing in heterogeneous graph neural network (HGNN) accelerators. Based on an analysis of heterogeneous graph (HetG) topology, it introduces GDR-HGNN, a hardware frontend that dynamically restructures semantic graphs through graph decoupling and recoupling to improve data locality. The frontend integrates with existing HGNN accelerators, notably HiHGNN, and uses a Decoupler/Recoupler architecture to produce locality-friendly subgraphs for the accelerator to process. Evaluation across three models and datasets reports large speedups (68.8× vs. a T4 GPU; 14.6× on average vs. an A100 GPU) and DRAM-access reductions of up to 91.3%, with modest hardware overhead. The approach improves HGNN performance by rethinking data layout and memory traffic rather than relying solely on compute optimizations.
Abstract
Heterogeneous Graph Neural Networks (HGNNs) have broadened the applicability of graph representation learning to heterogeneous graphs. However, the irregular memory access pattern of HGNNs causes buffer thrashing in HGNN accelerators. In this work, we identify an opportunity to address buffer thrashing in HGNN acceleration through an analysis of the topology of heterogeneous graphs. To harvest this opportunity, we propose a graph restructuring method and map it into a hardware frontend named GDR-HGNN. GDR-HGNN dynamically restructures the graph on the fly to enhance data locality for HGNN accelerators. Experimental results demonstrate that, with the assistance of GDR-HGNN, a leading HGNN accelerator achieves average speedups of 14.6× and 1.78× over the state-of-the-art software framework running on an A100 GPU and over the same accelerator without GDR-HGNN, respectively.
