Survey on Characterizing and Understanding GNNs from a Computer Architecture Perspective

Meng Wu, Mingyu Yan, Wenming Li, Xiaochun Ye, Dongrui Fan, Yuan Xie

TL;DR

This survey consolidates efforts to characterize GNNs from a computer-architecture perspective, introducing a triple-level taxonomy that classifies work by GNN type (SHoGNN, DHoGNN, SHeGNN, DHeGNN) and by six architectural dimensions (Time, Parallelism, Compute, Memory, Communication, Execution). It then systematically aggregates findings across the four GNN families, detailing bottlenecks such as CPU-GPU data movement, irregular memory accesses, metapath-based computations, and temporal dynamics, while highlighting parallelism opportunities and execution bounds. The work also discusses a case study on hardware acceleration (e.g., HyGCN) and outlines practical challenges like graph heterogeneity, partitioning, and real-time processing, guiding future hardware/software optimizations. Overall, the paper provides a comprehensive framework for understanding and optimizing GNN performance on parallel and distributed systems, with clear directions for scalability, cross-domain integration, and automation tooling.
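As a rough illustration of how the two named axes of the taxonomy combine, the sketch below builds the four family abbreviations from a static/dynamic flag and a homogeneous/heterogeneous flag, then crosses them with the six dimensions. The class and function names (`Temporal`, `Schema`, `family_name`, `taxonomy_grid`) are hypothetical and not taken from the paper, and reading DHeGNN as "dynamic heterogeneous" is an assumption inferred from the naming pattern of the other three families.

```python
from enum import Enum
from itertools import product


class Temporal(Enum):
    STATIC = "S"
    DYNAMIC = "D"


class Schema(Enum):
    HOMOGENEOUS = "Ho"
    HETEROGENEOUS = "He"


# The six architectural dimensions named in the summary.
DIMENSIONS = ("Time", "Parallelism", "Compute", "Memory", "Communication", "Execution")


def family_name(temporal: Temporal, schema: Schema) -> str:
    """Build a family abbreviation, e.g. STATIC + HOMOGENEOUS -> 'SHoGNN'."""
    return f"{temporal.value}{schema.value}GNN"


def taxonomy_grid() -> dict:
    """Return an empty (family, dimension) grid.

    Each cell is a list that could collect characterization findings;
    this layout is an illustrative assumption, not the survey's own format.
    """
    families = [family_name(t, s) for t, s in product(Temporal, Schema)]
    return {(family, dim): [] for family in families for dim in DIMENSIONS}


if __name__ == "__main__":
    grid = taxonomy_grid()
    print(sorted({family for family, _ in grid}))
    print(len(grid))  # 4 families x 6 dimensions = 24 cells
```

Keying the grid by (family, dimension) pairs makes it straightforward to ask which cells have few entries, which is one way to spot under-characterized combinations.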

Abstract

Characterizing and understanding graph neural networks (GNNs) is essential for identifying performance bottlenecks and facilitating their deployment in parallel and distributed systems. Despite substantial work in this area, a comprehensive survey on characterizing and understanding GNNs from a computer architecture perspective is lacking. This work presents a comprehensive survey, proposing a triple-level classification method to categorize, summarize, and compare existing efforts, particularly focusing on their implications for parallel architectures and distributed systems. We identify promising future directions for GNN characterization that align with the challenges of optimizing hardware and software in parallel and distributed systems. Our survey aims to help scholars systematically understand GNN performance bottlenecks and execution patterns from a computer architecture perspective, thereby contributing to the development of more efficient GNN implementations across diverse parallel architectures and distributed systems.

Paper Structure (37 sections, 6 figures, 2 tables)

Figures (6)

  • Figure 1: Graphs and GNNs: (a) Static homogeneous graph and SHoGNN, (b) dynamic homogeneous graph and DHoGNN, and (c) static heterogeneous graph and SHeGNN.
  • Figure 2: Workflow of mini-batch GNN training.
  • Figure 3: Methodology of taxonomy.
  • Figure 4: Time complexity quadrants of typical SHoGNNs [empirical_analysis_gnn_runtime_gpus].
  • Figure 5: Parallelism examples in SHoGNNs: (a) Inter-mini-batch parallelism, (b) inter-subgraph parallelism, (c) inter-layer parallelism, (d) inter-edge parallelism, and (e) intra-vertex parallelism (different colors represent processing by different workers).
  • ...and 1 more figure