Efficient Distributed Exact Subgraph Matching via GNN-PE: Load Balancing, Cache Optimization, and Query Plan Ranking
Yu Wang, Hui Wang, Jiake Ge, Xin Wang
TL;DR
This work tackles scalable exact subgraph matching on large graphs by extending the single-machine GNN-PE framework to a distributed setting. It combines three core innovations: lightweight correlation-aware load balancing with CRC32-based index consistency checks, multi-GPU collaborative caching driven by online incremental learning, and PE-score-driven query plan ranking that minimizes cross-shard data transmission. The system uses METIS-based ultra-fine-grained partitioning and lightweight metadata management to achieve minimum edge cuts, balanced load, and non-interruptible queries, with migration overhead kept to $<50\text{ms}$. Experiments on real and synthetic graphs show up to 1–2 orders of magnitude lower query latency and 2–3× higher throughput, supporting the approach's practicality for large-scale distributed subgraph matching.
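As a rough illustration of what a CRC32-based index consistency check could look like, the sketch below compares checksums of a serialized index shard before and after migration. It is not the authors' implementation: the function names `index_checksum` and `verify_migration` and the assumption that a shard is available as raw bytes are hypothetical, and only the standard-library `zlib.crc32` call is real.

```python
import zlib


def index_checksum(index_bytes: bytes) -> int:
    """Return the CRC32 of a serialized index shard as an unsigned 32-bit int.

    Assumption: the index shard can be serialized to a byte string.
    """
    return zlib.crc32(index_bytes) & 0xFFFFFFFF


def verify_migration(source_bytes: bytes, received_bytes: bytes) -> bool:
    """Hypothetical check: accept a migrated shard only if checksums match."""
    return index_checksum(source_bytes) == index_checksum(received_bytes)


if __name__ == "__main__":
    shard = b"serialized path-embedding index"
    print(verify_migration(shard, shard))            # identical copies match
    print(verify_migration(shard, shard + b"\x00"))  # a modified copy is compared too
```

CRC32 is cheap to compute and catches accidental corruption, but it is not a cryptographic hash, so a check like this guards only against transfer errors, not against deliberate tampering.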
Abstract
Exact subgraph matching on large-scale graphs remains a challenging problem due to high computational complexity and distributed system constraints. Existing GNN-based path embedding (GNN-PE) frameworks achieve efficient exact matching on single machines but lack scalability and optimization for distributed environments. To address this gap, we propose three core innovations to extend GNN-PE to distributed systems: (1) a lightweight dynamic correlation-aware load balancing and hot migration mechanism that fuses multi-dimensional metrics (CPU, communication, memory) and guarantees index consistency; (2) an online incremental learning-based multi-GPU collaborative dynamic caching strategy with heterogeneous GPU adaptation and graph-structure-aware replacement; (3) a query plan ranking method driven by dominance embedding pruning potential (PE-score) that optimizes execution order. Through METIS partitioning, parallel offline preprocessing, and lightweight metadata management, our approach achieves "minimum edge cut + load balancing + non-interruptible queries" in distributed scenarios (tens of machines), significantly improving the efficiency and stability of distributed subgraph matching.
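Two of the ideas above, fusing CPU, communication, and memory metrics into one load signal and ordering query plans by PE-score, can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the method from the paper: the weighted-sum fusion, the weights, the imbalance threshold, and the names `ShardMetrics`, `load_score`, `pick_migration`, and `rank_plans` are all hypothetical. It also assumes that a higher PE-score means more pruning potential.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class ShardMetrics:
    cpu: float   # CPU utilization, assumed normalized to [0, 1]
    comm: float  # communication volume, assumed normalized to [0, 1]
    mem: float   # memory utilization, assumed normalized to [0, 1]


def load_score(m: ShardMetrics, w_cpu: float = 0.4,
               w_comm: float = 0.4, w_mem: float = 0.2) -> float:
    """Fuse the three metrics with a weighted sum (weights are illustrative)."""
    return w_cpu * m.cpu + w_comm * m.comm + w_mem * m.mem


def pick_migration(shards: Dict[str, ShardMetrics],
                   threshold: float = 0.2) -> Optional[Tuple[str, str]]:
    """Return (hottest, coldest) machine if their score gap exceeds threshold."""
    scores = {name: load_score(m) for name, m in shards.items()}
    hot = max(scores, key=scores.get)
    cold = min(scores, key=scores.get)
    if scores[hot] - scores[cold] > threshold:
        return hot, cold
    return None  # load is balanced enough; no migration suggested


def rank_plans(pe_scores: Dict[str, float]) -> List[str]:
    """Order plan ids by descending PE-score (assumed: higher prunes more)."""
    return sorted(pe_scores, key=pe_scores.get, reverse=True)


if __name__ == "__main__":
    cluster = {
        "m0": ShardMetrics(cpu=0.9, comm=0.8, mem=0.7),
        "m1": ShardMetrics(cpu=0.1, comm=0.2, mem=0.3),
    }
    print(pick_migration(cluster))
    print(rank_plans({"p1": 0.3, "p2": 0.9, "p3": 0.5}))
```

A weighted sum is only one way to combine the metrics; the abstract does not say how the fusion or the correlation-awareness is actually computed, so the weights here should be read as placeholders.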
