GraphMatch: Subgraph Query Processing on FPGAs

Jonas Dann, Tobias Götz, Daniel Ritter, Jana Giceva, Holger Fröning

TL;DR

GraphMatch presents the first general-purpose FPGA accelerator for subgraph query processing based on worst-case optimal joins (WCOJ). By introducing the AllCompare intersector and adapting Leapfrog principles to FPGA hardware, it delivers high-throughput, parallel set intersections, the operation that bottlenecks CPU-based subgraph systems. The architecture supports dynamic queries, both homomorphism and isomorphism semantics, and scales across multiple FPGA instances with stride-based load balancing and caching. Empirical results show speedups over GraphFlow ($2.68\times$) and RapidMatch ($5.16\times$) on diverse graphs, underscoring the practical impact of FPGA specialization for graph pattern matching. The work also identifies remaining challenges with degree skew and load balancing, pointing to future enhancements in caching, work-stealing, and query-plan optimization.

Abstract

Efficiently finding subgraph embeddings in large graphs is crucial for many application areas like biology and social network analysis. Set intersections are the predominant and most challenging aspect of current join-based subgraph query processing systems for CPUs. Previous work has shown the viability of utilizing FPGAs for acceleration of graph and join processing. In this work, we propose GraphMatch, the first general-purpose stand-alone subgraph query processing accelerator based on worst-case optimal joins (WCOJ) that is designed entirely for modern field-programmable gate array (FPGA) hardware. For efficient processing of various graph data sets and query graph patterns, it leverages a novel set intersection approach, called AllCompare, tailor-made for FPGAs. We show that this set intersection approach efficiently solves multi-set intersections in subgraph query processing, superior to CPU-based approaches. Overall, GraphMatch achieves speedups of over 2.68x and 5.16x compared to the state-of-the-art systems GraphFlow and RapidMatch, respectively.
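
To make concrete why set intersections dominate join-based subgraph matching, the following is a minimal software sketch, not the GraphMatch hardware design and not the AllCompare intersector. It assumes an undirected graph stored as sorted, duplicate-free adjacency lists and uses a textbook leapfrog-style intersection to enumerate triangles. The function names `leapfrog_intersect` and `triangles` are hypothetical and chosen only for this illustration.

```python
from bisect import bisect_left


def leapfrog_intersect(lists):
    """Intersect k sorted, duplicate-free lists by leapfrogging.

    Illustrative CPU-style sketch only; it does not model FPGA hardware.
    """
    if not lists or any(not lst for lst in lists):
        return []
    k = len(lists)
    pos = [0] * k
    result = []
    # No value smaller than the largest first element can be in all lists.
    candidate = max(lst[0] for lst in lists)
    agree = 0  # how many consecutive lists contain the current candidate
    i = 0
    while True:
        lst = lists[i]
        # Seek forward to the first element >= candidate.
        pos[i] = bisect_left(lst, candidate, pos[i])
        if pos[i] == len(lst):
            return result  # one list is exhausted, no further matches
        if lst[pos[i]] == candidate:
            agree += 1
            if agree >= k:
                result.append(candidate)
                pos[i] += 1
                if pos[i] == len(lst):
                    return result
                candidate = lst[pos[i]]
                agree = 1
        else:
            # Overshot: the larger value becomes the new candidate.
            candidate = lst[pos[i]]
            agree = 1
        i = (i + 1) % k


def triangles(adj):
    """Enumerate triangles (u, v, w) with u < v < w via set intersections."""
    found = []
    for u in sorted(adj):
        for v in adj[u]:
            if v <= u:
                continue
            # Every common neighbor w of u and v closes a triangle.
            for w in leapfrog_intersect([adj[u], adj[v]]):
                if w > v:
                    found.append((u, v, w))
    return found


# Hypothetical example graph: edges 0-1, 0-2, 1-2, 1-3, 2-3.
example_adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}
```

On the example graph, `triangles(example_adj)` yields the two triangles `(0, 1, 2)` and `(1, 2, 3)`. Each candidate edge triggers one intersection, which is why intersection throughput largely determines overall query runtime in this style of processing.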

Paper Structure (41 sections, 5 equations, 19 figures, 3 tables)

Figures (19)

  • Figure 1: RapidMatch (CPU) runtime (left), and RapidMatch and GraphMatch (FPGA) intersection operators (right).
  • Figure 2: FPGA architecture (taken from conf/damon/DannW0FF22).
  • Figure 3: Subgraph query processing example and all its isomorphisms.
  • Figure 4: LeapFrog and AllCompare intersection approaches.
  • Figure 5: AllCompare set intersector architecture.
  • ...and 14 more figures