Julia GraphBLAS with Nonblocking Execution
Pascal Costanza, Timothy G. Mattson, Raye Kimmerer, Benjamin Brock
TL;DR
The paper presents AppGrB, a Julia-based GraphBLAS implementation built for aggressive nonblocking execution. AppGrB represents GraphBLAS computations as non-materialized method trees and uses multi-stage programming to generate and JIT-compile specialized kernels when a wait triggers materialization. The paper demonstrates PageRank on a large graph dataset: nonblocking execution outperforms blocking execution, though gaps remain relative to hand-optimized C++ and SuiteSparse baselines. The work contributes a practical framework for DAG fusion, kernel specialization, and sparsity-aware planning within GraphBLAS, and outlines directions for broader method coverage and integration with optimized libraries. Overall, AppGrB shows that compiler-driven nonblocking GraphBLAS in Julia is viable and points to future research in code generation and fusion techniques.
Abstract
From the beginning, the GraphBLAS were designed for "nonblocking execution"; i.e., calls to GraphBLAS methods return as soon as their arguments are validated, and together the calls define a directed acyclic graph (DAG) of GraphBLAS operations. This lets GraphBLAS implementations fuse functions, elide unneeded objects, exploit parallelism, and apply any other DAG-preserving transformations. Existing GraphBLAS implementations use nonblocking execution, but only with limited scope. In this paper, we describe our work to implement GraphBLAS with support for aggressive nonblocking execution. We show how features of the Julia programming language greatly simplify the implementation of nonblocking execution. This is work in progress: it is sufficient to show the potential of nonblocking execution and is limited to the GraphBLAS methods required to support PageRank.
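The deferral idea described above can be illustrated with a minimal sketch. It is written in Python rather than Julia for brevity, works on dense vectors only, and every name in it (`Node`, `Leaf`, `Apply`, `EWiseAdd`, `wait`) is hypothetical; it is neither the GraphBLAS C API nor AppGrB's interface. Each call validates its arguments and returns a DAG node immediately; only `wait` computes values, and it does so in a single fused pass without materializing the intermediate vector.

```python
from dataclasses import dataclass
from typing import Callable, List


class Node:
    """A deferred, vector-valued operation in the DAG (hypothetical type)."""

    def at(self, i: int) -> float:
        raise NotImplementedError

    def size(self) -> int:
        raise NotImplementedError


@dataclass
class Leaf(Node):
    """A materialized input vector."""
    data: List[float]

    def at(self, i: int) -> float:
        return self.data[i]

    def size(self) -> int:
        return len(self.data)


@dataclass
class Apply(Node):
    """Deferred element-wise application of a unary function."""
    fn: Callable[[float], float]
    arg: Node

    def at(self, i: int) -> float:
        return self.fn(self.arg.at(i))

    def size(self) -> int:
        return self.arg.size()


@dataclass
class EWiseAdd(Node):
    """Deferred element-wise addition of two vectors."""
    left: Node
    right: Node

    def __post_init__(self) -> None:
        # Arguments are validated when the call is made, not when it runs.
        if self.left.size() != self.right.size():
            raise ValueError("dimension mismatch")

    def at(self, i: int) -> float:
        return self.left.at(i) + self.right.at(i)

    def size(self) -> int:
        return self.left.size()


def wait(node: Node) -> List[float]:
    """Materialize the DAG rooted at `node` in one fused pass."""
    return [node.at(i) for i in range(node.size())]


x = Leaf([1.0, 2.0, 3.0])
y = Leaf([10.0, 20.0, 30.0])
expr = EWiseAdd(Apply(lambda v: 2.0 * v, x), y)  # builds a DAG, computes nothing
print(wait(expr))  # prints [12.0, 24.0, 36.0]
```

The scaled vector `2.0 * x` is never stored: `wait` evaluates the whole expression element by element. This sketch interprets the tree at run time, whereas the system summarized here generates and JIT-compiles specialized kernels at that point.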
