StarDist: A Code Generator for Distributed Graph Algorithms
Barenya Kumar Nandy, Rupesh Nasre
TL;DR
This work extends the MPI backend of StarPlat, a DSL for graph algorithms, with an analysis-transformation framework that optimizes communication, neighborhood traversal, and reductions on top of a bulk-reduction substrate. It emphasizes reduction-exclusive statements, opportunistic caching, and cache-friendly synchronization to minimize RMA overhead. Empirical results on SSSP and CC show significant speedups over DRONE and d-Galois on distributed workloads. The work also outlines an extensible backend analyzer and future directions, including integration with graph partitioners such as METIS for further scalability.
Abstract
Real-world relational data are often structured as graphs, which provide the logical abstraction needed to simplify analytical derivations. As graphs grow larger, the irregular access patterns exhibited by most graph algorithms hamper performance. Together with NUMA effects and physical memory limits, this makes it difficult to scale sequential and shared-memory frameworks. StarPlat's MPI backend abstracts away the programming complexity involved in designing efficient distributed graph algorithms and provides an instrument for coding graph algorithms that scale over distributed memory. In this work, we provide an analysis-transformation framework that leverages the general semantics of iterations over nodes and their neighbors, within StarPlat, to aggregate communication. The framework scans for patterns that warrant re-ordering neighborhood accesses, aggregating communication, or avoiding communication altogether through opportunistic caching in reduction constructs. We also architect an optimized bulk-reduction substrate using Open MPI's passive Remote Memory Access (RMA) constructs. We applied our optimization logic to StarPlat's distributed backend and outperformed d-Galois by 2.05× and DRONE by 1.44× on Single Source Shortest Paths across several big data graphs.
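
The idea of aggregating reductions and skipping redundant ones through caching can be illustrated with a minimal sketch. It is a single-process Python simulation, not StarPlat's generated code and not an MPI program: the class `BulkMinReducer`, the block partitioning of nodes across ranks, and the dictionaries standing in for RMA windows are all assumptions made for illustration. Min-reductions aimed at remote nodes are buffered per owner rank, dropped when a cached value shows they cannot improve the result, and delivered as one batch per owner.

```python
from collections import defaultdict


class BulkMinReducer:
    """Simulates buffering remote min-reductions and flushing them in bulk.

    Hypothetical helper for illustration only; it does not use MPI.
    """

    def __init__(self, num_ranks, num_nodes):
        # Assumption: nodes are block-partitioned across ranks.
        self.block = (num_nodes + num_ranks - 1) // num_ranks
        # One dictionary per rank stands in for that rank's RMA window.
        self.windows = [dict() for _ in range(num_ranks)]
        # Pending updates per owner rank: node -> smallest value buffered.
        self.pending = defaultdict(dict)
        # Local cache of the best value already proposed for each node.
        self.cache = {}
        self.messages_sent = 0
        self.updates_skipped = 0

    def owner(self, node):
        return node // self.block

    def reduce_min(self, node, value):
        """Buffer a min-reduction unless the cache shows it is redundant."""
        if value >= self.cache.get(node, float("inf")):
            self.updates_skipped += 1  # no communication needed at all
            return
        self.cache[node] = value
        self.pending[self.owner(node)][node] = value

    def flush(self):
        """Deliver one aggregated batch per owner rank with pending updates."""
        for rank, updates in self.pending.items():
            if not updates:
                continue
            self.messages_sent += 1
            window = self.windows[rank]
            for node, value in updates.items():
                if value < window.get(node, float("inf")):
                    window[node] = value
        self.pending.clear()


# Usage: five reduction requests against two simulated ranks.
reducer = BulkMinReducer(num_ranks=2, num_nodes=8)
for node, dist in [(5, 9.0), (5, 7.0), (6, 4.0), (5, 8.0), (1, 3.0)]:
    reducer.reduce_min(node, dist)
reducer.flush()
print(reducer.messages_sent, reducer.updates_skipped)
```

In this run the five requests collapse into two batches, one per owner rank, and the update `(5, 8.0)` is dropped locally because a smaller value for node 5 is already cached. A real backend would replace the dictionary writes in `flush` with one-sided RMA operations inside a passive-target epoch.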
