Load Balancing in Strongly Inhomogeneous Simulations -- a Vlasiator Case Study
Leo Kotipalo, Markus Battarbee, Yann Pfau-Kempf, Vertti Tarvus, Minna Palmroth
TL;DR
The paper addresses load balancing in strongly inhomogeneous, large-scale Vlasiator simulations by comparing graph/hypergraph partitioning (PHG), recursive bisection (RCB/RIB), and Hilbert space-filling curves (HSFC). It shows that HSFC, particularly with the Beta curve, yields superior spatial locality and overall performance, outperforming PHG, RCB, and RIB across two test scales. The study includes detailed methodology on two runs, extensive trials, and analysis of memory and communication costs, establishing HSFC as the preferred strategy for Vlasiator-like workloads. These findings offer practical guidance for balancing complex, adaptive-grid astrophysical simulations on modern HPC systems, where communication overhead and ghost-domain proliferation are critical constraints.
Abstract
Parallelization is a necessity for large-scale simulations due to the amount of data processed. In this article we investigate different load balancing methods using Vlasiator, a global magnetospheric simulation as our case study. The theoretical basis for load balancing is the (hyper)graph partitioning problem, modeling simulation units as vertices and their data dependencies as edges. As it is an NP-hard problem, heuristics are necessary for dynamic runtime balancing. We consider first hypergraph partitioning via an algorithm called parallel hypergraph partitioner (PHG); this is done by partitioning a simplified grid and then attempting to optimize the solution on the finer grid. The second and third are the geometric methods of recursive coordinate bisection (RCB) and recursive inertial bisection (RIB). Finally we consider the method of Hilbert space filling curves (HSFC). The algorithm projects simulation cells along a Hilbert curve and makes cuts along the curve. This works well due to the excellent locality of Hilbert curves, and can be optimized further by choice of curve. We introduce and investigate six three-dimensional Hilbert curves in total. Our findings on runs of two different scales indicate the HSFC method provides optimal load balance, followed by RIB and PHG methods and finally by RCB. Of the Hilbert curves evaluated, the Beta curve outperformed the most commonly used curve by a few percent.
