Load Balancing in Strongly Inhomogeneous Simulations -- a Vlasiator Case Study

Leo Kotipalo, Markus Battarbee, Yann Pfau-Kempf, Vertti Tarvus, Minna Palmroth

TL;DR

The paper addresses load balancing in strongly inhomogeneous, large-scale Vlasiator simulations by comparing graph and hypergraph partitioning (PHG), recursive bisection (RCB and RIB), and Hilbert space-filling curves (HSFC). It finds that HSFC gives the best load balance, ahead of RIB and PHG, with RCB last, in runs at two different scales. Among the six three-dimensional Hilbert curves examined, the Beta curve outperforms the most commonly used curve by a few percent. The results offer practical guidance for balancing adaptive-grid simulations such as Vlasiator on modern HPC systems, where per-cell load varies by orders of magnitude and communication between domains is a significant cost.

Abstract

Parallelization is a necessity for large-scale simulations due to the amount of data processed. In this article we investigate different load balancing methods using Vlasiator, a global magnetospheric simulation, as our case study. The theoretical basis for load balancing is the (hyper)graph partitioning problem, which models simulation units as vertices and their data dependencies as edges. As the problem is NP-hard, heuristics are necessary for dynamic runtime balancing. We first consider hypergraph partitioning via an algorithm called the parallel hypergraph partitioner (PHG), which partitions a simplified grid and then attempts to optimize the solution on the finer grid. The second and third methods are the geometric methods of recursive coordinate bisection (RCB) and recursive inertial bisection (RIB). Finally, we consider the method of Hilbert space-filling curves (HSFC). The algorithm projects simulation cells along a Hilbert curve and makes cuts along the curve. This works well due to the excellent locality of Hilbert curves, and can be optimized further by choice of curve. We introduce and investigate six three-dimensional Hilbert curves in total. Our findings on runs of two different scales indicate that the HSFC method provides the best load balance of the methods tested, followed by the RIB and PHG methods and finally by RCB. Of the Hilbert curves evaluated, the Beta curve outperformed the most commonly used curve by a few percent.
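The HSFC idea described in the abstract (order cells along a Hilbert curve, then cut the curve into pieces of roughly equal weight) can be illustrated with a minimal sketch. This is not the paper's or Vlasiator's implementation: it assumes a two-dimensional n x n grid with n a power of two, uses the classic two-dimensional Hilbert curve rather than any of the six three-dimensional curves studied, and the names `hilbert_index` and `hsfc_partition` as well as the midpoint cutting rule are hypothetical choices made for illustration.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along the classic 2D Hilbert curve.

    Assumes n is a power of two and 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def hsfc_partition(weights, n_parts):
    """Assign each cell of an n x n weight grid to one of n_parts parts.

    Cells are sorted along the Hilbert curve and the curve is cut so that
    each part receives roughly total_weight / n_parts (illustrative rule:
    a cell goes to the part containing the midpoint of its weight interval).
    """
    n = len(weights)
    cells = sorted(
        ((x, y) for x in range(n) for y in range(n)),
        key=lambda c: hilbert_index(n, c[0], c[1]),
    )
    total = sum(weights[x][y] for x, y in cells)
    target = total / n_parts
    owner = {}
    running = 0.0
    for x, y in cells:
        w = weights[x][y]
        part = min(n_parts - 1, int((running + 0.5 * w) / target))
        owner[(x, y)] = part
        running += w
    return owner
```

Because consecutive cells along the curve are spatial neighbors, each contiguous piece of the curve tends to form a compact domain, which is the locality property the abstract refers to. With strongly varying weights, heavy regions end up in spatially smaller domains, since the cuts follow accumulated weight rather than cell count.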

Paper Structure

This paper contains 15 sections, 2 equations, 13 figures, 3 tables.

Figures (13)

  • Figure 1: Load balancing weights (color on a logarithmic scale) in a three-dimensional Vlasiator run, two slices with grid (black cubic mesh) overlaid. As can be seen, memory and computational load per cell can vary by two orders of magnitude in the simulation region; additionally, spatial resolution can be up to eight times finer (three levels of cell-based octree refinement) in areas of interest.
  • Figure 2: Illustration of graph partitioning. The objective is to partition the vertices of a weighted graph into disjoint parts such that the vertex weights are balanced, and the weights of cut edges, edges whose vertices are in different partitions, are minimized. Here we assume uniform weights for the vertices and the edges, respectively.
  • Figure 3: Examples of PHG graph (a) and hypergraph (b) partitioning, one slice of a three-dimensional simulation with 8000 processes. Domains are colored by process number modulo 64 to increase contrast between neighboring domains. Computationally heavy regions can be discerned by the smaller domain sizes, an example being $x < 25R_E$, $|y| < 20R_E$. The seemingly discontinuous regions seen around $x > 25R_E$ in the hypergraph partitioning (b) are due to this being a two-dimensional slice; these are likely part of a domain on another plane.
  • Figure 4: Examples of recursive coordinate bisection (a) and recursive inertial bisection (b), one slice of a three-dimensional simulation with 8000 processes. Domains are colored by process number modulo 64. The difference between rectilinear blocks in RCB and oblique 'shards' in RIB is clear. Note that the lines in RCB are not entirely straight; here the weight balance is improved at the cost of rectilinearity.
  • Figure 5: The third order approximation of the two-dimensional Z-curve (a) and the first three orders of approximation of the two-dimensional Hilbert curve in fuchsia, cyan and yellow (b). Subsequent orders of approximation can be formed by replacing quadrants with a rotation of the first order curve.
  • ...and 8 more figures
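
The graph partitioning objective in the Figure 2 caption (balanced vertex weights, minimal weight of cut edges) can be made concrete with a small sketch that scores a given partition. The function name `partition_quality` and the imbalance measure (maximum part load divided by mean part load) are assumptions made here for illustration, not definitions taken from the paper.

```python
def partition_quality(vertex_weights, edges, part_of, n_parts):
    """Score a partition of a weighted graph.

    vertex_weights: {vertex: weight}
    edges:          {(u, v): weight} for undirected edges
    part_of:        {vertex: part index in range(n_parts)}

    Returns (cut_weight, imbalance), where cut_weight is the total weight
    of edges whose endpoints lie in different parts and imbalance is
    max part load / mean part load (1.0 means perfectly balanced).
    """
    loads = [0.0] * n_parts
    for v, w in vertex_weights.items():
        loads[part_of[v]] += w
    cut_weight = sum(w for (u, v), w in edges.items() if part_of[u] != part_of[v])
    mean_load = sum(loads) / n_parts
    imbalance = max(loads) / mean_load
    return cut_weight, imbalance


# A path graph a-b-c-d with uniform weights, as assumed in Figure 2.
vertex_weights = {"a": 1, "b": 1, "c": 1, "d": 1}
edges = {("a", "b"): 1, ("b", "c"): 1, ("c", "d"): 1}

contiguous = {"a": 0, "b": 0, "c": 1, "d": 1}
interleaved = {"a": 0, "b": 1, "c": 0, "d": 1}
```

Both partitions above are perfectly balanced, but the contiguous one cuts a single edge while the interleaved one cuts all three, which is why a partitioner has to consider the cut as well as the balance.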