HPX with Spack and Singularity Containers: Evaluating Overheads for HPX/Kokkos using an astrophysics application
Patrick Diehl, Steven R. Brandt, Gregor Daiß, Hartmut Kaiser
TL;DR
The paper addresses the overheads that containerization introduces for HPX/Kokkos-based HPC applications, using Octo-Tiger as a realistic astrophysics workload. It proposes a workflow that uses Spack and Singularity to build and run Octo-Tiger inside containers, and evaluates it on a homogeneous system (Fugaku) and a heterogeneous system (DeepBayou). The key finding is that container overheads are platform-dependent: on a single Fugaku node, regular runs are about 50 seconds faster than container runs, while on DeepBayou CPU-only runs are comparable and GPU container runs can even outperform non-container ones, though distributed container runs may crash. The work highlights the reproducibility benefits of containers, along with the challenges of building container images for non-native architectures, and outlines future work on integrating Fujitsu MPI into containers and scaling to larger systems such as Perlmutter.
Abstract
Cloud computing for high performance computing resources is an emerging topic. This service is of interest to researchers who care about reproducible computing, for software packages with complex installations, and for companies or researchers who need the compute resources only occasionally or do not want to run and maintain a supercomputer on their own. The connection between HPC and the cloud is exemplified by the fact that Microsoft Azure's Eagle cloud service machine is number three on the November 2023 Top 500 list. For cloud services, the HPC application and its dependencies are installed in containers, e.g. Docker, Singularity, or another container runtime, and these containers are executed on the physical hardware. Although containerization leverages the existing Linux kernel and should not impose overheads on the computation, there is the possibility that machine-specific optimizations might be lost, particularly machine-specific installs of commonly used packages. In this paper, we use an astrophysics application built on HPX-Kokkos and measure overheads on homogeneous resources, e.g. Supercomputer Fugaku, using CPUs only, and on heterogeneous resources, e.g. LSU's hybrid CPU and GPU system. We report on challenges in compiling, running, and using the containers, as well as on performance differences.
