The Shamrock code: I- Smoothed Particle Hydrodynamics on GPUs
Timothée David--Cléris, Guillaume Laibe, Yona Lapeyre
TL;DR
Shamrock is a performance-portable, SYCL-based framework for Smoothed Particle Hydrodynamics (SPH) on Exascale architectures. It is built around a fully parallel binary tree used for on-the-fly neighbor searches and ghost-zone management. The SPH solver mirrors that of Phantom but replaces its KD-tree with a radix-tree approach, which makes tree construction time negligible and allows scalable multi-GPU execution. Standard hydrodynamics tests and circumbinary disc simulations show accuracy on par with Phantom at high throughput: a single GPU processes tens of millions of particles per second, and multi-GPU weak scaling approaches $92\%$ efficiency on more than a thousand GPUs. These results establish Shamrock as a promising platform for high-resolution astrophysical simulations and as a flexible foundation for adding gravity and multi-physics methods, with room for further performance optimization and for use across both grid- and particle-based solvers.
Abstract
We present Shamrock, a performance-portable framework developed in C++17 with the SYCL programming standard and tailored for numerical astrophysics on Exascale architectures. The core of Shamrock is an accelerated parallel tree with negligible construction time, whose efficiency is based on binary algebra. The Smoothed Particle Hydrodynamics algorithm of the Phantom code is implemented in Shamrock. Because the tree is constructed on the fly, extensive data communication is not required. In uniform-density tests with global time stepping and tens of billions of particles, Shamrock completes a single time step in a few seconds on more than a thousand GPUs of a supercomputer. This corresponds to processing billions of particles per second, with tens of millions of particles per GPU. The parallel efficiency across the entire cluster exceeds $\sim 90\%$.
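The abstract does not spell out what "binary algebra" involves. As an illustration only, the sketch below assumes the approach that is common for GPU radix trees: each particle's integer grid coordinates are interleaved into a Morton (Z-order) key, and the tree structure is derived from the length of the bit prefix shared by neighboring sorted keys. The function names (`expand_bits_21`, `morton3d`, `common_prefix_length`) and the 21-bits-per-axis layout are hypothetical choices made for this example, not Shamrock's actual API.

```cpp
#include <cstdint>

// Assumption: 21 bits per axis, packed into one 64-bit Morton key.
// Spread the low 21 bits of v so that two zero bits separate
// each pair of consecutive original bits.
inline std::uint64_t expand_bits_21(std::uint64_t v) {
    v &= 0x1fffffULL;
    v = (v | (v << 32)) & 0x1f00000000ffffULL;
    v = (v | (v << 16)) & 0x1f0000ff0000ffULL;
    v = (v | (v << 8))  & 0x100f00f00f00f00fULL;
    v = (v | (v << 4))  & 0x10c30c30c30c30c3ULL;
    v = (v | (v << 2))  & 0x1249249249249249ULL;
    return v;
}

// Interleave three integer grid coordinates into a single key.
// Bit order within each triplet is (x, y, z), x being the most significant.
inline std::uint64_t morton3d(std::uint32_t x, std::uint32_t y, std::uint32_t z) {
    return (expand_bits_21(x) << 2) | (expand_bits_21(y) << 1) | expand_bits_21(z);
}

// Number of leading bits shared by two 64-bit keys (64 if they are equal).
// In a radix tree built over sorted keys, this quantity decides where
// internal nodes split. A portable loop is used because C++17 has no
// std::countl_zero.
inline int common_prefix_length(std::uint64_t a, std::uint64_t b) {
    const std::uint64_t diff = a ^ b;
    int n = 0;
    for (std::uint64_t mask = 1ULL << 63; mask != 0 && (diff & mask) == 0; mask >>= 1) {
        ++n;
    }
    return n;
}
```

Under these assumptions, every step reduces to shifts, masks and XORs on independent integers, which is why such a construction maps well onto GPUs: keys can be computed and compared for all particles in parallel, with no pointer chasing.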
