A highly accurate drag solver for multi-fluid dust and gas hydrodynamics on GPUs
L. Sewanou, G. Laibe, B. Commerçon
TL;DR
This work introduces a GPU-optimized, high-precision drag solver for multi-fluid dust–gas hydrodynamics, enabling accurate friction calculations via a matrix-exponential treatment of the drag subsystem. The key idea is to solve $\partial \tilde{\bm{U}}/\partial t = \bm{\Omega}\tilde{\bm{U}}$ with $\tilde{\bm{U}}(t)=e^{t\bm{\Omega}}\tilde{\bm{U}}(t_0)$ and to compute the matrix exponential using a scaling-and-squaring approach built on Taylor/Pade expansions, achieving machine-precision accuracy while keeping the drag step efficient. The solver is implemented in two GPU codes, Dyablo (Kokkos) and Shamrock (Sycl), and validated with Dustybox, Dustywave, and Dustyshock tests, demonstrating stable performance with up to 16 dust species. Performance analyses on A100 GPUs show the drag step costs remain manageable relative to the hydro step, supporting future inclusion of growth and fragmentation and enabling broader applications to radiative transfer or chemistry. The approach and codes are publicly available, highlighting its practical impact for exascale astrophysical simulations.
Abstract
Exascale supercomputing unleashes the potential for simulations of astrophysical systems with unprecedented resolution. Taking full advantage of this computing power requires the development of new algorithms and numerical methods that are GPU friendly and scalable. In the context of multi-fluid dust-gas dynamics, we propose a highly accurate algorithm that is specifically designed for GPUs. We developed a multi-fluid gas-dust algorithm capable of computing friction terms on GPU architectures to machine precision, with the constraint for the drag-time step to remain a fraction of the global hydrodynamic time step for computational efficiency in practice. We present a scaling-and-squaring algorithm tailored to modern architectures for computing the exponential of the drag matrix, enabling high accuracy in friction calculations across relevant astrophysical regimes. The algorithm was validated through the Dustybox Dustywave and Dustyshock tests. The algorithm was implemented and tested in two multi-GPU codes with different architectures and GPU programming models: Dyablo, an adaptive mesh refinement code based on the Kokkos library, and Shamrock, a multi-method code based on Sycl. On current architectures, the friction computation remains acceptable for both codes (below the typical hydro time step) up to 16 species, enabling a further implementation of growth and fragmentation. This algorithm might be applied to other physical processes, such as radiative transfer or chemistry.
