A GPU-ready pseudo-spectral method for direct numerical simulations of multiphase turbulence
Alessio Roccon
TL;DR
This work addresses the computational challenge of directly simulating interface-resolved multiphase turbulence by porting a pseudo-spectral DNS solver to GPUs. The authors combine a Navier–Stokes solver with a phase-field (Cahn–Hilliard) model using a two-tier parallelization: MPI-based 2D pencil domain decomposition and OpenACC offloading with cuFFT for accelerators, facilitated by CUDA unified memory and CUDA-aware MPI. Key contributions include a portable, GPU-ready implementation with batched transforms, kernel-based nonlinear term evaluation, and efficient wall-normal solves, validated by strong-scaling tests on HPC systems and a large-scale demonstration (2048×1024×1025 grid with 256 droplets). The approach enables high-fidelity multiphase turbulence simulations on heterogeneous hardware and sets the stage for extending to non-NVIDIA platforms (e.g., AMD GPUs via ROCm) in future work.
Abstract
In this work, we detail the GPU porting of an in-house pseudo-spectral solver tailored to large-scale, interface-resolved simulations of drop- and bubble-laden turbulent flows. The code relies on direct numerical simulation of the Navier-Stokes equations, used to describe the flow field, coupled with a phase-field method, used to describe the shape, deformation, and topological changes of the interface of the drops or bubbles. The governing equations (Navier-Stokes and Cahn-Hilliard) are solved using a pseudo-spectral method in which the variables are transformed into wavenumber space. The code relies on multilevel parallelism. The first level is based on the message-passing interface (MPI) and is used on multi-core architectures in CPU-based infrastructures. The second level is based on OpenACC directives and the cuFFT library and accelerates code execution when GPU-based infrastructures are targeted. The resulting multiphase flow solver can be executed efficiently on heterogeneous computing infrastructures and exhibits a remarkable speed-up when GPUs are employed. Thanks to the modular structure of the code and the use of a directive-based strategy to offload code execution to GPUs, only minor code modifications are required when targeting different computing architectures. This simplifies code maintenance, version control, and the implementation of additional modules or governing equations.
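The pseudo-spectral idea mentioned in the abstract can be illustrated with a minimal one-dimensional sketch. It is not taken from the solver itself: the function names (`dft`, `spectral_derivative`, `nonlinear_term`), the periodic domain [0, 2π), and the naive O(N²) transform are all assumptions made for illustration, and dealiasing is omitted. The sketch shows the general pattern of computing derivatives in wavenumber space and products in physical space.

```python
import cmath
import math


def dft(values, sign):
    """Naive O(N^2) discrete Fourier transform (unnormalized).

    sign=-1 gives the forward transform, sign=+1 the backward one.
    A real solver would call an FFT library instead.
    """
    n = len(values)
    return [
        sum(values[j] * cmath.exp(sign * 2j * math.pi * k * j / n) for j in range(n))
        for k in range(n)
    ]


def spectral_derivative(samples):
    """Differentiate a periodic signal sampled on [0, 2*pi) via wavenumber space."""
    n = len(samples)
    coeffs = dft(samples, -1)
    # Integer wavenumbers in FFT ordering: 0, 1, ..., n/2-1, -n/2, ..., -1.
    wavenumbers = [k if k < n // 2 else k - n for k in range(n)]
    # For even n, zero the Nyquist mode: its derivative is not representable.
    if n % 2 == 0:
        wavenumbers[n // 2] = 0
    derived = [1j * k * c for k, c in zip(wavenumbers, coeffs)]
    # Backward transform, normalized by n; the result is real for real input.
    return [(v / n).real for v in dft(derived, +1)]


def nonlinear_term(u):
    """Evaluate u * du/dx: derivative in wavenumber space, product in physical space."""
    dudx = spectral_derivative(u)
    return [a * b for a, b in zip(u, dudx)]


# Hypothetical test signal: u(x) = sin(x), for which u * du/dx = sin(x) * cos(x).
n = 16
x = [2 * math.pi * i / n for i in range(n)]
u = [math.sin(xi) for xi in x]
result = nonlinear_term(u)
exact = [math.sin(xi) * math.cos(xi) for xi in x]
max_error = max(abs(r - e) for r, e in zip(result, exact))
```

For this smooth, well-resolved signal, `max_error` should sit at the level of floating-point round-off. In a GPU port of the kind described above, the transforms would presumably be delegated to a batched FFT library, while the pointwise product would be the natural candidate for a directive-offloaded loop.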
