LEDDS: Portable LBM-DEM simulations on GPUs
Raphael Maggio-Aprile, Maxime Rambosson, Christophe Coreixas, Jonas Latt
TL;DR
LEDDS tackles the challenge of portable, high-performance CFD-DEM simulations on GPUs by expressing all computations as a small set of algorithmic primitives, enabling device-agnostic yet efficient execution. The framework couples LBM and DEM through a partially saturated cell (PSC) approach and supports both spherical and ellipsoidal particles on a uniform grid, with the DEM and fluid steps both expressed through map, sort, and reduce primitives. Validation spans DEM-only and LBM-DEM benchmarks, showing energy and momentum conservation, correct Jeffery orbits for ellipsoids, and accurate rheology predictions, while performance analyses show strong GPU speedups and performance on par with optimized CUDA-based solvers. Collectively, LEDDS provides a blueprint for portable, readable, and scalable multiphysics software on heterogeneous architectures, with clear paths toward distributed multi-GPU implementations and more complex particle geometries.
Abstract
Algorithmic formulations of GPU programs provide a high-level alternative to device-specific code by expressing computations as compositions of well-defined parallel primitives (e.g., map, sort, reduce) rather than as handcrafted GPU kernels. In this work, we demonstrate that this paradigm can be extended to complex and challenging problems in computational physics: the simulation of granular flows and fluid-particle interactions. LEDDS, our open-source framework, performs fully coupled Lattice Boltzmann–Discrete Element Method (LBM-DEM) simulations using only algorithmic primitives, and runs efficiently on single-GPU platforms. The entire workflow, including neighbor search, collision detection, and fluid-particle coupling, is expressed as a sequence of portable primitives. While the current implementation illustrates these principles primarily through algorithms from the C++ Standard Library, with selective use of Thrust primitives for performance, the underlying concept is compatible with any HPC environment offering a rich set of parallel algorithms and is therefore applicable across a wide range of modern GPU systems and future accelerators. LEDDS is validated through benchmarks spanning both DEM and LBM-DEM configurations, including sphere and ellipsoid collisions, wall friction tests, single-particle settling, Jeffery's orbits, and particle-laden shear flows. Despite its high level of abstraction, LEDDS achieves performance comparable to that of hand-tuned CUDA solvers, while maintaining portability and code clarity. These results show that high-performance LBM-DEM coupling can be achieved without sacrificing generality or readability, establishing LEDDS as a blueprint for portable multiphysics frameworks based on algorithmic primitives.
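To make the idea of composing map, sort, and reduce primitives concrete, the following is a minimal sketch written with C++ Standard Library algorithms. It is not taken from the LEDDS source: the `Particle` struct, the function names, and the cubic grid of `n` cells per side with spacing `h` are all assumptions chosen for illustration. The sketch maps each particle to a uniform-grid cell, sorts particle indices by cell (a typical first step of a grid-based neighbor search), and reduces over particles to obtain the total kinetic energy. The calls are shown in their sequential form; in a parallel build one would typically pass an execution policy such as `std::execution::par_unseq` as the first argument of each algorithm.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical particle record, for illustration only (not the LEDDS data layout).
struct Particle {
    double x, y, z;     // position
    double vx, vy, vz;  // velocity
    double mass;
};

// Map: assign to each particle the linear index of the grid cell containing it.
// Assumes a cubic grid of n cells per side with spacing h, and positions in [0, n*h).
std::vector<int> cell_ids(const std::vector<Particle>& p, double h, int n) {
    assert(h > 0.0 && n > 0);
    std::vector<int> ids(p.size());
    std::transform(p.begin(), p.end(), ids.begin(), [=](const Particle& q) {
        const int i = static_cast<int>(q.x / h);
        const int j = static_cast<int>(q.y / h);
        const int k = static_cast<int>(q.z / h);
        return i + n * (j + n * k);
    });
    return ids;
}

// Sort: order particle indices by cell id, so that particles sharing a cell
// become contiguous and can be scanned together during neighbor search.
std::vector<std::size_t> sort_by_cell(const std::vector<int>& ids) {
    std::vector<std::size_t> order(ids.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&](std::size_t a, std::size_t b) { return ids[a] < ids[b]; });
    return order;
}

// Reduce: sum the kinetic energy 0.5 * m * |v|^2 over all particles.
double kinetic_energy(const std::vector<Particle>& p) {
    return std::transform_reduce(
        p.begin(), p.end(), 0.0, std::plus<>{}, [](const Particle& q) {
            return 0.5 * q.mass * (q.vx * q.vx + q.vy * q.vy + q.vz * q.vz);
        });
}
```

Because each step is a single library algorithm with no hand-written kernel, the same source can in principle be compiled for a CPU or offloaded to a GPU by a toolchain that supports parallel standard algorithms, which is the kind of portability the abstract describes.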
