Green computing toward SKA era with RICK
Giovanni Lacopo, Claudio Gheller, Emanuele De Rubeis, Pascal Jahan Elahi, Maciej Cytowski, Luca Tornatore, Giuliano Taffoni, Ugo Varetto
TL;DR
The paper tackles SKA-scale data processing by introducing RICK, a w-stacking imaging code designed for energy-aware HPC on heterogeneous architectures. It evaluates three parallelization strategies—MPI, hybrid MPI/OpenMP, and GPU acceleration—and introduces green productivity, $GP = \frac{T_0/T_N}{\alpha E_N/E_0}$ with $\alpha=1$, to jointly assess time-to-solution and energy-to-solution. Through LOFAR data tests on Setonix hardware, GPU-accelerated configurations deliver substantial improvements in both speed and energy efficiency at scale, while single-node scenarios may favor hybrid approaches due to I/O and data-transfer considerations. The work highlights the importance of energy-aware, GPU-accelerated imaging for SKA-era pipelines and points to distributed FFTs for AMD GPUs as a path to further performance and energy gains. Overall, RICK demonstrates how careful hardware-software co-design can achieve sustainable, high-throughput radio-imaging workloads on next-generation HPC systems.
Abstract
The Square Kilometer Array (SKA) is expected to generate hundreds of petabytes of data per year, two orders of magnitude more than current radio interferometers. Data processing at this scale requires advanced High Performance Computing (HPC) resources. However, modern HPC platforms consume up to tens of megawatts (MW), so the energy-to-solution of algorithms will become of utmost importance in the near future. In this work we study the trade-off between energy-to-solution and time-to-solution of our RICK code (Radio Imaging Code Kernels), a novel implementation of the w-stacking algorithm designed to run on state-of-the-art HPC systems. The code can run on heterogeneous systems, exploiting their accelerators. We performed both single-node and multi-node tests with both CPU and GPU solutions, in order to determine which is the greenest and which is the fastest. We then define the green productivity, a quantity that relates the energy-to-solution and time-to-solution of a given code configuration to those of a reference configuration. Configurations with the highest green productivity are the most efficient. The tests were run on the Setonix machine at the Pawsey Supercomputing Research Centre (PSC) in Perth (WA), ranked 28th in the June 2024 Top500 list.
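As an illustration of how the green productivity can be evaluated, the following minimal Python sketch implements $GP = \frac{T_0/T_N}{\alpha E_N/E_0}$ as given in the TL;DR. We assume here that $T_0$ and $E_0$ are the time-to-solution and energy-to-solution of the reference configuration and that $T_N$ and $E_N$ are those of the configuration under test. The function name, the configuration labels, and all numerical values are hypothetical and are not measurements from this work.

```python
def green_productivity(t_ref, e_ref, t_cfg, e_cfg, alpha=1.0):
    """Return GP = (T_0 / T_N) / (alpha * E_N / E_0).

    t_ref, e_ref: time and energy of the reference configuration (T_0, E_0).
    t_cfg, e_cfg: time and energy of the configuration under test (T_N, E_N).
    alpha: weight of the energy term (the TL;DR uses alpha = 1).
    """
    if min(t_ref, e_ref, t_cfg, e_cfg, alpha) <= 0:
        raise ValueError("times, energies and alpha must be positive")
    speedup = t_ref / t_cfg          # T_0 / T_N, larger is faster
    energy_ratio = e_cfg / e_ref     # E_N / E_0, smaller is greener
    return speedup / (alpha * energy_ratio)


# Hypothetical reference run and two hypothetical configurations
# (time in seconds, energy in joules); values chosen only for illustration.
reference = {"time_s": 100.0, "energy_j": 50_000.0}
configs = {
    "hypothetical_A": {"time_s": 50.0, "energy_j": 40_000.0},
    "hypothetical_B": {"time_s": 25.0, "energy_j": 100_000.0},
}

gp = {
    name: green_productivity(
        reference["time_s"], reference["energy_j"],
        cfg["time_s"], cfg["energy_j"],
    )
    for name, cfg in configs.items()
}

# Rank configurations from most to least efficient by green productivity.
ranking = sorted(gp, key=gp.get, reverse=True)
for name in ranking:
    print(f"{name}: GP = {gp[name]:.2f}")
```

By construction the reference configuration has $GP = 1$. In this made-up example, configuration B is the faster of the two but ranks below A, because its extra speed comes at a more than proportional cost in energy.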
