On the Challenges of Energy-Efficiency Analysis in HPC Systems: Evaluating Synthetic Benchmarks and Gromacs
Rafael Ravedutti Lucio Machado, Jan Eitzinger, Georg Hager, Gerhard Wellein
TL;DR
The paper analyzes the challenges of assessing energy efficiency in HPC by comparing synthetic benchmarks with the Gromacs molecular dynamics package on heterogeneous Fritz and Alex clusters. It leverages MD-Bench, STREAM, GEMM, and Gromacs across CPU and GPU environments to study how frequency, power caps, and affinity influence energy-to-solution and energy-delay products, while highlighting measurement overheads and profiling pitfalls. Key contributions include a detailed account of instrumentation overhead, the limitations of hardware performance counters, and best-practice recommendations for rigorous benchmarking in energy-focused HPC studies. The findings emphasize the need for end-to-end measurements, careful affinity control, and cross-validation across tools to produce reliable, generalizable insights, and call for standardized, open interfaces for power measurement across platforms.
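As a reminder of how the two metrics named above are usually defined, the following minimal sketch computes energy-to-solution (average power times runtime) and the energy-delay product (energy times runtime). The function names and all numbers are hypothetical illustrations and are not measurements from the paper.

```python
def energy_to_solution(avg_power_w: float, runtime_s: float) -> float:
    """Energy-to-solution in joules, assuming a constant average power draw."""
    return avg_power_w * runtime_s


def energy_delay_product(energy_j: float, runtime_s: float) -> float:
    """Energy-delay product in joule-seconds; penalizes long runtimes."""
    return energy_j * runtime_s


# Hypothetical runs: clock frequency in GHz -> (average power in W, runtime in s).
# These values are invented for illustration only.
runs = {
    2.0: (250.0, 100.0),
    2.4: (300.0, 90.0),
}

# Compute both metrics for every run.
metrics = {}
for freq_ghz, (power_w, runtime_s) in runs.items():
    energy_j = energy_to_solution(power_w, runtime_s)
    metrics[freq_ghz] = (energy_j, energy_delay_product(energy_j, runtime_s))

# Pick the frequency that minimizes each metric.
best_energy = min(metrics, key=lambda f: metrics[f][0])
best_edp = min(metrics, key=lambda f: metrics[f][1])

print("lowest energy-to-solution at", best_energy, "GHz")
print("lowest energy-delay product at", best_edp, "GHz")
```

With these invented numbers the two metrics favor different frequencies, which illustrates why the choice of metric matters when comparing operating points.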
Abstract
This paper discusses the challenges encountered when analyzing the energy efficiency of synthetic benchmarks and the Gromacs package on the Fritz and Alex HPC clusters. Experiments were conducted using MPI parallelism on full sockets of Intel Ice Lake and Sapphire Rapids CPUs, as well as on Nvidia A40 and A100 GPUs. We present the metrics and measurements obtained with the Likwid and Nvidia profiling tools together with the results, discuss the challenges and pitfalls encountered during experimentation and analysis, and suggest best practices for future energy-efficiency studies.
