Energy Concerns with HPC Systems and Applications
Roblex Nana, Claude Tadonki, Petr Dokladal, Youssef Mesri
TL;DR
This survey analyzes energy concerns across HPC, embedded systems, and AI workloads, framing the problem as a trade-off between time-to-solution and energy-to-solution while highlighting carbon-footprint implications. It surveys energy metrics, cooling strategies, accelerator and processor platforms, and a wide range of energy-management tools, culminating in a taxonomy of static, dynamic, and hybrid optimization approaches. The AI-specific sections review carbon emissions, energy profiling tools (estimate- and measurement-based), and optimization techniques such as quantization, pruning, NAS, and hardware selection for training. The study underscores the practical impact of energy-aware design for cost, reliability, and environmental sustainability, and calls for integrated, cross-layer strategies to manage energy across HPC, embedded, and AI ecosystems, with attention to $CO_{2}$-related metrics and sustainable data-center practices.
Abstract
For various reasons, including those related to climate change, {\em energy} has become a critical concern in all relevant activities and technical designs. In the specific case of computing, the problem is exacerbated by the emergence and pervasiveness of so-called {\em intelligent devices}. On the application side, we point out the special topic of {\em Artificial Intelligence}, which clearly needs efficient computing support in order to succeed in its purpose of being a {\em ubiquitous assistant}. There are mainly two contexts where {\em energy} is one of the top-priority concerns: {\em embedded computing} and {\em supercomputing}. For the former, power consumption is critical because the amount of energy available to the devices is limited. For the latter, the heat dissipated is a serious source of failure, and the financial cost related to energy is likely to be a significant part of the maintenance budget. On a single computer, the problem is commonly considered in terms of electrical power consumption. In this paper, written in the form of a survey, we depict the landscape of energy concerns in computing, from both the hardware and the software standpoints.
