Exploring the Frontiers of Energy Efficiency using Power Management at System Scale

Ahmad Maroof Karimi; Matthias Maiterth; Woong Shin; Naw Safrin Sattar; Hao Lu; Feiyi Wang

TL;DR

The paper tackles the challenge of software-driven energy efficiency in exascale HPC by combining GPU benchmarks with three months of Frontier telemetry to project system-scale energy savings under dynamic voltage/frequency scaling and power capping. Its hybrid methodology fuses power telemetry, benchmark-based power characterization, and modal decomposition to map real workloads to distinct power modes and derive upper-bound energy savings for large-scale systems. Key contributions include a data-driven framework for estimating best-case savings, a roofline- and memory-benchmarked characterization of MI250X GPUs, and system-wide projections showing potential savings on Frontier of roughly 8% without slowing down or up to nearly 9% with modest runtime penalties, particularly for large jobs. The approach provides a principled basis for optimizing power-performance trade-offs in the exascale era and beyond, with practical implications for HPC centers seeking to maximize efficiency under tight power budgets.

Abstract

In the face of surging power demands for exascale HPC systems, this work tackles the critical challenge of understanding the impact of software-driven power management techniques such as Dynamic Voltage and Frequency Scaling (DVFS) and power capping, which have been actively developed over the past few decades. Drawing on insights from GPU benchmarking to understand application power profiles, we present a telemetry-data-driven approach for deriving energy savings projections and apply it at scale to the Frontier supercomputer. Our findings, based on three months of telemetry data, indicate that for certain resource-constrained jobs, significant energy savings (up to 8.5%) can be achieved without compromising performance. This translates to a substantial cost reduction, equivalent to 1438 MWh of energy saved. The key contribution of this work lies in the methodology for establishing an upper limit for these best-case scenarios and in its successful application. This work sheds light on potential energy savings and empowers HPC professionals to optimize the power-performance trade-off within constrained power budgets, not only for the exascale era but also beyond.
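The trade-off described in the abstract can be illustrated with a minimal sketch, assuming the usual relation that energy to solution is average power multiplied by runtime. The function names and the two sets of ratios below are hypothetical and chosen only for illustration; they are not measurements from the paper.

```python
def normalized_energy(power_ratio: float, runtime_ratio: float) -> float:
    """Energy to solution relative to an uncapped baseline, using E = P * t.

    Both arguments are ratios against the uncapped run (1.0 = unchanged).
    """
    return power_ratio * runtime_ratio


def savings_percent(power_ratio: float, runtime_ratio: float) -> float:
    """Percent energy saved versus the baseline; negative means more energy used."""
    return (1.0 - normalized_energy(power_ratio, runtime_ratio)) * 100.0


# Hypothetical job that is not limited by GPU clock speed:
# the cap lowers average power by 10% and runtime is unchanged.
unaffected_runtime = savings_percent(power_ratio=0.90, runtime_ratio=1.00)

# Hypothetical compute-bound job:
# the cap lowers average power by 20% but the job runs 30% longer.
slowed_down = savings_percent(power_ratio=0.80, runtime_ratio=1.30)

print(f"runtime unchanged: {unaffected_runtime:.1f}% energy saved")
print(f"runtime +30%:      {slowed_down:.1f}% energy saved")
```

In this sketch the first job saves energy at no performance cost, while the second consumes more energy than the uncapped run because the slowdown outweighs the power reduction. This is why a projection of savings without compromising performance has to identify which jobs fall into the first category.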

Paper Structure (25 sections, 10 figures, 7 tables, 1 algorithm)

Introduction
Background and Related Work
Power Implications of Heavy/Fat CPU+GPU Heterogeneous Node Architecture Towards Exascale
Energy saving opportunities using dynamic power management on contemporary hardware
Dynamic power management in the HPC data center
A Hybrid Methodology
Utilizing Power Telemetry Data
Telemetry Data Description
Job-scheduler Log Description
Characterizing Power Consumption using Benchmarks
Roofline and Benchmark
Benchmark for GPU Memory Characterization
Verification of Memory Characteristics using a Real HPC Graph Application
Data-driven Analysis for Modal Decomposition
Mapping the observed benchmark behavior to the identified modes for the HPC applications on the full system scale
...and 10 more sections

Figures (10)

Figure 1: Schematic representation of Frontier compute node and MI250X multi-chip GPU
Figure 2: (a) Comparison of telemetry data and ROCm SMI data for a sample Frontier application run. (b) Histogram of GPU and CPU energy utilization on the Frontier system.
Figure 3: GPU benches L2-cache memory access pattern
Figure 4: Roofline plots. Left: fixed frequency; right: power cap. Top to bottom: a) TFLOP/s, b) GByte/s, c) power consumption, d) normalized time to solution, with different power limits for a single GPU, while running all tiles of an MI250X. The x-axis is the arithmetic intensity in operations per byte.
Figure 5: VAI plots. Left: fixed frequency (700 MHz to 1700 MHz); right: power cap (100 W to 560 W). For each we show runtime (top), power used (middle), and energy to solution (bottom). The values are normalized to 1.0, representing the uncapped case at 1700 MHz and 560 W, respectively. Each line is a specific arithmetic intensity from 1/16 to 1024 in powers of two. The arithmetic intensity of 0 is a stream copy call.
...and 5 more figures
