Dissecting the software-based measurement of CPU energy consumption: a comparative analysis

Guillaume Raffin, Denis Trystram

TL;DR

The paper addresses the challenge of accurately measuring CPU energy consumption using RAPL by dissecting the four practical mechanisms (MSR, powercap, perf-events, and perf-events via eBPF) and providing a reference Rust tool. It reconstructs a deeper understanding of RAPL domains, counter overflows, and timing control, then empirically compares mechanisms on Intel and AMD hardware with NAS benchmarks to assess overhead and idle-power impact. The authors demonstrate that no single mechanism dominates in performance, but perf-events generally offers favorable latency, robustness, and ease-of-use, while higher-frequency monitoring should be dynamically tuned to workload. The study delivers concrete recommendations for building correct, resilient, and lightweight energy-measurement tools, along with a publicly released reference implementation to guide the community.

Abstract

Every day, we experience the effects of global warming: extreme weather events, major forest fires, storms, etc. The scientific community acknowledges that this crisis is a consequence of human activities, to which Information and Communications Technologies (ICT) are an increasingly important contributor. Computer scientists need tools for measuring the footprint of the code they produce and for optimizing it. Running Average Power Limit (RAPL) is a low-level interface designed by Intel that provides a measure of the energy consumption of a CPU (and more) without the need for additional hardware. Since 2017, it has been available on most computing devices, including non-Intel devices such as AMD processors. More and more people are using RAPL for energy measurement, mostly as a black box without deep knowledge of its behavior. Unfortunately, this causes mistakes when implementing measurement tools. In this paper, we propose to come back to the basic mechanisms that give access to RAPL measurements and present a critical analysis of their operation. In addition to long-established mechanisms, we explore the suitability of the recent eBPF technology (formerly an abbreviation for extended Berkeley Packet Filter) for working with RAPL. For each mechanism, we release an implementation in Rust that avoids the pitfalls we detected in existing tools, improving correctness, timing accuracy and performance. These new implementations have desirable properties for monitoring and profiling parallel applications. We also provide an experimental study with multiple benchmarks and processor models (Intel and AMD) in order to evaluate the efficiency of the various mechanisms and their impact on parallel software. These experiments show that no mechanism provides a significant performance advantage over the others. However, they differ significantly in terms of ease-of-use and resiliency. We believe that this work will help the community to develop correct, resilient and lightweight measurement tools.
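
One of the pitfalls named in the TL;DR is counter overflow: RAPL energy counters have a finite range and wrap around, so naively subtracting two successive readings can give a wrong result. The sketch below shows one way to correct for a single wrap between two reads. It is not the authors' implementation. The names `energy_delta_uj`, `EnergyCounter` and `read_u64` are hypothetical, and the code assumes that values come from the Linux powercap files `energy_uj` and `max_energy_range_uj` (for example under `/sys/class/powercap/intel-rapl:0/`, a path that varies across machines) and that the counter restarts from zero after reaching its maximum range.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Energy consumed between two successive counter reads, in microjoules.
///
/// Assumption: the counter wraps to zero after reaching `max_range_uj`, and
/// at most one wrap happened between the two reads (more cannot be detected).
/// Precondition: `prev_uj <= max_range_uj`.
pub fn energy_delta_uj(prev_uj: u64, cur_uj: u64, max_range_uj: u64) -> u64 {
    if cur_uj >= prev_uj {
        cur_uj - prev_uj
    } else {
        // The counter went "backwards": treat it as one wrap-around.
        (max_range_uj - prev_uj) + cur_uj
    }
}

/// Accumulates overflow-corrected energy across repeated reads.
pub struct EnergyCounter {
    prev_uj: u64,
    max_range_uj: u64,
    total_uj: u64,
}

impl EnergyCounter {
    pub fn new(first_read_uj: u64, max_range_uj: u64) -> Self {
        EnergyCounter { prev_uj: first_read_uj, max_range_uj, total_uj: 0 }
    }

    /// Registers a new raw reading and returns the energy used since the last one.
    pub fn update(&mut self, cur_uj: u64) -> u64 {
        let delta = energy_delta_uj(self.prev_uj, cur_uj, self.max_range_uj);
        self.prev_uj = cur_uj;
        self.total_uj += delta;
        delta
    }

    pub fn total_uj(&self) -> u64 {
        self.total_uj
    }
}

/// Reads one unsigned integer from a sysfs-style text file
/// (hypothetical helper, e.g. for `energy_uj`).
pub fn read_u64(path: &Path) -> io::Result<u64> {
    let text = fs::read_to_string(path)?;
    text.trim()
        .parse::<u64>()
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

In a polling loop, a tool would call `read_u64` on the counter file at each tick and pass the value to `EnergyCounter::update`. The polling period has to be shorter than the time the counter needs to wrap, otherwise several wraps can go unnoticed.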

Paper Structure (36 sections, 1 equation, 11 figures, 7 tables)

Figures (11)

  • Figure 1: Hierarchy of the possible RAPL domains and their corresponding hardware components. Domain names are in italic, and grayed items do not form a domain on their own. Items marked with an asterisk are not present on servers.
  • Figure 2: Architecture of our measurement tool
  • Figure 3: Plot of the actual measurement frequency achieved by the tools for each target frequency supplied as a command-line argument (on a Lenovo Thinkpad L15 Gen1 laptop)
  • Figure 4: Measurement mechanism based on perf-events and eBPF
  • Figure 5: Benchmark repetitions spread over time
  • ...and 6 more figures