ArchPower: Dataset for Architecture-Level Power Modeling of Modern CPU Design

Qijun Zhang, Yao Lu, Mengming Li, Shang Liu, Zhiyao Xie

TL;DR

ArchPower tackles the lack of open data for ML-based architecture-level power modeling by providing the first open dataset of 200 samples across 25 CPU configurations and 8 workloads, with a 101-feature architectural vector and 60 ground-truth power labels (covering per-component power and four power groups). The dataset is produced via a realistic RTL-to-power workflow that includes clock gating and SRAM macros, and is complemented by a training-testing framework to evaluate generalization under different data distributions. The authors benchmark six power models (analytical and ML-based) and show that ML-based approaches generally improve total and per-component power prediction, though cross-workload generalization remains an area for improvement. By enabling reproducibility and broader experimentation, ArchPower aims to accelerate ML-driven hardware power optimization and design-space exploration across architectures.

Abstract

Power is the primary design objective of large-scale integrated circuits (ICs), especially for complex modern processors (i.e., CPUs). Accurate CPU power evaluation requires designers to go through the whole time-consuming IC implementation process, which can easily take months. At the early design stage (e.g., architecture level), classical power models are notoriously inaccurate. Recently, ML-based architecture-level power models have been proposed to boost accuracy, but data availability is a severe challenge. Currently, there is no open-source dataset for this important ML application. A typical dataset generation process involves correct CPU design implementation and repeated execution of power simulation flows, requiring significant design expertise, engineering effort, and execution time. Even private in-house datasets often fail to reflect realistic CPU design scenarios. In this work, we propose ArchPower, the first open-source dataset for architecture-level processor power modeling. We go through complex and realistic design flows to collect the CPU architectural information as features and the ground-truth simulated power as labels. Our dataset includes 200 CPU data samples, collected from 25 different CPU configurations executing 8 different workloads. Each data sample has more than 100 architectural features, including both hardware and event parameters. The label of each sample provides fine-grained power information, including the total design power and the power of each of the 11 components. Each power value is further decomposed into four fine-grained power groups: combinational logic power, sequential logic power, memory power, and clock power. ArchPower is available at https://github.com/hkust-zhiyao/ArchPower.
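The dataset dimensions quoted above can be checked with a short sketch. It assumes that the 200 samples are the full cross product of configurations and workloads, and that the 60 labels arise as (1 total + 11 components) × (1 overall value + 4 power groups). All identifiers (`cfg0`, `wl0`, `comp0`, `split_by_workload`) are hypothetical placeholders rather than names from the ArchPower repository, and the leave-one-workload-out split is only one possible way to probe cross-workload generalization, not necessarily the authors' framework.

```python
from itertools import product

# Counts stated in the abstract.
NUM_CONFIGS = 25
NUM_WORKLOADS = 8
NUM_COMPONENTS = 11
POWER_GROUPS = ["combinational", "sequential", "memory", "clock"]

# Hypothetical identifiers; the real dataset's names may differ.
configs = [f"cfg{i}" for i in range(NUM_CONFIGS)]
workloads = [f"wl{j}" for j in range(NUM_WORKLOADS)]

# Assumption: one sample per (configuration, workload) pair.
samples = list(product(configs, workloads))

# Label scopes: the whole design plus each component.
scopes = ["total"] + [f"comp{k}" for k in range(NUM_COMPONENTS)]

# Assumed layout: per scope, the overall value plus its four group values.
label_names = [f"{scope}/{group}"
               for scope in scopes
               for group in ["all"] + POWER_GROUPS]


def split_by_workload(all_samples, held_out):
    """Hold out every sample of one workload for testing."""
    train = [s for s in all_samples if s[1] != held_out]
    test = [s for s in all_samples if s[1] == held_out]
    return train, test


train, test = split_by_workload(samples, "wl0")
print(len(samples), len(label_names), len(train), len(test))
```

Under these assumptions, holding out one workload leaves 7 workloads × 25 configurations for training and 25 samples for testing.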

Paper Structure

This paper contains 29 sections, 4 equations, 10 figures, 8 tables.

Figures (10)

  • Figure 1: Comparison between (a) standard power evaluation flow and (b) architecture-level power evaluation flow. The architecture-level power modeling flow is significantly more efficient than the standard power evaluation flow. ArchPower provides labeled data for ML-based architecture-level power modeling.
  • Figure 2: (a) The architecture of the modern high-performance CPU core. Blue blocks are major components. The yellow block represents the Other Logic. (b) A layout example of one BOOM CPU.
  • Figure 3: Dataset and data generation process of ArchPower. Our dataset mainly includes the architectural power modeling features and power labels. The features can further be masked with component masks to generate per-component features. ArchPower generates architectural power modeling features through the architectural event collection and generates power labels through design implementation and ground-truth power collection.
  • Figure 4: The power distributions across 11 components of 6 different CPU configurations (B1, B7, B15, X1, X5, X10) with different scales.
  • Figure 5: Predictions of different models on the BOOM CPU under the Balance training scenario.
  • ...and 5 more figures
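
The component masking mentioned in the Figure 3 caption can be illustrated with a minimal sketch. It assumes a mask is a binary vector the same length as the feature vector and that masked-out features are set to zero; the actual ArchPower mask format may differ (for example, it might select a feature subset instead). The feature values, component names, and `mask_features` helper are hypothetical.

```python
def mask_features(features, mask):
    """Keep features selected by the mask; zero out the rest (assumed behavior)."""
    if len(features) != len(mask):
        raise ValueError("feature vector and mask must have the same length")
    return [value if keep else 0.0 for value, keep in zip(features, mask)]


# Hypothetical 6-feature sample; the real dataset has about 100 features.
features = [4.0, 128.0, 0.35, 2.0, 64.0, 0.80]

# Hypothetical component masks: 1 marks a feature relevant to the component.
masks = {
    "frontend": [1, 1, 1, 0, 0, 0],
    "dcache": [0, 0, 0, 1, 1, 1],
}

# One masked feature vector per component, derived from the same sample.
per_component = {name: mask_features(features, m) for name, m in masks.items()}
print(per_component["frontend"])
```

Each masked vector could then be paired with the matching per-component power label to train a per-component model.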