A Robust Power Model Training Framework for Cloud Native Runtime Energy Metric Exporter

Sunyanan Choochotkaew, Chen Wang, Huamin Chen, Tatsuhiro Chiba, Marcelo Amaral, Eun Kyung Lee, Tamar Eilam

TL;DR

Problem: Estimating per-container power in multi-tenant clouds without platform access or online measurements. Approach: a cloud-native training pipeline integrated with Kepler trains a system power model $M_{sys}$ on aggregated usage $U$ to predict system power $P$, then isolates background power via $P_{U-x}$ and computes workload power $\Delta P_x = P - P_{U-x}$. The best candidate is selected by isolation goodness $\rho$ against a threshold $\rho_{th}$, and a per-container model $M$ is then trained on the labels $\Delta P$. Contributions: dynamic background power isolation, a formal isolation goodness metric, cross-workload/cross-platform validation with online training, and the ability to train without platform data. Significance: enables non-RAPL container power estimation for unseen containers on unknown platforms, supporting energy-aware cloud management and carbon accounting.
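
The isolation step $\Delta P_x = P - P_{U-x}$ can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the single-feature linear model standing in for $M_{sys}$, the function names `fit_linear` and `isolate_workload_power`, and the synthetic samples. It is not Kepler's API or the authors' implementation, and the selection by isolation goodness $\rho$ is omitted because this summary does not define it.

```python
# Hypothetical sketch: the linear model and all names are assumptions,
# not the paper's pipeline or Kepler's API.

def fit_linear(usage, power):
    """Least-squares fit of power = slope * usage + intercept (stand-in for M_sys)."""
    n = len(usage)
    mean_u = sum(usage) / n
    mean_p = sum(power) / n
    cov = sum((u - mean_u) * (p - mean_p) for u, p in zip(usage, power))
    var = sum((u - mean_u) ** 2 for u in usage)
    slope = cov / var
    return slope, mean_p - slope * mean_u


def isolate_workload_power(model, total_usage, workload_usage):
    """Return delta P_x = P - P_{U-x} for each sample."""
    slope, intercept = model
    deltas = []
    for u, x in zip(total_usage, workload_usage):
        p = slope * u + intercept                    # P: predicted power at full usage U
        p_background = slope * (u - x) + intercept   # P_{U-x}: usage of workload x removed
        deltas.append(p - p_background)
    return deltas


# Synthetic samples: aggregated usage U and system power in watts (2 * U + 50).
total_usage = [10.0, 20.0, 30.0, 40.0]
system_power = [70.0, 90.0, 110.0, 130.0]
workload_usage = [4.0, 5.0, 10.0, 0.0]

m_sys = fit_linear(total_usage, system_power)
delta_p = isolate_workload_power(m_sys, total_usage, workload_usage)
print(delta_p)
```

With a purely linear stand-in, the isolated power reduces to the slope times the workload's own usage. A nonlinear system model would make $\Delta P_x$ depend on the background usage as well, which is presumably where the isolation goodness check matters.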

Abstract

Estimating power consumption in modern Cloud environments is essential for carbon quantification toward green computing. Specifically, it is important to properly account for the power consumed by each of the running applications, which are packaged as containers. This paper examines multiple challenges associated with this goal. The first challenge is that multiple customers share the same hardware platform (multi-tenancy), where information on the physical servers is mostly obscured. The second challenge is the overhead in power consumption that the Cloud platform control plane induces. This paper addresses these challenges and introduces a novel pipeline framework for power model training. The framework allows versatile approximation of the power consumption of individual containers on the basis of available performance counters and other metrics. The proposed model uses machine learning techniques to predict the power consumed by the control plane and associated processes, and uses that prediction to isolate the power consumed by the user containers from the server power consumption. To determine how well the prediction achieves this isolation, we introduce a metric termed isolation goodness. Applying the proposed power model does not require online power measurements, nor does it need information on the physical servers, their configuration, or other tenants sharing the same machine. The results of cross-workload, cross-platform experiments demonstrate the higher accuracy of the proposed model when predicting the power consumption of unseen containers on unknown platforms, including on virtual machines.

Paper Structure (22 sections, 12 equations, 13 figures, 2 tables, 1 algorithm)

Figures (13)

  • Figure 1: (a) Workload usage is not always correlated to (b) power consumption due to noisy background processes.
  • Figure 2: Dynamic power isolation for model training.
  • Figure 3: Snapshot of normalized Kepler metrics showing high correlation between resource usage and power consumption when running Coremark benchmark.
  • Figure 4: Correlation between resource usage from different metric producers and RAPL power for each benchmark.
  • Figure 5: Non-RAPL power modeling.
  • ...and 8 more figures