A Robust Power Model Training Framework for Cloud Native Runtime Energy Metric Exporter
Sunyanan Choochotkaew, Chen Wang, Huamin Chen, Tatsuhiro Chiba, Marcelo Amaral, Eun Kyung Lee, Tamar Eilam
TL;DR
Problem: Estimating per-container power in multi-tenant clouds without platform access or online measurements. Approach: A cloud-native training pipeline integrated with Kepler trains a system power model $M_{sys}$ on aggregated usage $U$ to predict power $P$. It then isolates the background power $P_{U-x}$, computes the workload power $\Delta P_x = P - P_{U-x}$, and selects the best candidate by isolation goodness $\rho$ against a threshold $\rho_{th}$ before training a per-container model $M$ on the labels $\Delta P$. Contributions: dynamic background power isolation, a formal isolation goodness metric, cross-workload and cross-platform validation with online training, and the ability to train without platform data. Significance: Enables non-RAPL container power estimation for unseen containers on unknown platforms, supporting energy-aware cloud management and carbon accounting.
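The isolation step $\Delta P_x = P - P_{U-x}$ can be illustrated with a minimal Python sketch. Everything named here is hypothetical: `toy_m_sys` is a made-up linear stand-in for a trained $M_{sys}$, the usage figures are invented, and the sketch assumes both $P$ and $P_{U-x}$ are obtained by evaluating the same system model. The selection by $\rho$ and $\rho_{th}$ is omitted because its definition is not given in this summary.

```python
from typing import Callable, Dict


def isolate_workload_power(
    m_sys: Callable[[float], float],
    usage_by_container: Dict[str, float],
    target: str,
) -> float:
    """Return delta_P_x = P - P_{U-x} for the target container x."""
    total_usage = sum(usage_by_container.values())                # U
    background_usage = total_usage - usage_by_container[target]   # U - x
    p_total = m_sys(total_usage)             # P: power at full aggregated usage
    p_background = m_sys(background_usage)   # P_{U-x}: power without container x
    return p_total - p_background


def toy_m_sys(u: float) -> float:
    # Hypothetical stand-in for a trained system model:
    # 50 W idle power plus 2 W per unit of aggregated usage.
    return 50.0 + 2.0 * u


# Invented usage values, in arbitrary units.
usage = {"control-plane": 10.0, "app-a": 30.0, "app-b": 5.0}
delta_p = isolate_workload_power(toy_m_sys, usage, "app-a")
print(delta_p)
```

With a purely linear stand-in the idle term cancels, so $\Delta P_x$ depends only on the target's own usage; a nonlinear system model would make the result depend on the background load as well.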
Abstract
Estimating power consumption in modern Cloud environments is essential for carbon quantification toward green computing. Specifically, it is important to properly account for the power consumed by each of the running applications, which are packaged as containers. This paper examines multiple challenges associated with this goal. The first challenge is that multiple customers share the same hardware platform (multi-tenancy), where information on the physical servers is mostly obscured. The second challenge is the power consumption overhead induced by the Cloud platform control plane. This paper addresses these challenges and introduces a novel pipeline framework for power model training. The framework allows versatile approximation of the power consumption of individual containers on the basis of available performance counters and other metrics. The proposed model uses machine learning techniques to predict the power consumed by the control plane and associated processes, and uses this prediction to isolate the power consumed by the user containers from the server power consumption. To determine how well the prediction achieves this isolation, we introduce a metric termed isolation goodness. Applying the proposed power model requires neither online power measurements nor information on the physical servers, their configuration, or other tenants sharing the same machine. The results of cross-workload, cross-platform experiments demonstrated the higher accuracy of the proposed model when predicting the power consumption of unseen containers on unknown platforms, including on virtual machines.
