Adaptive Machine Learning for Resource-Constrained Environments
Sebastián A. Cajas Ordóñez, Jaydeep Samanta, Andrés L. Suárez-Cetrulo, Ricardo Simón Carbajo
TL;DR
The paper tackles CPU utilization prediction for ad-hoc offloading in IoT gateways under non-stationary data streams. It compares offline state-of-the-art models, online incremental learners, and a time-series foundation model, Lag-Llama, on a CPU utilization dataset collected at 1-minute intervals, and evaluates them with multiple error and resource metrics. The findings show that ensemble methods (notably XGBoost and Random Forest) deliver strong accuracy with a low resource footprint, that online adaptive methods offer favorable trade-offs for continuous learning, and that Lag-Llama generally underperforms in this edge-focused setup. The work provides guidance for model selection in constrained IoT environments and contributes a data-stream benchmark for CPU performance estimation.
Abstract
The Internet of Things is an example domain where data is perpetually generated in ever-increasing quantities, reflecting the proliferation of connected devices and the formation of continuous data streams over time. Consequently, there is a growing demand for ad-hoc, cost-effective machine learning solutions that can adapt to this evolving data influx. This study tackles the task of offloading in small gateways, which is made harder by their dynamic availability over time. An approach that uses CPU utilization metrics with online and continual machine learning techniques is proposed to predict gateway availability. These methods are compared to popular machine learning algorithms and to a recent time-series foundation model, Lag-Llama, in fine-tuned and zero-shot setups. All models are benchmarked on a dataset of CPU utilization measurements collected over time from an IoT gateway, with a focus on prediction errors, training and inference times, and memory consumption. Our primary objective is to study new, efficient ways to predict CPU performance in IoT environments. Across various scenarios, our findings highlight that ensemble and online methods offer promising accuracy for this task while maintaining a low resource footprint.
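To make the evaluation setup described in the abstract more concrete, the following is a minimal sketch of a test-then-train (prequential) loop for an online learner on a CPU utilization stream, reporting prediction error, elapsed time, and peak memory. It is an illustration under stated assumptions and not the authors' pipeline: the data is synthetic, the names `synthetic_cpu_stream`, `OnlineLagRegressor`, and `prequential_eval` are hypothetical, and the learner is a simple normalized least-mean-squares regressor on lagged readings rather than any model evaluated in the paper.

```python
import math
import random
import time
import tracemalloc


def synthetic_cpu_stream(n, seed=0):
    """Hypothetical stand-in for 1-minute CPU utilization readings (percent).

    Assumption: a daily cycle plus noise, with a level shift halfway
    through to mimic a non-stationary stream.
    """
    rng = random.Random(seed)
    values = []
    for t in range(n):
        base = 40.0 if t < n // 2 else 60.0
        daily = 15.0 * math.sin(2 * math.pi * t / 1440)
        values.append(min(100.0, max(0.0, base + daily + rng.gauss(0, 2))))
    return values


class OnlineLagRegressor:
    """Linear model on the last `lags` readings, updated one sample at a time
    with a normalized least-mean-squares step (illustrative choice)."""

    def __init__(self, lags=5, lr=0.5):
        self.lags = lags
        self.lr = lr
        self.w = [0.0] * lags
        self.b = 0.0

    def predict(self, x):
        return self.b + sum(wi * xi for wi, xi in zip(self.w, x))

    def learn(self, x, y):
        err = y - self.predict(x)
        # The 1.0 accounts for the constant bias input in the normalization.
        norm = 1.0 + sum(xi * xi for xi in x)
        step = self.lr * err / norm
        self.w = [wi + step * xi for wi, xi in zip(self.w, x)]
        self.b += step


def prequential_eval(stream, model):
    """Predict each reading first, then train on it (test-then-train)."""
    tracemalloc.start()
    start = time.perf_counter()
    abs_err, count = 0.0, 0
    for t in range(model.lags, len(stream)):
        x = [v / 100.0 for v in stream[t - model.lags:t]]  # scale to [0, 1]
        y = stream[t] / 100.0
        abs_err += abs(y - model.predict(x))  # evaluate before learning
        model.learn(x, y)                     # then update the model
        count += 1
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "mae_percent": 100.0 * abs_err / count,
        "seconds": elapsed,
        "peak_bytes": peak,
        "n": count,
    }


result = prequential_eval(synthetic_cpu_stream(2000), OnlineLagRegressor())
print(result)
```

Because every reading is scored before the model learns from it, the reported error reflects performance on unseen data without a separate hold-out split, which is one common way to compare continuously updated learners on a stream. An offline model could be compared in the same harness by replacing `learn` with periodic retraining.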
