EdgeMLBalancer: A Self-Adaptive Approach for Dynamic Model Switching on Resource-Constrained Edge Devices
Akhila Matathammal, Kriti Gupta, Larissa Lavanya, Ananya Vishal Halgatti, Priyanshi Gupta, Karthik Vaidhyanathan
TL;DR
EdgeMLBalancer addresses the challenge of running real-time AI on resource-constrained edge devices by enabling self-adaptive, CPU-aware switching between multiple object-detection configurations. It integrates a MAPE-K feedback loop with an epsilon-greedy planner to select among models in real time, balancing accuracy and computational load. Empirical evaluation on real-world Indian traffic data demonstrates improved inference accuracy and resource efficiency, along with fair model usage and modest switching overhead. The work offers a practical, on-device adaptive framework for dynamic workload management in edge AI systems, with potential extensions to other devices and hybrid edge-cloud setups.
Abstract
The widespread adoption of machine learning on edge devices such as mobile phones, laptops, and IoT devices has enabled real-time AI applications in resource-constrained environments. Existing solutions for managing computational resources often focus narrowly on accuracy or energy efficiency and fail to adapt dynamically to varying workloads. Furthermore, existing systems lack robust mechanisms to adaptively balance CPU utilization, leading to inefficiencies in resource-constrained scenarios such as real-time traffic monitoring. To address these limitations, we propose a self-adaptive approach that optimizes CPU utilization and resource management on edge devices. Our approach, EdgeMLBalancer, balances between models through dynamic switching, guided by real-time monitoring of CPU usage across processor cores. Tested on real-time traffic data, the approach switches between object detection models based on CPU usage, ensuring efficient resource utilization. It leverages an epsilon-greedy strategy, which promotes fairness and prevents resource starvation, maintaining system robustness. Our evaluation demonstrates significant improvements in balancing computational efficiency and accuracy, highlighting the approach's ability to adapt seamlessly to varying workloads. This work lays the groundwork for further advancements in self-adaptation for resource-constrained environments.
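The epsilon-greedy model switching described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `ModelStats` record, the `reward` function that trades accuracy against CPU load, the `cpu_weight` parameter, and the model names in the usage example are all assumptions introduced here for illustration only.

```python
import random
from dataclasses import dataclass


@dataclass
class ModelStats:
    """Running record for one candidate detector (hypothetical structure)."""
    name: str
    uses: int = 0
    mean_reward: float = 0.0

    def update(self, observed_reward: float) -> None:
        # Incremental mean, so no history needs to be stored on the device.
        self.uses += 1
        self.mean_reward += (observed_reward - self.mean_reward) / self.uses


def reward(accuracy: float, cpu_percent: float, cpu_weight: float = 0.5) -> float:
    """Assumed reward: accuracy in [0, 1] minus a penalty proportional to CPU load."""
    return accuracy - cpu_weight * (cpu_percent / 100.0)


def select_model(stats: list, epsilon: float, rng: random.Random) -> ModelStats:
    """Epsilon-greedy choice: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        # Exploration gives every model a chance to run, which avoids starvation.
        return rng.choice(stats)
    # Exploitation picks the model with the best observed trade-off so far.
    return max(stats, key=lambda s: s.mean_reward)


if __name__ == "__main__":
    rng = random.Random(42)
    # Placeholder model names, not taken from the paper.
    candidates = [ModelStats("detector_small"), ModelStats("detector_large")]
    candidates[0].update(reward(accuracy=0.70, cpu_percent=35.0))
    candidates[1].update(reward(accuracy=0.85, cpu_percent=90.0))
    chosen = select_model(candidates, epsilon=0.1, rng=rng)
    print("selected:", chosen.name)
```

In a MAPE-K loop, the monitor stage would supply the CPU reading, the planner would call something like `select_model`, and the execute stage would swap the active detector. The exploration probability `epsilon` controls how often a model other than the current best is tried, which is one way to keep model usage fair.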
