QoS-Aware Load Balancing in the Computing Continuum via Multi-Player Bandits
Ivan Čilić, Ivana Podnar Žarko, Pantelis Frangoudis, Schahram Dustdar
TL;DR
The paper tackles per-client QoS guarantees in the dynamic Computing Continuum by formulating decentralized load balancing as a Multi-Player Multi-Armed Bandit (MP-MAB) problem with heterogeneous rewards. It introduces QEdgeProxy, which maintains a QoS pool per load balancer, built with Kernel Density Estimation (KDE), and combines adaptive epsilon exploration with Smooth Weighted Round Robin routing to balance exploitation and exploration without central coordination. The approach yields sublinear regret in non-stationary conditions. In a Kubernetes-native implementation, it achieves higher per-client QoS satisfaction (95–100%) than proximity-based and reinforcement-learning baselines, distributes load fairly, and remains resilient to workload and instance changes. The results indicate that decentralized MP-MAB-based load balancing with KDE QoS estimation is effective and scalable for edge-to-cloud environments where global coordination is costly or impractical.
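The Smooth Weighted Round Robin routing mentioned above can be illustrated with a minimal sketch of the generic algorithm. It is not the QEdgeProxy implementation: the function name, the instance names, and the integer weights are hypothetical, and how QEdgeProxy derives weights from its QoS pool is not described in this summary.

```python
def smooth_weighted_round_robin(weights, n_requests):
    """Return the instances picked for n_requests consecutive requests.

    weights: dict mapping instance name -> positive integer weight
             (hypothetical values; not taken from the paper).
    """
    total = sum(weights.values())
    current = {name: 0 for name in weights}  # running "credit" per instance
    picks = []
    for _ in range(n_requests):
        # Every instance earns credit proportional to its weight.
        for name, weight in weights.items():
            current[name] += weight
        # Route to the instance with the most credit (ties: first in dict order).
        chosen = max(current, key=current.get)
        # The chosen instance pays back the total weight, which spreads its
        # turns out instead of sending it a burst of consecutive requests.
        current[chosen] -= total
        picks.append(chosen)
    return picks


schedule = smooth_weighted_round_robin({"a": 5, "b": 1, "c": 1}, 7)
print("".join(schedule))  # prints "aabacaa"
```

Over one full cycle of 7 requests, instance `a` receives 5 and the others 1 each, but the turns of `a` are interleaved with `b` and `c` rather than issued back to back.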
Abstract
As computation shifts from the cloud to the edge to reduce processing latency and network traffic, the resulting Computing Continuum (CC) creates a dynamic environment in which meeting strict Quality of Service (QoS) requirements and avoiding service instance overload become challenging. Existing methods often prioritize global metrics and overlook per-client QoS, which is crucial for latency-sensitive and reliability-critical applications. We propose QEdgeProxy, a decentralized QoS-aware load balancer that acts as a proxy between IoT devices and service instances in the CC. We formulate the load balancing problem as a Multi-Player Multi-Armed Bandit (MP-MAB) with heterogeneous rewards: each load balancer autonomously selects service instances to maximize the probability of meeting its clients' QoS requirements, using Kernel Density Estimation (KDE) to estimate QoS success probabilities. Our load-balancing algorithm also incorporates an adaptive exploration mechanism to recover rapidly from performance shifts and non-stationary conditions. We present a Kubernetes-native QEdgeProxy implementation and evaluate it on an emulated CC testbed deployed on a K3s cluster with realistic network conditions and a latency-sensitive edge-AI workload. Results show that QEdgeProxy significantly outperforms proximity-based and reinforcement-learning baselines in per-client QoS satisfaction, while adapting effectively to load surges and changes in instance availability.
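To make the idea of estimating a QoS success probability with KDE concrete, the following is a minimal sketch. It assumes a Gaussian kernel, a latency threshold as the QoS requirement, and a normal-reference bandwidth rule; none of these choices is confirmed by the abstract, and the function name and sample values are hypothetical. With a Gaussian kernel, the estimated probability that latency stays at or below a threshold is the average of the kernel CDFs centered on the observed samples.

```python
import math
import statistics


def kde_success_probability(latencies_ms, threshold_ms, bandwidth=None):
    """Estimate P(latency <= threshold_ms) from samples with a Gaussian KDE.

    All parameters are illustrative assumptions, not the paper's settings.
    """
    n = len(latencies_ms)
    if n == 0:
        raise ValueError("at least one latency sample is required")
    if bandwidth is None:
        # Normal-reference rule of thumb: h = 1.06 * sigma * n^(-1/5).
        sigma = statistics.pstdev(latencies_ms)
        bandwidth = 1.06 * sigma * n ** (-1 / 5)
        if bandwidth == 0:
            bandwidth = 1.0  # identical samples: fall back to a fixed width

    def gaussian_cdf(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

    # Integrating a Gaussian KDE up to the threshold equals the mean of the
    # per-sample Gaussian CDFs evaluated at the threshold.
    return sum(gaussian_cdf((threshold_ms - x) / bandwidth)
               for x in latencies_ms) / n


# Hypothetical latency samples (ms) for two service instances.
fast_instance = [20, 22, 25, 21, 23]
slow_instance = [80, 90, 85, 95, 100]
p_fast = kde_success_probability(fast_instance, threshold_ms=50)
p_slow = kde_success_probability(slow_instance, threshold_ms=50)
```

In this sketch a load balancer would prefer `fast_instance`, whose estimated success probability is close to 1, over `slow_instance`, whose estimate is close to 0.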
