Fully Distributed Fog Load Balancing with Multi-Agent Reinforcement Learning
Maad Ebrahim, Abdelhakim Hafid
TL;DR
The paper optimizes load balancing for real-time IoT workloads in Fog networks using fully distributed multi-agent reinforcement learning (MARL). Independent agents deployed at IoT access points (APs) learn local load-distribution policies over regional candidate Fog nodes. They are aided by lifelong transfer learning and by interval-based Gossip observations, which reflect realistic communication timing without inter-agent coordination. Key contributions include a scalable, fully distributed MARL framework, a region-based decomposition that reduces state and action complexity, and a realism-versus-performance analysis of interval-based observations. Results show that independently trained agents converge faster and achieve lower waiting delays with fair resource utilization, although a practical performance trade-off appears when observations are not available in real time. The approach is positioned as deployment-ready for global-scale Fog environments.
Abstract
Real-time Internet of Things (IoT) applications need timely access to computing resources to process their ever-growing workloads. Fog Computing makes such resources highly available in a distributed manner. However, these resources must be managed efficiently so that unpredictable traffic demands are distributed among heterogeneous Fog resources. This paper proposes a fully distributed load-balancing solution based on Multi-Agent Reinforcement Learning (MARL) that distributes IoT workloads to minimize waiting time while providing fair resource utilization in the Fog network. The agents use transfer learning for lifelong self-adaptation to dynamic changes in the environment. By leveraging distributed decision-making, the MARL agents reduce waiting time compared to a single centralized agent and other baselines, which improves end-to-end execution delay. Beyond the performance gain, a fully distributed solution allows for a global-scale implementation in which agents work independently within small collaboration regions and leverage nearby local resources. Furthermore, we analyze the impact of observing the state of the environment at a realistic frequency, in contrast to the common but unrealistic assumption in the literature that observations are readily available in real time for every required action. The findings highlight the trade-off between realism and performance when an interval-based Gossip multicasting protocol is used instead of assuming real-time observation availability for every generated workload.
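To make the realism-versus-performance trade-off concrete, the following minimal sketch compares an agent that refreshes its view of Fog node load before every workload against one that refreshes it only every few workloads. It is an illustration under stated assumptions, not the method of the paper: a greedy shortest-wait rule stands in for the learned MARL policy, a single agent is modeled, the periodic refresh only mimics the timing of an interval-based observation rather than implementing a Gossip protocol, and the names `FogNode`, `simulate`, the service rates, and the interval values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FogNode:
    service_rate: float   # work drained between two arrivals (hypothetical unit)
    queue: float = 0.0    # outstanding work on this node

    def waiting_time(self) -> float:
        # Time a newly assigned workload would wait before being served.
        return self.queue / self.service_rate

    def drain(self) -> None:
        # Serve work for one inter-arrival period.
        self.queue = max(0.0, self.queue - self.service_rate)


def simulate(interval: int, n_workloads: int = 300) -> float:
    """Return the mean waiting time when observations refresh every `interval` workloads.

    interval == 1 models the idealized real-time assumption;
    interval > 1 models stale, interval-based observations.
    """
    # Assumed heterogeneous region of three candidate nodes.
    nodes: List[FogNode] = [FogNode(0.3), FogNode(0.4), FogNode(0.5)]
    observed: List[float] = []
    total_wait = 0.0
    for k in range(n_workloads):
        if k % interval == 0:
            # Refresh the (possibly stale) view of each node's waiting time.
            observed = [n.waiting_time() for n in nodes]
        # Greedy stand-in for a learned policy: pick the lowest observed wait.
        target = min(range(len(nodes)), key=lambda i: observed[i])
        total_wait += nodes[target].waiting_time()  # true wait, not the observed one
        nodes[target].queue += 1.0                  # each workload is one unit of work
        for n in nodes:
            n.drain()
    return total_wait / n_workloads


if __name__ == "__main__":
    for interval in (1, 5, 10):
        print(f"interval={interval:2d}  mean wait={simulate(interval):.2f}")
```

In this toy setting the stale agent keeps sending consecutive workloads to the node that looked idle at the last refresh, so its mean waiting time is higher than with per-workload observations. A learned policy could compensate for staleness in ways that this greedy rule cannot, so the sketch only shows why the observation interval matters, not how large the effect is in practice.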
