Fair Distributed Cooperative Bandit Learning on Networks for Intelligent Internet of Things Systems (Technical Report)
Ziqun Chen, Kechao Cai, Jinbei Zhang, Zhigang Yu
TL;DR
This work tackles data collection in IoT edge networks where multiple servers must cooperatively learn sensor data rates under unknown channel conditions. It models the problem as a multiplayer multi-armed bandit with collisions and fairness constraints, and introduces the DC-ULCB algorithm that combines initialization, running consensus for rate estimation, and cyclic fair sensor assignment. The authors prove instance-dependent logarithmic upper bounds on reward and fairness regrets and demonstrate via simulations that DC-ULCB outperforms existing collision-aware and cooperative methods. The approach enables fair, efficient data gathering in distributed IoT systems with limited inter-server communication, improving both throughput and equity among servers.
Abstract
In intelligent Internet of Things (IoT) systems, edge servers within a network exchange information with their neighbors and collect data from sensors to complete the tasks delivered to them. In this paper, we propose a multiplayer multi-armed bandit model for intelligent IoT systems that facilitates data collection and incorporates fairness considerations. Within this model, we establish an effective communication protocol that helps servers cooperate with their neighbors. We then design a distributed cooperative bandit algorithm, DC-ULCB, that enables servers to collaboratively select sensors so as to maximize data rates while maintaining fairness in their choices. We analyze the reward regret and fairness regret of DC-ULCB and prove that both have logarithmic instance-dependent upper bounds. Extensive simulations further show that DC-ULCB outperforms existing algorithms in both maximizing reward and ensuring fairness.
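As a rough illustration of how servers might select sensors fairly and without collisions, the sketch below rotates the servers over the sensors with the highest estimated rates. It is only one plausible reading of the cyclic assignment idea, not the DC-ULCB procedure itself: the function name `cyclic_assignment`, the example rates, and the assumption that every server already shares the same rate estimates and a common round counter are all hypothetical and are not taken from the paper.

```python
def cyclic_assignment(estimated_rates, num_servers, t):
    """Return a list whose entry i is the sensor index used by server i in round t.

    Assumes num_servers <= len(estimated_rates) and that all servers hold
    identical estimates (hypothetical simplification).
    """
    # Rank sensor indices by estimated rate, highest first, and keep the top M.
    ranked = sorted(range(len(estimated_rates)),
                    key=lambda k: estimated_rates[k], reverse=True)
    top = ranked[:num_servers]
    # Server i takes the sensor at offset (i + t) mod M, so the choices
    # rotate every round and no two servers pick the same sensor.
    return [top[(i + t) % num_servers] for i in range(num_servers)]


rates = [0.2, 0.9, 0.5, 0.7]  # hypothetical estimated data rates for 4 sensors
M = 3                          # hypothetical number of servers
schedule = [cyclic_assignment(rates, M, t) for t in range(M)]
```

Over any window of M consecutive rounds, each server in this sketch visits each of the top M sensors exactly once, so the long-run reward is split evenly while no round contains a collision.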
