Fair Distributed Cooperative Bandit Learning on Networks for Intelligent Internet of Things Systems (Technical Report)

Ziqun Chen; Kechao Cai; Jinbei Zhang; Zhigang Yu

TL;DR

This work tackles data collection in IoT edge networks where multiple servers must cooperatively learn sensor data rates under unknown channel conditions. It models the problem as a multiplayer multi-armed bandit with collisions and fairness constraints, and introduces the DC-ULCB algorithm that combines initialization, running consensus for rate estimation, and cyclic fair sensor assignment. The authors prove instance-dependent logarithmic upper bounds on reward and fairness regrets and demonstrate via simulations that DC-ULCB outperforms existing collision-aware and cooperative methods. The approach enables fair, efficient data gathering in distributed IoT systems with limited inter-server communication, improving both throughput and equity among servers.
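The cyclic fair assignment mentioned above can be illustrated with a small sketch. It is not the paper's DC-ULCB procedure: it assumes each server already holds a distinct rank from 1 to M (as the initialization scheme provides) and that all servers agree on an ordered list of the M best sensors. The names `cyclic_assignment` and `schedule` are hypothetical.

```python
def cyclic_assignment(rank: int, t: int, num_servers: int) -> int:
    """Return the 0-based position, within the agreed list of top-M sensors,
    that the server with 1-based `rank` picks in time slot `t`.

    Assumption (not from the paper): a plain round-robin shift by one
    position per slot.
    """
    return (rank - 1 + t) % num_servers


def schedule(num_servers: int, horizon: int) -> list[list[int]]:
    """Build the assignment table: one row per slot, one column per server."""
    return [
        [cyclic_assignment(rank, t, num_servers) for rank in range(1, num_servers + 1)]
        for t in range(horizon)
    ]


if __name__ == "__main__":
    for t, row in enumerate(schedule(num_servers=3, horizon=3)):
        print(f"slot {t}: {row}")
```

Under these assumptions, no two servers pick the same position in a slot, so there are no collisions, and over any M consecutive slots every server visits every position exactly once, which is the sense in which the rotation is fair.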

Abstract

In intelligent Internet of Things (IoT) systems, edge servers within a network exchange information with their neighbors and collect data from sensors to complete delivered tasks. In this paper, we propose a multiplayer multi-armed bandit model for intelligent IoT systems to facilitate data collection and incorporate fairness considerations. In our model, we establish an effective communication protocol that helps servers cooperate with their neighbors. Then we design a distributed cooperative bandit algorithm, DC-ULCB, enabling servers to collaboratively select sensors to maximize data rates while maintaining fairness in their choices. We conduct an analysis of the reward regret and fairness regret of DC-ULCB, and prove that both regrets have logarithmic instance-dependent upper bounds. Additionally, through extensive simulations, we validate that DC-ULCB outperforms existing algorithms in maximizing reward and ensuring fairness.
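As a rough illustration of how neighbor-only communication can still lead servers to a shared estimate, the sketch below runs plain average consensus on a small network. It is a building block only, not the paper's protocol: a running consensus scheme would additionally fold in new observations every slot, and the exact update used by DC-ULCB is not given in this summary. The Metropolis weight rule, the three-server line network, and the function names are all assumptions chosen for illustration.

```python
def metropolis_weights(neighbors: list[list[int]]) -> list[list[float]]:
    """Build a symmetric, doubly stochastic mixing matrix from adjacency lists.

    Assumption: Metropolis weights, w_ij = 1 / (1 + max(deg_i, deg_j)) for
    neighbors i and j, with the remaining mass placed on the diagonal.
    """
    n = len(neighbors)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in neighbors[i]:
            weights[i][j] = 1.0 / (1 + max(len(neighbors[i]), len(neighbors[j])))
        # The diagonal is still zero here, so the row sum covers neighbors only.
        weights[i][i] = 1.0 - sum(weights[i])
    return weights


def consensus_step(weights: list[list[float]], estimates: list[float]) -> list[float]:
    """One round: each server replaces its estimate with a weighted average
    of its own value and its neighbors' values."""
    n = len(estimates)
    return [sum(weights[i][j] * estimates[j] for j in range(n)) for i in range(n)]


if __name__ == "__main__":
    # Hypothetical line network: server 0 - server 1 - server 2.
    neighbors = [[1], [0, 2], [1]]
    weights = metropolis_weights(neighbors)
    estimates = [0.2, 0.5, 0.8]  # made-up local data-rate estimates
    for _ in range(50):
        estimates = consensus_step(weights, estimates)
    print(estimates)
```

Because the mixing matrix is doubly stochastic, the network-wide average of the estimates is preserved at every round, and on a connected network the local values converge toward it.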

Paper Structure (13 sections, 3 theorems, 37 equations, 1 figure, 3 algorithms)

Introduction
Related Work
Model Description
Algorithm Design
Initialization Scheme
Running Consensus for Data Rate Estimation
DC-ULCB Algorithm
Regret Analysis
Numerical Experiments
Conclusion
Description of the Initialization Scheme
Proof of Lemma 1
Proof of Theorem 2

Key Result

Theorem 1

With probability at least $1-\delta_0$, by running the initialization scheme $\text{INIT}(N, \delta_0)$, all servers can learn $M$ and get a distinct rank from $1$ to $M$ using $N \ln(e^2N / \delta_0)$ time slots.
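To get a feel for the length of the initialization phase, the bound in Theorem 1 can be evaluated numerically. The helper below is a hypothetical illustration, and the sample values of $N$ and $\delta_0$ are made up rather than taken from the paper's experiments. Note that $N \ln(e^2 N / \delta_0) = N\,(2 + \ln(N / \delta_0))$, so the cost grows slightly faster than linearly in $N$ and only logarithmically in $1/\delta_0$.

```python
import math


def init_slots(n: int, delta0: float) -> float:
    """Evaluate N * ln(e^2 * N / delta0), the time-slot count in Theorem 1.

    `n` stands for the N passed to INIT(N, delta0); `delta0` is the allowed
    failure probability and must lie in (0, 1).
    """
    if n <= 0 or not 0.0 < delta0 < 1.0:
        raise ValueError("need n > 0 and 0 < delta0 < 1")
    return n * math.log(math.e ** 2 * n / delta0)


if __name__ == "__main__":
    # Illustrative values only.
    for n, delta0 in [(10, 0.1), (10, 0.01), (100, 0.1)]:
        print(n, delta0, math.ceil(init_slots(n, delta0)))
```

Tightening $\delta_0$ from 0.1 to 0.01 with $N$ fixed adds only $N \ln 10$ slots, which is why a small failure probability is cheap to request.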

Figures (1)

Figure 1: Experiment results on DC-ULCB

Theorems & Definitions (5)

Theorem 1
Lemma 1
Theorem 2
proof
proof
