Pure Exploration in Asynchronous Federated Bandits

Zichen Wang; Chuanhao Li; Chenyu Song; Lianghui Wang; Quanquan Gu; Huazheng Wang

TL;DR

This work proposes the first federated asynchronous multi-armed bandit and linear bandit algorithms for pure exploration with fixed confidence, and shows the proposed algorithms achieve near-optimal sample complexities and efficient communication costs in a fully asynchronous environment.

Abstract

We study the federated pure exploration problem for multi-armed bandits and linear bandits, where $M$ agents cooperatively identify the best arm by communicating with a central server. To improve robustness against the agent latency and unavailability that are common in practice, we propose the first federated asynchronous multi-armed bandit and linear bandit algorithms for pure exploration with fixed confidence. Our theoretical analysis shows that the proposed algorithms achieve near-optimal sample complexities and efficient communication costs in a fully asynchronous environment. Moreover, experimental results on synthetic and real-world data demonstrate the effectiveness and communication cost-efficiency of the proposed algorithms.

Paper Structure (36 sections, 20 theorems, 148 equations, 2 figures)

INTRODUCTION
RELATED WORK
PRELIMINARIES
Federated Bandits
Learning Objective
Communication Model and Asynchronous Environment
ASYNCHRONOUS ALGORITHMS FOR FEDERATED MAB
Low switching cost
Design of communication event
Proof sketch of Theorem 1
ASYNCHRONOUS ALGORITHM FOR FEDERATED LINEAR BANDITS
FALinPE algorithm
Design of communication events of FALinPE
Arm selection strategy
EXPERIMENTS
...and 21 more sections

Key Result

Theorem 1

With $\gamma = 1/(2MK)$ and exploration bonuses [...], the estimated best arm $\hat{k}^*$ of FAMABPE satisfies condition (1), and with probability at least $1-\delta$ the sample complexity can be bounded by [...], where [...] is the problem complexity in the MAB setting (Gabillon et al., 2012) and [...]. The communication cost satisfies $\mathcal{C}(\tau) = \tilde{O}(KM)$.
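
As a purely illustrative instance of the parameter choice in Theorem 1, take hypothetical values $M = 10$ agents and $K = 5$ arms (these numbers are assumptions, not taken from the paper):

$$
\gamma = \frac{1}{2MK} = \frac{1}{2 \cdot 10 \cdot 5} = 0.01, \qquad KM = 50 .
$$

Under these assumed values, the communication cost bound $\tilde{O}(KM)$ grows with the product $KM = 50$, where the $\tilde{O}$ notation hides logarithmic factors.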

Figures (2)

Figure 1: Synthetic data: Experimental results for federated MAB and federated linear bandits.
Figure 2: Experimental results on MovieLens for federated linear bandits.

Theorems & Definitions (35)

Theorem 1
Remark 1
Theorem 2
Remark 2
Remark 3: Global and local data in the federated MAB
Lemma 1: Communication cost
Proof of Lemma 1
Lemma 2
Lemma 3
Lemma 4
...and 25 more
