Federated Linear Contextual Bandits with Heterogeneous Clients
Ethan Blaser, Chuanhao Li, Hongning Wang
TL;DR
This work extends federated bandit learning to heterogeneous client populations by clustering clients and performing cluster-wise collaborative learning under the standard federated learning (FL) constraint that the server broadcasts a single model at a time. The HetoFedBandit framework comprises a pure-exploration phase for clustering, followed by an optimistic learning phase that leverages cross-client information within the identified clusters, all coordinated via a FIFO cluster-level communication protocol. The authors establish that the clustering is correct with high probability, derive confidence ellipsoids, and bound both regret and communication cost. They also propose two empirical enhancements (data-dependent clustering and a priority queue) that improve performance in practice. Experiments on synthetic and LastFM datasets show that HetoFedBandit and its enhanced variant achieve sub-linear regret and lower communication cost than strong baselines, highlighting the approach's practical potential for private, distributed bandit learning with heterogeneity.
Abstract
The demand for collaborative and private bandit learning across multiple agents is surging due to the growing quantity of data generated by distributed systems. Federated bandit learning has emerged as a promising framework for private, efficient, and decentralized online learning. However, almost all previous works rely on the strong assumption of client homogeneity, i.e., all participating clients must share the same bandit model; otherwise, they would all suffer linear regret. This greatly restricts the application of federated bandit learning in practice. In this work, we introduce a new approach to federated bandits with heterogeneous clients, which clusters clients for collaborative bandit learning under the federated learning setting. Our proposed algorithm achieves non-trivial sub-linear regret and communication cost for all clients, subject to the federated learning communication protocol under which the server can share only one model at any time.
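The two-phase structure described in the TL;DR (explore, cluster, then serve one cluster at a time) can be illustrated with a small sketch. This is not the authors' implementation: the paper's clustering test is not reproduced here, so a plain Euclidean distance threshold on ridge-regression estimates stands in for it, and all names (`ridge_estimate`, `cluster_clients`, `threshold`, the synthetic models) are hypothetical choices made for illustration.

```python
from collections import deque

import numpy as np


def ridge_estimate(X, y, lam=1.0):
    """Ridge-regression estimate of a client's linear bandit parameter."""
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    return np.linalg.solve(A, X.T @ y)


def cluster_clients(thetas, threshold):
    """Group clients whose estimates are close (assumed stand-in test).

    Clients i and j are linked when ||theta_i - theta_j|| <= threshold;
    clusters are the connected components of that graph.
    """
    n = len(thetas)
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        frontier = deque([start])
        while frontier:
            i = frontier.popleft()
            for j in range(n):
                close = np.linalg.norm(thetas[i] - thetas[j]) <= threshold
                if labels[j] == -1 and close:
                    labels[j] = next_label
                    frontier.append(j)
        next_label += 1
    return labels


# Phase 1 (pure exploration): each client collects random contexts and
# noisy rewards, then forms a local estimate. Synthetic data, assumed setup.
rng = np.random.default_rng(0)
d, n_explore = 5, 200
true_models = [np.ones(d), -np.ones(d)]  # two ground-truth bandit models
membership = [0, 0, 1, 1, 0]             # which model each client follows
thetas = []
for m in membership:
    X = rng.standard_normal((n_explore, d))
    y = X @ true_models[m] + 0.1 * rng.standard_normal(n_explore)
    thetas.append(ridge_estimate(X, y))

labels = cluster_clients(thetas, threshold=0.5)

# Phase 2 (coordination only): clients request collaboration, and the
# server serves one cluster at a time in first-in, first-out order.
pending = deque()
for client in [3, 0, 2]:  # arrival order of requests (hypothetical)
    cluster = labels[client]
    if cluster not in pending:  # a cluster is queued at most once
        pending.append(cluster)

served_order = []
while pending:
    served_order.append(pending.popleft())

print(labels, served_order)
```

The queue reflects the single-model constraint: because the server can share only one model at a time, clusters wait their turn rather than being updated in parallel. The priority-queue enhancement mentioned in the TL;DR would replace the `deque` with an ordering by some urgency score, which is not specified in this section.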
