Learn How to Query from Unlabeled Data Streams in Federated Learning
Yuchang Sun, Xinran Li, Tao Lin, Jun Zhang
TL;DR
This work addresses federated learning (FL) in settings where unlabeled data arrive at clients as streams and labeling is costly. It introduces LeaDQ, a multi-agent reinforcement learning framework that learns decentralized data-querying policies under centralized training with decentralized execution (CTDE), using a QMIX-based mixer to coordinate clients so that their local sample selections align with the global model objective. By formulating data querying as a Dec-POMDP with a shared held-out reward, LeaDQ enables cooperative data selection without data sharing and improves global model accuracy on image and text tasks in non-IID FL settings. In the reported experiments, LeaDQ outperforms representative baselines and remains robust to varying data arrival rates and degrees of heterogeneity, which makes it relevant to privacy-preserving, label-efficient FL in streaming environments.
Abstract
Federated learning (FL) enables collaborative learning among decentralized clients while safeguarding the privacy of their local data. Existing studies on FL typically assume that labeled data are available offline at each client when training starts. In practice, however, training data often arrive at clients in a streaming fashion without ground-truth labels. Given the high cost of annotation, it is critical to identify a subset of informative samples for labeling on clients. However, selecting samples locally while accommodating the global training objective presents a challenge unique to FL. In this work, we tackle this conundrum by framing the data querying process in FL as a collaborative decentralized decision-making problem and proposing an effective solution named LeaDQ, which leverages multi-agent reinforcement learning algorithms. In particular, under the implicit guidance of global information, LeaDQ learns local policies for distributed clients and steers them towards selecting samples that can enhance the global model's accuracy. Extensive simulations on image and text tasks show that LeaDQ improves model performance in various FL scenarios, outperforming the benchmark algorithms.
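To make the coordination idea concrete, the following is a minimal sketch of how per-client query selection and a QMIX-style monotonic mixer could fit together. It is not the authors' implementation: the function names `select_queries` and `monotonic_mix`, the linear state-conditioned weights, the per-client labeling budget, and all numeric values are assumptions made for illustration. QMIX proper uses hypernetworks and a two-layer mixing network; the single-layer form below keeps only the property that matters here, namely that the joint value is non-decreasing in each client's local value.

```python
def select_queries(q_values, budget):
    """Decentralized execution (hypothetical helper): a client picks the
    `budget` candidate samples with the highest local Q-values."""
    ranked = sorted(range(len(q_values)), key=lambda i: q_values[i], reverse=True)
    return sorted(ranked[:budget])


def monotonic_mix(agent_qs, state, weight_params, bias_params):
    """Simplified single-layer QMIX-style mixer (an assumption, not LeaDQ's mixer).

    Q_tot = sum_i |w_i(state)| * Q_i + b(state), where w_i and b are linear
    in the global state. Taking the absolute value keeps every mixing weight
    non-negative, so Q_tot never decreases when a client's Q_i increases.
    """
    weights = [abs(sum(p * s for p, s in zip(row, state))) for row in weight_params]
    bias = sum(p * s for p, s in zip(bias_params, state))
    return sum(w * q for w, q in zip(weights, agent_qs)) + bias


# Illustrative values only: three clients, three candidate samples each.
client_q_values = [[0.2, 0.7, 0.1], [0.4, 0.3, 0.9], [0.6, 0.5, 0.8]]
budget = 1  # hypothetical per-round labeling budget per client

# Each client selects locally, using only its own Q-values.
selections = [select_queries(q, budget) for q in client_q_values]

# Value of each client's chosen action (sum of Q-values of queried samples).
chosen_qs = [sum(q[i] for i in sel) for q, sel in zip(client_q_values, selections)]

# Centralized training side: mix local values using global state features.
state = [1.0, 0.5]  # hypothetical global state features
weight_params = [[0.3, -0.2], [-0.6, 0.1], [0.2, 0.4]]
bias_params = [0.1, -0.1]
q_tot = monotonic_mix(chosen_qs, state, weight_params, bias_params)

print(selections, q_tot)
```

Because the mixing weights are non-negative, each client maximizing its own local value is consistent with maximizing the joint value. This is what allows the mixer to be used only during centralized training while clients act on local information alone at execution time.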
