Greedy Shapley Client Selection for Communication-Efficient Federated Learning
Pranava Singhal, Shashi Raj Pandey, Petar Popovski
TL;DR
This paper addresses Federated Learning under strict communication budgets and client heterogeneity by introducing GreedyFed, a biased client selection method based on cumulative Shapley Values (SV). It uses GTG-Shapley, a fast Monte Carlo SV approximation, to keep SV computation tractable with many clients, and selects clients in two stages: a round-robin initialization followed by greedy selection of the top-$M$ contributors, with variants that differ in how the SV is averaged. Across multiple datasets, GreedyFed converges faster than the baselines, with high accuracy and lower variance, under data, system, and privacy heterogeneity and under timing constraints. In practice, it reduces the number of communication rounds needed while maintaining model performance, which suits FL deployments with limited communication opportunities.
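The two-stage selection described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the wrap-around in the final initialization round, and the plain running sum used for the cumulative SV are all assumptions made for the example.

```python
def select_clients(round_idx, num_clients, m, cumulative_sv):
    """Return the indices of the m clients to sample in this round.

    Stage 1 (round-robin): visit clients in index order, m per round,
    until every client has been sampled at least once.
    Stage 2 (greedy): pick the m clients with the largest cumulative SV.
    """
    num_init_rounds = -(-num_clients // m)  # ceil(num_clients / m)
    if round_idx < num_init_rounds:
        start = round_idx * m
        # Assumption: wrap around if the last init round runs past the end.
        return [(start + i) % num_clients for i in range(m)]
    # sorted() is stable, so ties keep the lower client index first.
    ranked = sorted(range(num_clients), key=lambda k: cumulative_sv[k], reverse=True)
    return ranked[:m]


def update_cumulative_sv(cumulative_sv, selected, round_sv):
    """Add this round's SV estimates to the running totals.

    Assumption: a plain sum; the paper's variants average the SV differently.
    """
    for client, value in zip(selected, round_sv):
        cumulative_sv[client] += value
    return cumulative_sv
```

With 5 clients and $M = 2$, the first three rounds in this sketch cover all clients in order, and from the fourth round on the selection depends only on the cumulative SV scores.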
Abstract
The standard client selection algorithms for Federated Learning (FL) are often unbiased and involve uniform random sampling of clients. This has been shown to be sub-optimal for fast convergence in practical settings characterized by significant heterogeneity in data distribution, computing, and communication resources across clients. For applications with timing constraints due to limited communication opportunities with the parameter server (PS), the client selection strategy is critical to completing model training within a fixed budget of communication rounds. To address this, we develop a biased client selection strategy, GreedyFed, that identifies and greedily selects the clients that contribute most in each communication round. This method builds on a fast approximation algorithm for the Shapley Value at the PS, making the computation tractable for real-world applications with many clients. Compared to various client selection strategies on several real-world datasets, GreedyFed demonstrates fast and stable convergence with high accuracy under timing constraints and under a higher degree of heterogeneity in data distribution, system constraints, and privacy requirements.
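To illustrate why a sampling-based approximation makes the Shapley Value tractable, the sketch below estimates it by averaging marginal contributions over random client orderings. It is generic permutation sampling, not GTG-Shapley itself, and the `utility` callable is a hypothetical stand-in: in an FL setting it might score the model aggregated from a coalition's updates on validation data held at the PS.

```python
import random


def monte_carlo_shapley(clients, utility, num_permutations=200, seed=0):
    """Estimate each client's Shapley Value by permutation sampling.

    clients: list of hashable client ids.
    utility: hypothetical callable mapping a list of client ids to a float
             score for that coalition (must accept the empty list).
    """
    rng = random.Random(seed)
    totals = {c: 0.0 for c in clients}
    for _ in range(num_permutations):
        order = list(clients)
        rng.shuffle(order)
        coalition = []
        prev = utility(coalition)
        for c in order:
            # Marginal contribution of c when it joins the current coalition.
            coalition.append(c)
            cur = utility(coalition)
            totals[c] += cur - prev
            prev = cur
    return {c: totals[c] / num_permutations for c in clients}
```

Each sampled permutation costs one utility evaluation per client plus one for the empty coalition, so the total cost grows linearly with the number of permutations rather than exponentially with the number of clients, as exact SV computation would.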
