FedPop: Federated Population-based Hyperparameter Tuning
Haokun Chen, Denis Krompass, Jindong Gu, Volker Tresp
TL;DR
This work tackles hyperparameter tuning in Federated Learning under tight communication and data distribution constraints by introducing FedPop, an online population-based hyperparameter tuning framework. FedPop uses evolutionary updates to optimize both server and client HP vectors across a population, with FedPop-G handling inter-configuration tuning and FedPop-L handling intra-configuration tuning in a decentralized, asynchronous manner. Empirical results across CIFAR-10, FEMNIST, Shakespeare, real-world cross-silo benchmarks, and full-scale ImageNet-1K demonstrate state-of-the-art performance, robustness to Non-IID data, and compatibility with multiple Fed-Opt methods. The approach reduces tuning overhead by avoiding retraining and expands the HP search space, offering practical benefits for deploying scalable FL systems.
Abstract
Federated Learning (FL) is a distributed machine learning (ML) paradigm in which multiple clients collaboratively train ML models without centralizing their local data. As in conventional ML pipelines, the client local optimization and server aggregation procedures in FL are sensitive to the choice of hyperparameters (HPs). Despite extensive research on tuning HPs for centralized ML, these methods yield suboptimal results when employed in FL. This is mainly because their "training-after-tuning" framework is unsuitable for FL, where client computation power is limited. While some approaches have been proposed for HP tuning in FL, they are limited to the HPs for client local updates. In this work, we propose a novel HP-tuning algorithm, called Federated Population-based Hyperparameter Tuning (FedPop), to address this vital yet challenging problem. FedPop employs population-based evolutionary algorithms to optimize the HPs and accommodates various HP types on both the client and server sides. Compared with prior tuning methods, FedPop employs an online "tuning-while-training" framework, which offers computational efficiency and enables the exploration of a broader HP search space. Our empirical validation on common FL benchmarks and complex real-world FL datasets, including full-sized Non-IID ImageNet-1K, demonstrates the effectiveness of the proposed method, which substantially outperforms the concurrent state-of-the-art HP-tuning methods in FL.
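To make the "tuning-while-training" idea concrete, the following is a minimal sketch of a generic population-based exploit/explore step. It is not FedPop itself: the names `Member` and `evolve`, the HP keys `client_lr` and `server_lr`, the replacement fraction, and the perturbation factors are all illustrative assumptions. Copying model weights from winners to losers, which population-based methods usually also do, is omitted for brevity.

```python
import random
from dataclasses import dataclass


@dataclass
class Member:
    """One population member: an HP configuration and its latest validation score."""
    hps: dict            # hypothetical keys, e.g. {"client_lr": 0.1, "server_lr": 1.0}
    score: float = 0.0


def evolve(population, rng, frac=0.25, perturb=(0.8, 1.25)):
    """Assumed exploit/explore step, run between training rounds.

    The worst `frac` of members each copy the HPs of a randomly chosen
    top member (exploit) and rescale every HP by a random factor drawn
    from `perturb` (explore). Members are updated in place.
    """
    ranked = sorted(population, key=lambda m: m.score, reverse=True)
    k = max(1, int(len(ranked) * frac))
    top, bottom = ranked[:k], ranked[-k:]
    for loser in bottom:
        winner = rng.choice(top)
        loser.hps = {
            name: value * rng.choice(perturb)
            for name, value in winner.hps.items()
        }
    return population
```

In such a scheme, training continues after each call to `evolve`, so no configuration is retrained from scratch; this is the sense in which tuning happens online rather than before training.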
