BIPPO: Budget-Aware Independent PPO for Energy-Efficient Federated Learning Services
Anna Lackinger, Andrea Morichetta, Pantelis A. Frangoudis, Schahram Dustdar
TL;DR
Budget-aware IPPO (BIPPO) addresses energy-constrained Federated Learning in IoT systems by training a client-selection policy under a fixed energy budget. It uses Independent PPO in a cooperative multi-agent RL (MARL) setting, with an improved sampler that respects the budget and reduces the energy spent on RL training. On Fashion-MNIST and CIFAR-10, BIPPO achieves higher mean accuracy and greater stability than PPO, IPPO, FPPO, and heuristic baselines, while its own energy use remains a negligible share of the budget and largely independent of the number of participating clients. The approach scales to dozens of devices, remains robust under client churn, and is complemented by transfer learning and practical considerations for real-world deployment.
Abstract
Federated Learning (FL) is a promising machine learning solution in large-scale IoT systems, guaranteeing load distribution and privacy. However, FL does not natively consider infrastructure efficiency, a critical concern for systems operating in resource-constrained environments. Several Reinforcement Learning (RL) based solutions offer improved client selection for FL, but they do not consider infrastructure challenges such as resource limitations and device churn. Furthermore, the training of RL methods is often not designed for practical application: these approaches frequently ignore generalizability and are not optimized for energy efficiency. To fill this gap, we propose BIPPO (Budget-aware Independent Proximal Policy Optimization), an energy-efficient multi-agent RL solution for client selection that improves FL accuracy. We evaluate BIPPO on two image classification tasks in a highly budget-constrained setting, with FL clients training on non-IID data, a challenging context for vanilla FL. The improved sampler of BIPPO enables it to increase the mean accuracy compared to non-RL mechanisms, traditional PPO, and IPPO. In addition, BIPPO consumes only a negligible proportion of the budget, and this proportion stays consistent even as the number of clients increases. Overall, BIPPO delivers a performant, stable, scalable, and sustainable solution for client selection in IoT-FL.
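To make the idea of budget-respecting client selection concrete, the following is a minimal sketch of one way a sampler could keep a round within a fixed energy budget. It is an illustration under stated assumptions, not BIPPO's actual sampler, which the abstract does not specify. The function name `sample_within_budget`, the per-client selection weights `probs` (standing in for a learned policy's output), and the per-client energy estimates `costs` are all hypothetical.

```python
import random


def sample_within_budget(probs, costs, budget, rng=None):
    """Pick clients one at a time, weighted by `probs`, without replacement.

    Hypothetical illustration only: a client is eligible for a draw only if
    its estimated energy cost still fits into the remaining budget, so the
    total cost of the selection can never exceed `budget`.
    """
    if rng is None:
        rng = random.Random()
    remaining = budget
    candidates = set(range(len(probs)))
    selected = []
    while True:
        # Keep only clients that are still affordable and have nonzero weight.
        affordable = [
            i for i in sorted(candidates)
            if costs[i] <= remaining and probs[i] > 0
        ]
        if not affordable:
            break
        weights = [probs[i] for i in affordable]
        choice = rng.choices(affordable, weights=weights, k=1)[0]
        selected.append(choice)
        remaining -= costs[choice]
        candidates.remove(choice)
    return selected, remaining


if __name__ == "__main__":
    # Assumed toy values: four clients, one of which is too expensive to ever fit.
    probs = [0.4, 0.3, 0.2, 0.1]
    costs = [5.0, 3.0, 4.0, 20.0]
    chosen, left = sample_within_budget(probs, costs, budget=10.0,
                                        rng=random.Random(0))
    print("selected clients:", chosen, "remaining budget:", left)
```

This sketch always spends as much of the budget as the affordable clients allow. A learned policy could instead decide to stop early, and how BIPPO handles that trade-off is not stated in this section.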
