FedZero: Leveraging Renewable Excess Energy in Federated Learning
Philipp Wiesner, Ramin Khalili, Dennis Grinwald, Pratik Agrawal, Lauritz Thamsen, Odej Kao
TL;DR
FedZero tackles the high energy footprint of federated learning by restricting training to renewable excess energy and spare compute capacity, aiming for zero operational emissions. It combines forecast-driven client selection with a mechanism for sharing energy among clients in the same power domain, so that training converges quickly while heterogeneous clients participate fairly. In each round, a mixed-integer program selects $n$ clients subject to per-domain energy budgets $r_{e,t}$ (power domain $e$, time step $t$) and per-client minimum and maximum participation $m_c^{\min}$ and $m_c^{\max}$; fairness is enforced via a blocklist. Evaluations on real solar traces and multiple datasets show that FedZero often outperforms baselines in time-to-accuracy and energy-to-accuracy, is robust to forecast errors, and scales to large numbers of clients, indicating practical potential for green FL at scale.
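To make the selection constraints in the TL;DR concrete, the following is a minimal Python sketch and not the authors' implementation. It replaces the mixed-integer program with a simple greedy heuristic and collapses the time dimension of $r_{e,t}$ into a single budget per power domain for one round. The `Client` fields `energy_per_batch` and `utility`, the function name `select_clients`, and the choice to return an empty selection when fewer than $n$ clients can be powered are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Client:
    name: str
    domain: str              # power domain the client draws excess energy from
    m_min: int               # minimum batches required to participate (m_c^min)
    m_max: int               # maximum batches the client may compute (m_c^max)
    energy_per_batch: float  # assumed energy cost per batch, e.g. in Wh
    utility: float           # assumed score for how useful the client is


def select_clients(clients, budgets, n, blocklist=frozenset()):
    """Greedy stand-in for the MIP: choose n clients and batch counts so that
    no power domain exceeds its excess-energy budget for the round."""
    remaining = dict(budgets)  # excess energy left per power domain
    selected = {}              # client name -> number of batches

    # Pass 1: admit clients by descending utility if their domain can
    # cover at least their minimum participation.
    for c in sorted(clients, key=lambda c: c.utility, reverse=True):
        if len(selected) == n:
            break
        if c.name in blocklist:
            continue  # blocked clients are skipped this round (fairness)
        need = c.m_min * c.energy_per_batch
        if need <= remaining.get(c.domain, 0.0):
            remaining[c.domain] -= need
            selected[c.name] = c.m_min

    if len(selected) < n:
        return {}  # assumption: treat the round as infeasible

    # Pass 2: spend leftover domain energy on extra batches, up to m_max.
    by_name = {c.name: c for c in clients}
    for name in selected:
        c = by_name[name]
        affordable = int(remaining[c.domain] // c.energy_per_batch)
        extra = min(c.m_max - selected[name], affordable)
        selected[name] += extra
        remaining[c.domain] -= extra * c.energy_per_batch
    return selected


clients = [
    Client("a", "solar1", m_min=2, m_max=5, energy_per_batch=1.0, utility=3.0),
    Client("b", "solar1", m_min=2, m_max=5, energy_per_batch=1.0, utility=2.0),
    Client("c", "solar2", m_min=1, m_max=4, energy_per_batch=2.0, utility=1.0),
    Client("d", "solar2", m_min=1, m_max=4, energy_per_batch=2.0, utility=5.0),
]
budgets = {"solar1": 5.0, "solar2": 2.0}

print(select_clients(clients, budgets, n=2, blocklist={"d"}))
```

In this toy setup, clients `a` and `b` share the `solar1` budget, so the batches granted to one reduce what is left for the other. A greedy pass gives no optimality guarantee; an exact formulation would hand the same constraints to a MIP solver.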
Abstract
Federated Learning (FL) is an emerging machine learning technique that enables distributed model training across data silos or edge devices without data sharing. Yet, FL inevitably introduces inefficiencies compared to centralized model training, which will further increase the already high energy usage and associated carbon emissions of machine learning in the future. One idea to reduce FL's carbon footprint is to schedule training jobs based on the availability of renewable excess energy that can occur at certain times and places in the grid. However, in the presence of such volatile and unreliable resources, existing FL schedulers cannot always ensure fast, efficient, and fair training. We propose FedZero, an FL system that operates exclusively on renewable excess energy and spare capacity of compute infrastructure to effectively reduce a training's operational carbon emissions to zero. Using energy and load forecasts, FedZero leverages the spatio-temporal availability of excess resources by selecting clients for fast convergence and fair participation. Our evaluation, based on real solar and load traces, shows that FedZero converges significantly faster than existing approaches under the mentioned constraints while consuming less energy. Furthermore, it is robust to forecasting errors and scalable to tens of thousands of clients.
