CAFE: Carbon-Aware Federated Learning in Geographically Distributed Data Centers
Jieming Bian, Lei Wang, Shaolei Ren, Jie Xu
TL;DR
This work tackles the carbon footprint challenge of training large AI models across geo-distributed data centers by formulating Carbon-Aware Federated Learning (CAFE). It combines coreset-based learning utility, a Lyapunov drift-plus-penalty online optimization, and submodular maximization to select data centers under a fixed carbon budget. The framework provides theoretical guarantees and practical algorithms (deterministic and randomized double greedy) to balance learning performance with emissions, demonstrated via simulations on real carbon-intensity data and CIFAR tasks. The results show CAFE can outperform baselines in learning accuracy while respecting environmental constraints, offering a scalable approach for green AI in distributed infrastructures.
Abstract
Training large-scale artificial intelligence (AI) models demands significant computational power and energy, leading to a larger carbon footprint with potential environmental repercussions. This paper studies the challenges of training AI models across geographically distributed (geo-distributed) data centers, with a focus on the trade-off between learning performance and carbon footprint. We consider Federated Learning (FL) as a solution: it exchanges model parameters rather than raw data, which supports data privacy and compliance with local regulations. Because carbon intensity varies across regions and over time, we propose a new framework called CAFE (short for Carbon-Aware Federated Learning) to optimize training within a fixed carbon footprint budget. Our approach uses coreset selection to assess learning performance, employs the Lyapunov drift-plus-penalty framework to handle the unpredictability of future carbon intensity, and devises an efficient algorithm to handle the combinatorial complexity of data center selection. Through extensive simulations using real-world carbon intensity data, we demonstrate the efficacy of our algorithm and show that it outperforms existing methods in learning performance while keeping emissions within the carbon budget.
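To make the interplay of the components above concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a virtual queue `Q` that accumulates carbon spent beyond a per-round budget, a trade-off weight `V`, and a square-root-of-data-volume utility standing in for the coreset-based learning utility. The names `double_greedy` and `run_cafe_sketch`, the per-round budget split, and all units are hypothetical. In each round the sketch selects data centers by running a deterministic double greedy on the drift-plus-penalty objective `V * utility(S) - Q * carbon(S)`.

```python
import math


def double_greedy(items, f):
    """Deterministic double greedy for unconstrained submodular maximization.

    X grows from the empty set and Y shrinks from the full set; each item is
    either added to X or removed from Y, so X == Y once all items are seen.
    """
    items = list(items)
    X = set()
    Y = set(items)
    for i in items:
        gain_add = f(X | {i}) - f(X)      # marginal gain of adding i to X
        gain_remove = f(Y - {i}) - f(Y)   # marginal gain of dropping i from Y
        if gain_add >= gain_remove:
            X.add(i)
        else:
            Y.discard(i)
    return X


def run_cafe_sketch(intensity, energy, data_sizes, budget_per_round, V):
    """Select data centers round by round under a long-term carbon budget.

    intensity[t][i]: carbon intensity of data center i in round t (assumed units)
    energy[i]:       energy used by data center i per round (assumed constant)
    data_sizes[i]:   local data volume, used by the stand-in utility
    """
    n = len(data_sizes)
    Q = 0.0  # virtual queue: accumulated carbon spent beyond the budget
    history = []
    for c_t in intensity:
        def utility(S):
            # Stand-in submodular utility (assumption): concave in data volume.
            return math.sqrt(sum(data_sizes[i] for i in S))

        def carbon(S):
            return sum(c_t[i] * energy[i] for i in S)

        def objective(S):
            # Drift-plus-penalty trade-off for the current round.
            return V * utility(S) - Q * carbon(S)

        S = double_greedy(range(n), objective)
        spent = carbon(S)
        Q = max(Q + spent - budget_per_round, 0.0)  # queue update
        history.append((sorted(S), spent, Q))
    return history


if __name__ == "__main__":
    rounds = run_cafe_sketch(
        intensity=[[1.0, 1.0], [1.0, 1.0]],
        energy=[1.0, 1.0],
        data_sizes=[4.0, 1.0],
        budget_per_round=1.0,
        V=1.0,
    )
    for selected, spent, queue in rounds:
        print(selected, spent, queue)
```

In this toy run both data centers are selected while the queue is empty; once the queue has grown, the smaller data center is dropped because its marginal utility no longer covers its carbon cost. A larger `V` favors learning utility, a smaller `V` enforces the budget more tightly. The classical approximation guarantees for double greedy assume a nonnegative submodular objective, which the penalized objective here need not satisfy, so the sketch illustrates the mechanics only.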
