Noise-aware Client Selection for carbon-efficient Federated Learning via Gradient Norm Thresholding

Patrick Wilhelm, Inese Yilmaz, Odej Kao

TL;DR

This paper introduces a modular approach that sits on top of state-of-the-art client selection strategies for carbon-efficient Federated Learning. It improves robustness by filtering out clients with noisy data, which improves both model performance and sustainability when data quality is unknown.

Abstract

Training large-scale Neural Networks requires substantial computational power and energy. Federated Learning enables model training across geographically distributed data centers, which can draw on renewable energy sources to reduce the carbon footprint of AI training. Various client selection strategies have been developed to reconcile the volatility of renewable energy with stable and fair model training in a federated system. However, because Federated Learning is privacy-preserving, the quality of the data on client devices remains unknown, which poses challenges for effective model training. In this paper, we introduce a modular approach that sits on top of state-of-the-art client selection strategies for carbon-efficient Federated Learning. Our method improves robustness by filtering out clients with noisy data, improving both model performance and sustainability in scenarios with unknown data quality. Additionally, we explore the impact of carbon budgets on model convergence, balancing efficiency and sustainability. Through extensive evaluations, we demonstrate that modern client selection strategies based on local client loss tend to select clients with noisy data, ultimately degrading model performance. To address this, we propose a gradient norm thresholding mechanism that uses probing rounds for more effective client selection and noise detection, contributing to the practical deployment of carbon-efficient Federated Learning.
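The idea of gradient norm thresholding with probing rounds can be illustrated with a minimal sketch. The abstract does not state the exact threshold rule, so everything below is an assumption made for illustration: clients report one gradient norm per probing round, a client is flagged as noisy when its mean norm exceeds the median plus `k` times the median absolute deviation, and the function names `flag_noisy_clients` and `select_clients` are hypothetical.

```python
from statistics import median


def flag_noisy_clients(probe_norms, k=3.0):
    """Flag clients whose mean probing-round gradient norm is unusually high.

    probe_norms maps client id -> list of gradient norms, one per probing round.
    ASSUMPTION: the rule median + k * MAD is illustrative, not the paper's rule.
    """
    mean_norm = {cid: sum(v) / len(v) for cid, v in probe_norms.items()}
    med = median(mean_norm.values())
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median(abs(x - med) for x in mean_norm.values())
    threshold = med + k * mad
    return {cid for cid, x in mean_norm.items() if x > threshold}


def select_clients(utility, noisy, m):
    """Pick the top-m clients by utility score, skipping flagged clients.

    ASSUMPTION: utility stands in for whatever score the underlying
    selection strategy uses (for example local loss).
    """
    eligible = [cid for cid in utility if cid not in noisy]
    return sorted(eligible, key=lambda cid: utility[cid], reverse=True)[:m]


# Hypothetical probing data: client 3 has far larger gradient norms.
probe_norms = {0: [1.0, 1.2], 1: [0.9, 1.1], 2: [1.1, 1.1], 3: [5.0, 6.0]}
utility = {0: 0.5, 1: 0.2, 2: 0.4, 3: 0.9}

noisy = flag_noisy_clients(probe_norms)
chosen = select_clients(utility, noisy, m=2)
```

In this toy data, client 3 has the highest utility score, so a purely loss-based strategy would pick it first; the filter removes it before ranking, which is the modular "on top of" behaviour the abstract describes.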

Paper Structure (10 sections, 1 equation, 8 figures)


Figures (8)

  • Figure 1: System Overview: Clients represent data centers in 30 regions within the US. Because the data quality is unknown, we introduce probing rounds at the beginning of federated training to identify clients with noisy data. By integrating carbon budgets, we can balance clients' data utility against their current carbon intensity.
  • Figure 2: Assigned carbon intensity values for clients over 100 training rounds. Each round corresponds to one hour, spanning from 2023-01-15 00:00:00 to 2023-01-19 04:00:00 (UTC). The bottom plot highlights daytime intensity patterns for three clients from different regions.
  • Figure 3: Client selection counts. 6 of the 30 clients contain noisy data.
  • Figure 4: Convergence curves under the noisy-data scenario. Baseline methods are compared with their thresholded variants. Thresholding reduces the impact of corrupted clients, resulting in faster and more stable convergence.
  • Figure 5: Accuracy–emissions comparison across methods. Bars show the carbon emitted during training up to the point of maximum accuracy, while the line shows the corresponding maximum accuracy.
  • ...and 3 more figures
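
The round-to-time mapping in the Figure 2 caption can be checked with a short sketch. It assumes rounds are zero-indexed and that the end timestamp in the caption marks the end of the last round; the helper name `round_window` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Start of the carbon-intensity trace, taken from the Figure 2 caption.
START = datetime(2023, 1, 15, 0, 0, 0, tzinfo=timezone.utc)
NUM_ROUNDS = 100


def round_window(r):
    """Return the (start, end) UTC timestamps covered by round r.

    ASSUMPTION: rounds are zero-indexed and each lasts exactly one hour.
    """
    begin = START + timedelta(hours=r)
    return begin, begin + timedelta(hours=1)


first_start, _ = round_window(0)
_, last_end = round_window(NUM_ROUNDS - 1)
```

Under these assumptions, 100 one-hour rounds cover 4 days and 4 hours, so the last round ends at 2023-01-19 04:00:00 UTC, matching the caption.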