Optimization of Federated Learning's Client Selection for Non-IID Data Based on Grey Relational Analysis

Shuaijun Chen, Omid Tavallaie, Michael Henri Hambali, Seid Miad Zandavi, Hamed Haddadi, Nicholas Lane, Song Guo, Albert Y. Zomaya

TL;DR

This work tackles federated learning under extreme data non-IIDness and heterogeneous device resources by introducing FedGRA, a Grey Relational Analysis–based client selection framework. FedGRA jointly considers CPU/RAM availability, local training loss, and weight divergence. It normalizes these metrics, computes Grey Relational Coefficients (GRC), and applies entropy-based weighting to derive a Grey Relational Grade that ranks clients while enforcing fairness. Implemented on AWS with TensorFlow and evaluated on MNIST and FMNIST using 2NN and CNN models, FedGRA consistently achieves faster convergence and lower waiting times than FedAvg and Pow-d, including under severe non-IID conditions. The approach offers a scalable, low-overhead solution for practical FL deployments that require efficient training and fair participation across diverse devices.
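
The scoring pipeline summarized above (normalization, Grey Relational Coefficients, entropy-based weights, Grey Relational Grade) can be sketched in a few lines of NumPy. This is a minimal illustration of textbook GRA, not the authors' implementation. The function names, the distinguishing coefficient `rho = 0.5`, the sample metric values, and the choice to treat lower loss and lower weight divergence as "better" are all assumptions made for the example, and the fairness mechanism is not modelled.

```python
import numpy as np

def normalize(X, benefit):
    """Min-max normalize each metric column to [0, 1], where 1 is best.

    benefit[j] is True if larger values of metric j are better.
    """
    X = np.asarray(X, dtype=float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return np.where(benefit, (X - lo) / span, (hi - X) / span)

def grey_relational_coefficients(Z, rho=0.5):
    """GRC of every entry against the ideal reference sequence (all ones)."""
    delta = np.abs(1.0 - Z)
    d_min, d_max = delta.min(), delta.max()
    return (d_min + rho * d_max) / (delta + rho * d_max)

def entropy_weights(Z, eps=1e-12):
    """Entropy weight method: metrics that vary more get larger weights."""
    n = Z.shape[0]
    P = (Z + eps) / (Z + eps).sum(axis=0)
    entropy = -(P * np.log(P)).sum(axis=0) / np.log(n)
    d = 1.0 - entropy
    if d.sum() <= eps:  # all metrics uninformative: fall back to uniform
        return np.full(Z.shape[1], 1.0 / Z.shape[1])
    return d / d.sum()

def select_clients(X, benefit, k, rho=0.5):
    """Rank clients by Grey Relational Grade and return the top-k indices."""
    Z = normalize(X, benefit)
    grc = grey_relational_coefficients(Z, rho)
    grg = grc @ entropy_weights(Z)  # weighted sum of coefficients per client
    order = np.argsort(-grg, kind="stable")
    return order[:k], grg

# Hypothetical metrics per client: [CPU cores, RAM (GB), loss, weight divergence]
metrics = [[8, 16, 0.2, 0.1],
           [2,  4, 0.9, 0.8],
           [4,  8, 0.5, 0.4]]
benefit = np.array([True, True, False, False])  # assumed metric directions
selected, grades = select_clients(metrics, benefit, k=2)
```

A client that is best on every metric matches the reference sequence exactly, so its grade is 1. Flipping an entry of `benefit` changes which direction of that metric is rewarded.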

Abstract

Federated learning (FL) is a novel distributed learning framework designed for applications with privacy-sensitive data. Without sharing data, FL trains local models on individual devices and constructs the global model on the server by performing model aggregation. However, to reduce the communication cost, the participants in each training round are randomly selected, which significantly decreases the training efficiency under data and device heterogeneity. To address this issue, in this paper, we introduce a novel approach that considers the data distribution and computational resources of devices to select the clients for each training round. Our proposed method performs client selection based on the Grey Relational Analysis (GRA) theory by considering available computational resources for each client, the training loss, and weight divergence. To examine the usability of our proposed method, we implement our contribution on Amazon Web Services (AWS) by using the TensorFlow library of Python. We evaluate our algorithm's performance in different setups by varying the learning rate, network size, the number of selected clients, and the client selection round. The evaluation results show that our proposed algorithm enhances the performance significantly in terms of test accuracy and the average client's waiting time compared to state-of-the-art methods, federated averaging and Pow-d.
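
For readers unfamiliar with the aggregation step mentioned in the abstract, the following sketch shows the standard federated averaging rule, in which the server averages client weights in proportion to each client's number of local samples. It is an illustration under stated assumptions, not the paper's code: the function name `fedavg_aggregate` and the representation of a model as a list of NumPy arrays are hypothetical.

```python
import numpy as np

def fedavg_aggregate(client_weights, num_samples):
    """Sample-weighted average of client models.

    client_weights: one entry per client, each a list of per-layer arrays.
    num_samples:    number of local training samples held by each client.
    """
    total = float(sum(num_samples))
    n_layers = len(client_weights[0])
    return [
        sum(w[layer] * (n / total) for w, n in zip(client_weights, num_samples))
        for layer in range(n_layers)
    ]

# Two hypothetical clients with a single-layer model.
client_a = [np.array([1.0, 1.0])]
client_b = [np.array([3.0, 3.0])]
global_model = fedavg_aggregate([client_a, client_b], num_samples=[1, 3])
```

Client selection decides which clients appear in `client_weights` each round; the aggregation rule itself is unchanged.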

Paper Structure (35 sections, 17 equations, 15 figures, 7 tables, 2 algorithms)

Figures (15)

  • Figure 1: Participation of clients with heterogeneous data and hardware in the training process of 1) traditional centralized training, 2) federated learning, and 3) federated learning with client selection.
  • Figure 2: An optimal selection for 50% of clients in a round of training by considering computational resources and data quality.
  • Figure 3: Training time for different Amazon EC2 instances.
  • Figure 4: The long tail distribution for the number of samples per class.
  • Figure 5: Evaluating FedAvg performance under varied data distribution and the number of selected clients.
  • ...and 10 more figures