Federated Learning with Workload Reduction through Partial Training of Client Models and Entropy-Based Data Selection

Hongrui Shi, Valentin Radu, Po Yang

TL;DR

FedFT-EDS, a novel approach that combines Fine-Tuning of partial client models with Entropy-based Data Selection to reduce training workloads on edge devices, underscores the critical role of data selection in Federated Learning and offers a promising direction for achieving scalable and efficient FL systems.

Abstract

With the rapid expansion of edge devices, such as IoT devices, where crucial data needed for machine learning applications is generated, it becomes essential to promote their participation in privacy-preserving Federated Learning (FL) systems. The best way to achieve this goal is to reduce their training workload to match their constrained computational resources. While prior FL research has addressed the workload constraints by introducing lightweight models on the edge, limited attention has been given to optimizing on-device training efficiency by reducing the amount of data needed during training. In this work, we propose FedFT-EDS, a novel approach that combines Fine-Tuning of partial client models with Entropy-based Data Selection to reduce training workloads on edge devices. By actively selecting the most informative local instances for learning, FedFT-EDS significantly reduces the training data used in FL and demonstrates that not all user data is equally beneficial for FL in every round. Our experiments on CIFAR-10 and CIFAR-100 show that FedFT-EDS uses only 50% of user data while improving global model performance compared to the baseline methods, FedAvg and FedProx. Importantly, FedFT-EDS improves client learning efficiency by up to 3 times, using one third of the training time on clients to achieve performance equivalent to the baselines. This work highlights the importance of data selection in FL and presents a promising pathway to scalable and efficient Federated Learning.
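The entropy-based selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `temperature` parameter (a value below 1 sharpens the softmax, used here as a stand-in for the "hardened" softmax), and the choice to keep the highest-entropy samples are all assumptions made for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; temperature < 1 sharpens the distribution
    # (assumed here to play the role of a "hardened" softmax).
    scaled = [v / temperature for v in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    # Shannon entropy in nats; zero-probability terms contribute nothing.
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_by_entropy(logit_rows, fraction=0.5, temperature=1.0):
    # Score every local sample by the entropy of the model's prediction and
    # keep the top `fraction` (assumption: higher entropy = more informative).
    scores = [entropy(softmax(row, temperature)) for row in logit_rows]
    keep = max(1, round(fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep]), scores

# Hypothetical logits for four local samples and three classes.
rows = [[10.0, 0.0, 0.0], [0.0, 0.0, 0.0], [5.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
selected, scores = select_by_entropy(rows, fraction=0.5, temperature=0.5)
print(selected)  # indices of the two least confident samples: [1, 3]
```

In a federated round, a client would run a forward pass over its local data with the received model, call a function like `select_by_entropy` on the resulting logits, and train only on the returned indices.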

Paper Structure

This paper contains 26 sections, 6 equations, 10 figures, 4 tables, 1 algorithm.

Figures (10)

  • Figure 1: The workflow of our proposed FedFT-EDS. A global pretrained model is split into two parts, one frozen feature extractor and one trainable upper part of the model. The entire model is shared with each client, but only the trainable upper part is updated. Clients also select their most valuable training samples by ranking them based on our hardened softmax activation.
  • Figure 2: Heatmaps of the CKA similarity in the scenario of 10 clients and $Diri(0.1)$. A darker entry indicates higher similarity between the pair of models indexed by its coordinates, suggesting that they deviate less from each other on heterogeneous data.
  • Figure 3: Heatmaps of the CKA similarity in the scenario of 10 clients and $Diri(0.5)$.
  • Figure 4: Averaged CKA similarity at different layer levels across the locally trained models.
  • Figure 5: Learning curves of FedFT-EDS and baselines, with global model accuracies computed over test data.
  • ...and 5 more figures
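
The CKA similarity shown in Figures 2 to 4 compares the representations that two models produce for the same inputs. The sketch below computes the linear variant of CKA on small activation matrices (rows are samples, columns are features). The listing above does not state which CKA kernel the paper uses, so the linear form and all function names here are assumptions.

```python
import math

def center_columns(matrix):
    # Subtract each feature's mean so that every column has zero mean.
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[v - mu for v, mu in zip(row, means)] for row in matrix]

def cross_frobenius_sq(a, b):
    # Squared Frobenius norm of a^T b, accumulated column pair by column pair.
    total = 0.0
    for col_a in zip(*a):
        for col_b in zip(*b):
            dot = sum(x * y for x, y in zip(col_a, col_b))
            total += dot * dot
    return total

def linear_cka(x, y):
    # Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F) on centered inputs.
    xc, yc = center_columns(x), center_columns(y)
    numerator = cross_frobenius_sq(xc, yc)
    denominator = math.sqrt(cross_frobenius_sq(xc, xc) * cross_frobenius_sq(yc, yc))
    return numerator / denominator

# Hypothetical activations of two client models on the same four inputs.
acts_a = [[1.0, 2.0], [3.0, 4.0], [5.0, 7.0], [0.0, 1.0]]
acts_b = [[2.0 * v for v in row] for row in acts_a]  # rescaled copy of acts_a
print(linear_cka(acts_a, acts_b))  # close to 1.0: CKA ignores isotropic scaling
```

Computing this score for every pair of locally trained models at a given layer yields a matrix of the kind visualized in the heatmaps.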