Heterogeneity-aware Personalized Federated Learning via Adaptive Dual-Agent Reinforcement Learning

Xi Chen, Qin Li, Haibin Cai, Ting Wang

TL;DR

HAPFL addresses the straggler and heterogeneity problems in federated learning for IoT by deploying two PPO-based reinforcement learning agents: one for adaptive heterogeneous model allocation and another for per-client training-intensity adjustment. A LiteModel is introduced to enable mutual knowledge distillation between a universal, homogeneous model and heterogeneous local models, while aggregation weighs updates by information entropy and local accuracy. Theoretical convergence analyses for convex and non-convex settings are provided, and extensive experiments on MNIST, CIFAR-10, and ImageNet-10 show that HAPFL achieves higher accuracy, shorter overall training time, and substantial reductions in straggling latency compared to FedAvg, FedProx, pFedMe, and FedDdrl. The approach offers practical benefits for scalable, privacy-preserving learning in diverse, resource-constrained IoT environments.
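The aggregation idea mentioned above, weighting client updates by information entropy and local accuracy, can be illustrated with a minimal sketch. The summary does not give the exact rule, so everything specific here is an assumption for illustration: entropy is taken over each client's label counts, the score is the product of entropy and accuracy, and the function names `label_entropy`, `aggregation_weights`, and `aggregate` are hypothetical rather than taken from the paper.

```python
import math

def label_entropy(label_counts):
    """Shannon entropy (in nats) of a client's label distribution."""
    total = sum(label_counts)
    probs = [c / total for c in label_counts if c > 0]
    return -sum(p * math.log(p) for p in probs)

def aggregation_weights(entropies, accuracies):
    """Assumed rule: score = entropy * accuracy, normalized to sum to 1."""
    scores = [h * a for h, a in zip(entropies, accuracies)]
    total = sum(scores)
    if total == 0:
        # Fall back to uniform weights when every score is zero.
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]

def aggregate(client_params, weights):
    """Weighted average of flat parameter vectors (one list per client)."""
    dim = len(client_params[0])
    return [
        sum(w * params[i] for w, params in zip(weights, client_params))
        for i in range(dim)
    ]

# Two hypothetical clients: one with balanced labels, one with a single class.
entropies = [label_entropy([5, 5]), label_entropy([10, 0])]
weights = aggregation_weights(entropies, [0.9, 0.8])
global_params = aggregate([[1.0, 2.0], [3.0, 4.0]], weights)
```

Under this assumed rule, a client whose data covers only one class has zero entropy and contributes nothing, while clients with more diverse data and higher accuracy pull the aggregate toward their parameters.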

Abstract

Federated Learning (FL) empowers multiple clients to collaboratively train machine learning models without sharing local data, making it highly applicable in heterogeneous Internet of Things (IoT) environments. However, intrinsic heterogeneity in clients' model architectures and computing capabilities often results in model accuracy loss and the intractable straggler problem, which significantly impairs training effectiveness. To tackle these challenges, this paper proposes a novel Heterogeneity-aware Personalized Federated Learning method, named HAPFL, via multi-level Reinforcement Learning (RL) mechanisms. HAPFL optimizes the training process by incorporating three strategic components: 1) An RL-based heterogeneous model allocation mechanism. The parameter server employs a Proximal Policy Optimization (PPO)-based RL agent to adaptively allocate appropriately sized, differentiated models to clients based on their performance, effectively mitigating performance disparities. 2) An RL-based training intensity adjustment scheme. The parameter server leverages another PPO-based RL agent to dynamically fine-tune the training intensity for each client to further enhance training efficiency and reduce straggling latency. 3) A knowledge distillation-based mutual learning mechanism. Each client deploys both a heterogeneous local model and a homogeneous lightweight model named LiteModel, where these models undergo mutual learning through knowledge distillation. This uniform LiteModel plays a pivotal role in aggregating and sharing global knowledge, significantly enhancing the effectiveness of personalized local training. Experimental results across multiple benchmark datasets demonstrate that HAPFL not only achieves high accuracy but also substantially reduces the overall training time by 20.9%-40.4% and decreases straggling latency by 19.0%-48.0% compared to existing solutions.
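The mutual learning between a heterogeneous local model and the LiteModel described in component 3 can be sketched for a single sample as follows. This is a generic mutual-distillation loss, not the paper's stated objective: the mixing weight `alpha`, the softmax `temperature`, the direction of the KL terms, and the function names are all assumptions made for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label])

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def mutual_losses(local_logits, lite_logits, label, alpha=0.5, temperature=2.0):
    """Each model fits the label and also imitates the other's softened output.

    alpha and temperature are assumed hyperparameters, not values from the paper.
    """
    soft_local = softmax(local_logits, temperature)
    soft_lite = softmax(lite_logits, temperature)
    ce_local = cross_entropy(softmax(local_logits), label)
    ce_lite = cross_entropy(softmax(lite_logits), label)
    # The local model treats the LiteModel as teacher, and vice versa.
    loss_local = (1 - alpha) * ce_local + alpha * kl_divergence(soft_lite, soft_local)
    loss_lite = (1 - alpha) * ce_lite + alpha * kl_divergence(soft_local, soft_lite)
    return loss_local, loss_lite

# Hypothetical logits for one three-class sample whose true label is 0.
loss_local, loss_lite = mutual_losses([2.0, 0.5, -1.0], [1.0, 1.2, 0.1], label=0)
```

When the two models already agree, both KL terms vanish and each loss reduces to its scaled cross-entropy term; the distillation terms only contribute when the models' predictions differ.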

Paper Structure

This paper contains 29 sections, 52 equations, 25 figures, 4 tables, 1 algorithm.

Figures (25)

  • Figure 1: Overview of HAPFL architecture
  • Figure 2: Reward for PPO1
  • Figure 3: Reward for PPO2
  • Figure 4: LiteModel on MNIST
  • Figure 5: LiteModel on MNIST
  • ...and 20 more figures