Local Performance vs. Out-of-Distribution Generalization: An Empirical Analysis of Personalized Federated Learning in Heterogeneous Data Environments

Mortesa Hussaini, Jan Theiß, Anthony Stein

TL;DR

This work tackles the trade-off between local performance and out-of-distribution generalization in heterogeneous Federated Learning. It introduces Federated Learning with Individualized Updates (FLIU), a lightweight extension of FedAvg that optimizes a dual objective and uses adaptive personalized updates $\theta_k^{t+1} = \gamma_k \theta_k^t + (1-\gamma_k) \Theta^t$ to balance global and local knowledge. The authors formalize the problem with a two-level objective and evaluate FLIU against FedAvg and other PFL methods on MNIST and CIFAR-10 under IID, pathological non-IID, and Dirichlet-based LS/QS/LSQS data environments, revealing a consistent local-generalization trade-off and the value of adaptive personalization. They find that FLIU often achieves strong performance across metrics and environments, outperforming certain baselines in extreme heterogeneity while maintaining robust generalization, with suggestions for future refinement of the personalization factor and hierarchical extensions. The study underlines the importance of evaluating both local performance and generalization to unseen distributions in FL, informing practical deployment in privacy-constrained, heterogeneous settings.
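The update rule above is a per-client convex combination of the client's own parameters $\theta_k^t$ and the global parameters $\Theta^t$. The following is a minimal sketch of that single step, not the authors' implementation. It assumes that $\Theta^t$ is a sample-size-weighted FedAvg average and that $\gamma_k \in [0, 1]$ is already given; how FLIU adapts $\gamma_k$ is not specified in this summary, so it is passed in as a plain argument. The names `Params`, `fedavg`, and `individualized_update` are hypothetical.

```python
from typing import Dict, List

# Hypothetical parameter container: layer name -> flat list of weights.
Params = Dict[str, List[float]]


def fedavg(client_params: List[Params], num_samples: List[int]) -> Params:
    """Sample-size-weighted average of client parameters (assumed form of Theta^t)."""
    total = float(sum(num_samples))
    first = client_params[0]
    return {
        name: [
            sum(n * p[name][i] for p, n in zip(client_params, num_samples)) / total
            for i in range(len(first[name]))
        ]
        for name in first
    }


def individualized_update(local: Params, global_: Params, gamma: float) -> Params:
    """theta_k^{t+1} = gamma_k * theta_k^t + (1 - gamma_k) * Theta^t."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    return {
        name: [gamma * l + (1.0 - gamma) * g for l, g in zip(local[name], global_[name])]
        for name in local
    }


# Toy example with two clients and a single two-weight "layer".
clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 6.0]}]
theta_global = fedavg(clients, num_samples=[1, 3])
theta_0_next = individualized_update(clients[0], theta_global, gamma=0.5)
```

With `gamma=0.0` the client simply adopts the global model, as in plain FedAvg, and with `gamma=1.0` it keeps its local model unchanged; intermediate values interpolate between the two.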

Abstract

In the context of Federated Learning with heterogeneous data environments, local models tend to converge to their own local model optima during local training steps, deviating from the overall data distributions. Aggregation of these local updates, e.g., with FedAvg, often does not align with the global model optimum (client drift), resulting in an update that is suboptimal for most clients. Personalized Federated Learning approaches address this challenge by exclusively focusing on the average local performances of clients' models on their own data distribution. Generalization to out-of-distribution samples, which is a substantial benefit of FedAvg and represents a significant component of robustness, appears to be inadequately incorporated into the assessment and evaluation processes. This study involves a thorough evaluation of Federated Learning approaches, encompassing both their local performance and their generalization capabilities. To this end, we examine different stages within a single communication round to enable a more nuanced understanding of the considered metrics. Furthermore, we propose and incorporate a modified variant of FedAvg, designated as Federated Learning with Individualized Updates (FLIU), which extends the algorithm by a straightforward individualization step with an adaptive personalization factor. We evaluate and compare the approaches empirically using MNIST and CIFAR-10 under various distributional conditions, including benchmark IID and pathological non-IID, as well as additional novel test environments based on the Dirichlet distribution, specifically developed to stress the algorithms on complex data heterogeneity.
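To make the Dirichlet-based test environments more concrete, the sketch below shows one common way to produce a label allocation skew: for each class, a share vector over the clients is drawn from a symmetric Dirichlet distribution and the samples of that class are split accordingly. This is an illustrative assumption, not the paper's exact procedure, which is not described in this summary. The function names and the `seed` parameter are hypothetical, and only the Python standard library is used.

```python
import random
from typing import Dict, List


def sample_dirichlet(alpha: List[float], rng: random.Random) -> List[float]:
    """Draw one Dirichlet sample by normalizing independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def label_skew_partition(
    labels: List[int], num_clients: int, alpha: float, seed: int = 0
) -> List[List[int]]:
    """Assign sample indices to clients with Dirichlet-distributed class shares."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class: Dict[int, List[int]] = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    clients: List[List[int]] = [[] for _ in range(num_clients)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        shares = sample_dirichlet([alpha] * num_clients, rng)

        # Turn the shares into cumulative cut points over this class's samples.
        start, cumulative = 0, 0.0
        for k in range(num_clients):
            cumulative += shares[k]
            if k == num_clients - 1:
                end = len(idxs)  # last client takes the remainder
            else:
                end = min(len(idxs), max(start, round(cumulative * len(idxs))))
            clients[k].extend(idxs[start:end])
            start = end
    return clients


# Toy example: 1000 samples, 10 balanced classes, K = 10 clients, alpha = 1.
toy_labels = [i % 10 for i in range(1000)]
parts = label_skew_partition(toy_labels, num_clients=10, alpha=1.0, seed=0)
```

Smaller values of `alpha` concentrate each class on fewer clients, while larger values approach an even split. A quantity skew could be sketched analogously by drawing a single share vector over the clients for the whole dataset instead of one per class.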

Paper Structure

This paper contains 16 sections, 4 equations, 8 figures, 3 tables, 1 algorithm.

Figures (8)

  • Figure 1: Schematic illustration of FLIU
  • Figure 2: Example of a Dirichlet distributed ($\alpha = \mathbf{1}$) allocation skew of labels to $K=10$ clients from CIFAR-10.
  • Figure 3: Example of a Dirichlet distributed ($\alpha = \mathbf{1}$) allocation skew of number of training samples to $K=10$ clients from CIFAR-10.
  • Figure 4: Example of a Dirichlet distributed ($\alpha = \mathbf{1}$) allocation skew of number of training samples and labels to $K=10$ clients from CIFAR-10.
  • Figure 5: Comparison of the trajectories of $\mathbf{Acc}\left(L\right)$ at stage $L_1$ of CLT, FedAvg and FLIU with adaptive $\gamma_k$ and several fixed $\gamma$s on MNIST with PATH for $T=100$ rounds.
  • ...and 3 more figures