
Flow: Per-Instance Personalized Federated Learning Through Dynamic Routing

Kunjal Panchal, Sunav Choudhary, Nisarg Parikh, Lijun Zhang, Hui Guan

TL;DR

Flow addresses non-IID heterogeneity in federated learning by routing each input instance between a global model and a client-specific local model. For each client, it builds a dynamic personalized model $w_p$ using a routing module $\psi_g$ that decides, per input, whether to use the global parameters $w_g$ or the local parameters $w_\ell$. Each client's data is split into two subsets, $\zeta_{m,\ell}$ and $\zeta_{m,g}$, and the server aggregates with FedAvg. The paper gives a convergence analysis for both the global and personalized models and reports, in cross-domain experiments on language and vision tasks, that Flow improves both generalized and personalized accuracy while remaining scalable and friendly to new clients. These results suggest that per-instance dynamic routing can improve personalization in large-scale, cross-device FL and is practical to deploy.
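As a rough illustration of the FedAvg-style server step mentioned above, the sketch below averages client parameter vectors weighted by each client's example count, which is the standard FedAvg rule. The function name `fedavg`, the flat-list parameter layout, and the toy numbers are assumptions made for illustration; this summary does not state which of Flow's parameters are aggregated or how they are weighted.

```python
def fedavg(client_params, client_sizes):
    """Weighted average of client parameter vectors (standard FedAvg rule).

    client_params: list of equally long lists of floats, one per client.
    client_sizes:  number of training examples held by each client.
    Both the layout and the weighting are illustrative assumptions.
    """
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(n * params[i] for params, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]


# Toy round: two hypothetical clients holding 1 and 3 examples.
aggregated = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(aggregated)
```

The client with more examples pulls the average toward its own parameters, which is why the result sits closer to the second client's vector.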

Abstract

Personalization in Federated Learning (FL) aims to modify a collaboratively trained global model according to each client. Current approaches to personalization in FL are at a coarse granularity, i.e., all the input instances of a client use the same personalized model. This ignores the fact that some instances are more accurately handled by the global model due to better generalizability. To address this challenge, this work proposes Flow, a fine-grained stateless personalized FL approach. Flow creates dynamic personalized models by learning a routing mechanism that determines whether an input instance prefers the local parameters or their global counterparts. Thus, Flow introduces per-instance routing in addition to leveraging per-client personalization to improve accuracies at each client. Further, Flow is stateless, which makes it unnecessary for a client to retain its personalized state across FL rounds. This makes Flow practical for large-scale FL settings and friendly to newly joined clients. Evaluations on Stackoverflow, Reddit, and EMNIST datasets demonstrate the superiority in prediction accuracy of Flow over state-of-the-art non-personalized and only per-client personalized approaches to FL.
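The per-instance routing idea can be sketched in a few lines. The example below is a minimal illustration, not the authors' implementation: it assumes a single dense layer, a logistic router, and a hard 0.5 threshold, and the names `route_instance`, `GLOBAL_LAYER`, `LOCAL_LAYER`, and `ROUTER` are hypothetical. The actual routing module in Flow may differ, for example by routing at every layer or by mixing outputs softly.

```python
import math


def linear(weights, bias, x):
    """Apply one dense layer to a single input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def route_instance(x, global_layer, local_layer, router, threshold=0.5):
    """Choose local or global parameters for one input instance.

    router is a (weights, bias) pair for a logistic scorer; a score above
    `threshold` selects the local layer. The hard threshold is an assumption.
    """
    r_w, r_b = router
    p_local = sigmoid(sum(w * xi for w, xi in zip(r_w, x)) + r_b)
    use_local = p_local > threshold
    weights, bias = local_layer if use_local else global_layer
    return linear(weights, bias, x), use_local, p_local


# Toy parameters (all values hypothetical).
GLOBAL_LAYER = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])  # stands in for w_g
LOCAL_LAYER = ([[2.0, 0.0], [0.0, 2.0]], [0.0, 0.0])   # stands in for w_l
ROUTER = ([1.0, -1.0], 0.0)                            # stands in for psi_g

for x in ([3.0, 1.0], [1.0, 3.0]):
    out, use_local, p_local = route_instance(x, GLOBAL_LAYER, LOCAL_LAYER, ROUTER)
    print(x, "local" if use_local else "global", out)
```

With these toy values the first input is routed to the local layer and the second to the global layer, so two instances from the same client end up using different parameters.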

Paper Structure (43 sections, 17 theorems, 119 equations, 14 figures, 8 tables, 1 algorithm)


Key Result

Theorem 4.1

If each client's objective function $f_m$ (and hence the global objective function $F$) satisfies $\beta$-smoothness, $\sigma_\ell$-bounded local gradient variance, $(G,B)$-dissimilarity assumptions, using the learning rate $\frac{1}{2\beta} \leq \eta_\ell \leq \frac{1}{2 \sqrt{5} \beta B K^2 \sqrt{

Figures (14)

  • Figure 2: $w_{g}$ and $w_{p}$ accuracies for Stackoverflow.
  • Figure 3: Behavior of the routing policy from $\psi_{g}$ for all instances at each layer for Stackoverflow.
  • Figure 4: Ablation studies on Stackoverflow dataset.
  • Figure 5: Learning curves on Generalized Accuracy Metric of Flow and its baselines.
  • Figure 6: Learning curves on Personalized Accuracy Metric of Flow and its baselines.
  • ...and 9 more figures

Theorems & Definitions (33)

  • Theorem 4.1: Convergence of the Global Model
  • Theorem 4.2: Convergence of the Personalized Model
  • Definition E.5: Gradient Diversity
  • Lemma E.6: Local model progress
  • proof
  • Lemma E.7: Local version of the global model progress
  • proof
  • Lemma E.8: Deviation of the personalized model from the global model
  • proof
  • Theorem E.9: Convergence of the Global Model for Convex Cases
  • ...and 23 more