Personalized Multi-tier Federated Learning

Sourasekhar Banerjee, Ali Dadras, Alp Yurtsever, Monowar Bhuyan

TL;DR

This work addresses data heterogeneity in federated learning by introducing PerMFL, a personalized multi-tier framework that learns a global model, per-team models, and per-device models within a hierarchical cloud-edge structure. The method formulates a three-level optimization with squared Euclidean penalties linking device, team, and global models, and solves it via device-, team-, and server-level updates that exploit Moreau envelopes to balance personalization with collaboration. Theoretical guarantees cover both smooth strongly convex and smooth non-convex losses, yielding linear convergence under suitable conditions and sublinear convergence to first-order stationary points, with explicit parameter bounds. Empirically, PerMFL demonstrates robust performance and fast convergence across MNIST, FMNIST, EMNIST, FEMNIST, CIFAR100, and synthetic non-IID data, often outperforming state-of-the-art FL methods and showing resilience to various team formations and participation scenarios, while reducing global-communication costs through intra-team updates.
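To make the Moreau-envelope idea concrete, here is a minimal sketch on a toy one-dimensional quadratic loss. It is an illustration under stated assumptions, not the paper's algorithm: the scaling convention $\min_\theta \{ f(\theta) + \frac{\lambda}{2}(\theta - w)^2 \}$ is the one commonly used in personalized FL and may differ from the paper's Definition 4, and the names `prox_point`, `moreau_envelope`, `a`, `c`, `lam`, and `w` are hypothetical.

```python
# Sketch only: Moreau envelope of a toy loss, assuming the convention
#   M(w) = min_theta f(theta) + (lam / 2) * (theta - w)**2.
# All names and constants here are illustrative, not taken from the paper.

def prox_point(grad_f, w, lam, lr=0.1, steps=500):
    """Approximate argmin_theta f(theta) + (lam/2)*(theta - w)**2 by gradient descent."""
    theta = w
    for _ in range(steps):
        # Gradient of the penalized objective: f'(theta) + lam * (theta - w)
        theta -= lr * (grad_f(theta) + lam * (theta - w))
    return theta

def moreau_envelope(f, grad_f, w, lam):
    """Return the envelope value, its gradient, and the personalized point theta."""
    theta = prox_point(grad_f, w, lam)
    value = f(theta) + 0.5 * lam * (theta - w) ** 2
    grad = lam * (w - theta)  # standard identity for the envelope gradient
    return value, grad, theta

# Toy "device loss" f(theta) = (a/2) * (theta - c)**2 (hypothetical choice).
a, c, lam, w = 2.0, 3.0, 1.5, 0.0
f = lambda t: 0.5 * a * (t - c) ** 2
grad_f = lambda t: a * (t - c)

value, grad, theta = moreau_envelope(f, grad_f, w, lam)

# Closed forms for the quadratic case, used as a sanity check.
theta_star = (a * c + lam * w) / (a + lam)
value_star = 0.5 * (a * lam / (a + lam)) * (w - c) ** 2
grad_star = (a * lam / (a + lam)) * (w - c)
```

In this toy case the personalized point `theta` lies between the reference model `w` and the local minimizer `c`. A larger `lam` pulls it toward `w` (more collaboration), and a smaller `lam` lets it move toward `c` (more personalization).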

Abstract

The key challenge of personalized federated learning (PerFL) is to capture the statistical heterogeneity of data with inexpensive communication while achieving customized performance for participating devices. To address this, we introduce personalized federated learning in a multi-tier architecture (PerMFL), which obtains optimized and personalized local models when there are known team structures across devices. We provide theoretical guarantees for PerMFL: linear convergence rates for smooth strongly convex problems and sublinear convergence rates for smooth non-convex problems. Numerical experiments demonstrate the robust empirical performance of PerMFL, which outperforms the state of the art in multiple personalized federated learning tasks.

Paper Structure (63 sections, 12 theorems, 78 equations, 63 figures, 3 tables, 1 algorithm)


Key Result

Theorem 1

Consider the minimization problem $\min_x \phi(x)$, where $\phi(x)$ is defined in the paper's equation defining $\phi$, with $L_f$-smooth and $\mu_f$-strongly convex loss functions $f_{i,j}(x)$. For sufficiently large numbers of inner iterations, of orders $L=\Omega(K)$ and $K=\Omega(T)$ (see the supplementary copy for t…), where the learning rates should satisfy $\beta \leq \frac{\mu_{\tilde{F}}}{4\gamma}$, $\eta_i \leq$ …

Figures (63)

  • Figure 1: Federated learning work-flow
  • Figure 2: Convergence of PerMFL with multi-tier SOTA in strongly convex and non-convex settings on FMNIST
  • Figure 3: Effect of $\beta$, $\gamma$, and $\lambda$ on convergence of PerMFL (PM) in non-convex (CNN) and strongly convex (MCLR) settings using the MNIST dataset
  • Figure 4: Ablation study on team and client participation on the MNIST dataset in the convex setting (MCLR): (a) full participation of teams and devices, (b) full participation of 5 teams with partial participation of devices, (c) partial participation of teams with full participation of devices, and (d) partial participation of teams (2%) with partial participation of clients
  • Figure 5: Effect of $\beta$ on convergence of PerMFL in non-convex settings (CNN) using MNIST dataset
  • ...and 58 more figures

Theorems & Definitions (28)

  • Remark 1
  • Theorem 1: Strongly convex
  • Theorem 2: Non-convex
  • Remark 2
  • Definition 1: Strong convexity
  • Definition 2: Smoothness
  • Definition 3: Expected Smoothness
  • Definition 4: Moreau Envelope
  • Proposition 1
  • Proposition 2
  • ...and 18 more