Personalized Multi-tier Federated Learning

Sourasekhar Banerjee; Ali Dadras; Alp Yurtsever; Monowar Bhuyan

TL;DR

This work addresses data heterogeneity in federated learning by introducing PerMFL, a personalized multi-tier framework that learns a global model, per-team models, and per-device models within a hierarchical cloud-edge structure. The method formulates a three-level optimization with squared Euclidean penalties linking device, team, and global models, and solves it via device-, team-, and server-level updates that exploit Moreau envelopes to balance personalization with collaboration. Theoretical guarantees cover both smooth strongly convex and smooth non-convex losses, yielding linear convergence under suitable conditions and sublinear convergence to first-order stationary points, with explicit parameter bounds. Empirically, PerMFL demonstrates robust performance and fast convergence across MNIST, FMNIST, EMNIST, FEMNIST, CIFAR100, and synthetic non-IID data, often outperforming state-of-the-art FL methods and showing resilience to various team formations and participation scenarios, while reducing global-communication costs through intra-team updates.
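To make the three-level structure concrete, the following is a minimal sketch of a device/team/global scheme coupled by squared Euclidean penalties. It is not the PerMFL algorithm itself: the summary above does not give the update rules, so the alternating gradient steps, the toy quadratic losses, and every name here (`three_tier_sketch`, `lam`, `gamma`, `lr`, `rounds`, `local_steps`) are assumptions chosen for illustration.

```python
import numpy as np

def three_tier_sketch(targets, lam=1.0, gamma=1.0, lr=0.1, rounds=500, local_steps=5):
    """Toy three-tier personalization with quadratic coupling (illustrative only).

    targets has shape (teams, devices, dim). Each device loss is the
    hypothetical quadratic f_ij(theta) = 0.5 * ||theta - targets[i, j]||^2.
    """
    n_teams, n_devices, dim = targets.shape
    x = np.zeros(dim)                             # global model
    w = np.zeros((n_teams, dim))                  # one model per team
    theta = np.zeros((n_teams, n_devices, dim))   # one model per device

    for _ in range(rounds):
        for i in range(n_teams):
            for _ in range(local_steps):
                # Device level: step on f_ij(theta) + lam/2 * ||theta - w_i||^2.
                grad_theta = (theta[i] - targets[i]) + lam * (theta[i] - w[i])
                theta[i] -= lr * grad_theta
                # Team level: pull w_i toward its devices and toward the global model.
                grad_w = lam * (w[i] - theta[i].mean(axis=0)) + gamma * (w[i] - x)
                w[i] -= lr * grad_w
        # Server level: one step per round, pulling x toward the team average.
        x -= lr * gamma * (x - w.mean(axis=0))
    return x, w, theta
```

In this toy setting the intra-team loop runs several times per global step, which mirrors the idea that most communication stays inside a team; the device models end up between their own optimum and the team model, and the team models between their devices and the global model.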

Abstract

The key challenge of personalized federated learning (PerFL) is to capture the statistical heterogeneity of data with inexpensive communication while achieving customized performance for participating devices. To address this, we introduce personalized federated learning in a multi-tier architecture (PerMFL), which obtains optimized and personalized local models when there are known team structures across devices. We provide theoretical guarantees for PerMFL: linear convergence rates for smooth strongly convex problems and sub-linear convergence rates for smooth non-convex problems. Numerical experiments demonstrate the robust empirical performance of PerMFL, which outperforms the state of the art in multiple personalized federated learning tasks.

Paper Structure (63 sections, 12 theorems, 78 equations, 63 figures, 3 tables, 1 algorithm)

Introduction
Related Work
Motivation.
PerMFL
Team formation
PerMFL formulation
Device-level updates.
Team-level updates.
Server-level updates.
Synthesis.
Convergence guarantees
Experiments
Results and Analysis
Performance:
Convergence:
...and 48 more sections

Key Result

Theorem 1

Consider the minimization problem $\min_x \phi(x)$, where $\phi(x)$ is defined in (eq: definition of phi) with $L_f$-smooth and $\mu_f$-strongly convex loss functions $f_{i,j}(x)$. For large enough numbers of inner iterations of orders $L=\Omega(K)$ and $K=\Omega(T)$, see the supplementary copy for t where learning rates should satisfy $\beta \leq \frac{\mu_{\tilde{F}}}{4\gamma}$, $\eta_i\leq \frac

Figures (63)

Figure 1: Federated learning work-flow
Figure 2: Convergence of PerMFL with multi-tier SOTA in strongly convex and non-convex settings on FMNIST
Figure 3: Effect of $\beta$, $\gamma$, and $\lambda$ on convergence of PerMFL (PM) in non-convex (CNN) and strongly convex (MCLR) settings using the MNIST dataset
Figure 4: Ablation study on team and client participation on the MNIST dataset in convex settings (MCLR), with four panels: full participation of teams and devices; full participation of 5 teams but partial participation of devices; partial participation of teams but full participation of devices; and partial participation of teams (2%) with partial participation of clients
Figure 5: Effect of $\beta$ on convergence of PerMFL in non-convex settings (CNN) using MNIST dataset
...and 58 more figures

Theorems & Definitions (28)

Remark 1
Theorem 1: Strongly convex
Theorem 2: Non-convex
Remark 2
Definition 1: Strong convexity
Definition 2: Smoothness
Definition 3: Expected Smoothness
Definition 4: Moreau Envelope
Proposition 1
Proposition 2
...and 18 more
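
Definition 4 in the list above names the Moreau envelope, whose statement is not reproduced here. As a hedged illustration, the sketch below uses the standard textbook form, $M_{\mu f}(x) = \min_y \{ f(y) + \frac{1}{2\mu}\|y - x\|^2 \}$, which may be scaled differently from the paper's convention. It checks numerically that the envelope of $f(y) = |y|$ equals the Huber function; the names `moreau_envelope_abs` and `huber` and the grid search are assumptions made for this example.

```python
import numpy as np

def moreau_envelope_abs(x, mu):
    """Evaluate min_y |y| + (y - x)^2 / (2 * mu) by brute force on a dense grid."""
    grid = np.linspace(-10.0, 10.0, 200001)  # step 1e-4; assumes |x| is well below 10
    values = np.abs(grid) + (grid - x) ** 2 / (2.0 * mu)
    return float(values.min())

def huber(x, mu):
    """Closed form of the Moreau envelope of the absolute value with parameter mu."""
    if abs(x) <= mu:
        return x * x / (2.0 * mu)
    return abs(x) - mu / 2.0
```

The envelope is smooth even though $|y|$ is not, which is the property that makes Moreau envelopes convenient for coupling a personalized model to a shared one.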
