Adaptive Personalized Federated Learning via Multi-task Averaging of Kernel Mean Embeddings

Jean-Baptiste Fermanian; Batiste Le Bars; Aurélien Bellet

TL;DR

The paper proposes a new PFL approach in which each agent optimizes a weighted combination of all agents' empirical risks, with the weights learned from data rather than specified a priori. This yields a fully adaptive procedure that requires no prior knowledge of data heterogeneity and can automatically transition between global and local learning regimes.

Abstract

Personalized Federated Learning (PFL) enables a collection of agents to collaboratively learn individual models without sharing raw data. We propose a new PFL approach in which each agent optimizes a weighted combination of all agents' empirical risks, with the weights learned from data rather than specified a priori. The novelty of our method lies in formulating the estimation of these collaborative weights as a kernel mean embedding estimation problem with multiple data sources, leveraging tools from multi-task averaging to capture statistical relationships between agents. This perspective yields a fully adaptive procedure that requires no prior knowledge of data heterogeneity and can automatically transition between global and local learning regimes. By recasting the objective as a high-dimensional mean estimation problem, we derive finite-sample guarantees on local excess risks for a broad class of distributions, explicitly quantifying the statistical gains of collaboration. To address communication constraints inherent to federated settings, we also propose a practical implementation based on random Fourier features, which allows one to trade communication cost for statistical efficiency. Numerical experiments validate our theoretical results.
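
As an illustration of the random Fourier features idea mentioned in the abstract, the following is a minimal sketch, not the paper's algorithm. It assumes a Gaussian kernel and three hypothetical agents; the function names (`rff_map`, `rff_mean_embedding`) and all parameter values are invented for this example. Each agent summarizes its data by a single vector of dimension `D` (an approximate kernel mean embedding), and squared distances between these vectors approximate squared MMDs between agents. A larger `D` gives a better kernel approximation at the price of a longer vector to communicate.

```python
import numpy as np

def rff_map(X, W, b):
    """Random Fourier features z(x) = sqrt(2/D) * cos(W x + b)."""
    D = W.shape[0]
    return np.sqrt(2.0 / D) * np.cos(X @ W.T + b)

def rff_mean_embedding(X, W, b):
    """Approximate kernel mean embedding: average feature vector of a sample."""
    return rff_map(X, W, b).mean(axis=0)

rng = np.random.default_rng(0)
d, D, bandwidth = 2, 500, 1.0  # assumed values, chosen for illustration only

# Shared random frequencies and phases for a Gaussian kernel
# k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2)).
W = rng.normal(scale=1.0 / bandwidth, size=(D, d))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

# Hypothetical agents: the first two share a distribution, the third is shifted.
agents = [
    rng.normal(0.0, 1.0, size=(200, d)),
    rng.normal(0.0, 1.0, size=(200, d)),
    rng.normal(3.0, 1.0, size=(200, d)),
]

# Each agent would only need to send its D-dimensional embedding.
embeddings = np.stack([rff_mean_embedding(X, W, b) for X in agents])

# Pairwise squared distances approximate squared MMDs between agents.
diff = embeddings[:, None, :] - embeddings[None, :, :]
mmd2 = (diff ** 2).sum(axis=-1)
print(np.round(mmd2, 3))
```

In this sketch the two agents drawn from the same distribution end up much closer to each other than to the shifted agent, which is the kind of similarity signal that data-driven collaboration weights could exploit.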

Paper Structure (37 sections, 14 theorems, 98 equations, 4 figures, 2 tables, 3 algorithms)

Introduction
Preliminaries
Setting and Objective
RKHS and KME
Random Fourier Features
Related Work
Personalized Learning as High-Dimensional Mean Estimation
Controlling Generalization with MMD
Learning the Mixture Weights by Q-Aggregation
Controlling the Excess Risk of the Estimator
Practical Federated Algorithm
Random Fourier Features Approximation
Choice of Kernel
Experiments
Synthetic Concept Shift
...and 22 more sections

Key Result

Lemma 4.3

Under Assumption ass:lossinrkhs, for any learned weights $\widehat{\omega}$, we have: [...] where $R_\Theta = \sup_{\theta} \lVert h_\theta \rVert_{\mathcal{H}}$. Moreover, if for some $r>0$, [...] where $\Sigma$ [...]

Figures (4)

Figure 1: Mean squared error and its standard deviation for different approaches as a function of the intra-group noise $\sigma^2_c$.
Figure 2: Synthetic concept shift. Left side: test MSE as a function of the architecture (lower is better). Right side: learned weights.
Figure 3: FEMNIST. Accuracy of each agent for each method, sorted by the Q-aggregation accuracies, and a boxplot of these accuracies over the agents (higher is better).
Figure 4: Number of train and test points for each agent.

Theorems & Definitions (28)

Remark 2.1: Optimization error
Example 4.2: Linear regression
Lemma 4.3
Theorem 4.4
Example 4.5: Identical agents
Corollary 4.6
Example 5.1: Linear regression, continued
Theorem 5.2
Proposition C.2
Lemma D.1
...and 18 more
