Your Data, My Model: Learning Who Really Helps in Federated Learning

Shamsiiat Abdurakhmanova; Amirhossein Mohammadi; Yasmin SarcheshmehPour; Alexander Jung

TL;DR

The paper tackles personalized federated learning under privacy constraints with heterogeneous device data by introducing PersFL, a data-driven peer-selection framework. It measures peer usefulness through a privacy-preserving update mechanism: a gradient-step utility for parametric models and a proximal, regularized update for non-parametric or model-agnostic settings, allowing collaboration without sharing raw data. Key contributions include (i) a gradient-step based parametric PersFL, (ii) a proximal objective-based model-agnostic PersFL, (iii) online variants and non-parametric extensions, and (iv) empirical validation against IFCA and Oracle baselines on synthetic tasks and decision-tree experiments. The approach does not require a global similarity graph or a known number of clusters, which makes it applicable to personalized model training in highly heterogeneous FL environments.

Abstract

Many important machine learning applications involve networks of devices, such as wearables or smartphones, that generate local data and train personalized models. A key challenge is determining which peers are most beneficial for collaboration. We propose a simple and privacy-preserving method to select relevant collaborators by evaluating how much a model improves after a single gradient step using another device's data, without sharing raw data. This method naturally extends to non-parametric models by replacing the gradient step with a non-parametric generalization. Our approach enables model-agnostic, data-driven peer selection for personalized federated learning (PersFL).
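The peer-scoring idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the linear model, the squared-error loss, the learning rate `lr`, the synthetic data, and the helper names `mse`, `mse_grad`, and `peer_utility` are all assumptions made for illustration. A peer computes a gradient on its own data at the requesting device's current parameters, and the requesting device scores that peer by how much one step along the gradient lowers its own local loss.

```python
import numpy as np

def mse(w, X, y):
    """Mean squared error of the linear model x -> x @ w on (X, y)."""
    return float(np.mean((X @ w - y) ** 2))

def mse_grad(w, X, y):
    """Gradient of the mean squared error with respect to w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

def peer_utility(w, X_own, y_own, peer_grad, lr=0.1):
    """Reduction in the own local loss after one step along a peer's gradient.

    Only `peer_grad` comes from the peer; its raw data never appears here.
    The step size `lr` is an assumed hyper-parameter.
    """
    w_trial = w - lr * peer_grad
    return mse(w, X_own, y_own) - mse(w_trial, X_own, y_own)

# Hypothetical synthetic setup: one "similar" peer and one "dissimilar" peer.
rng = np.random.default_rng(0)
d, m = 5, 200
w_true = rng.normal(size=d)

def make_data(w_star):
    X = rng.normal(size=(m, d))
    y = X @ w_star + 0.1 * rng.normal(size=m)
    return X, y

X_own, y_own = make_data(w_true)
X_same, y_same = make_data(w_true)    # peer with the same underlying weights
X_diff, y_diff = make_data(-w_true)   # peer with opposite underlying weights

w0 = np.zeros(d)  # current local model parameters
# Each peer evaluates the gradient locally and shares only that vector.
u_same = peer_utility(w0, X_own, y_own, mse_grad(w0, X_same, y_same))
u_diff = peer_utility(w0, X_own, y_own, mse_grad(w0, X_diff, y_diff))
print(f"utility of similar peer:    {u_same:.3f}")
print(f"utility of dissimilar peer: {u_diff:.3f}")
```

In this toy setup the similar peer should receive a positive utility and the dissimilar peer a negative one, so ranking peers by this score would favor the former. How the scores are turned into a selection rule is left open here.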

Paper Structure (18 sections, 16 equations, 8 figures, 2 algorithms)

This paper contains 18 sections, 16 equations, 8 figures, 2 algorithms.

Introduction
Our Contribution
Related Work
Outline
Problem Setting
Methods
Parametric PersFL
Model-Agnostic PersFL
Online Variants
Numerical Experiments
Toy Dataset
Training a Personalized Linear Model
Comparison with IFCA
Comparison with Oracle Method
Online Learning of Personalized Linear Model
...and 3 more sections

Figures (8)

Figure 1: Personalized model training at device $i=1$ via privacy-preserving access to other devices' datasets.
Figure 2: Algorithm \ref{alg_pfl_regretmin_modelagnostic} uses a generalized gradient step \ref{equ_def_update_modelagnostic} to update a non-parametric model $\widehat{h} \in \mathcal{H}$. This update can be interpreted as a form of regularization via data augmentation.
Figure 3: MSE incurred by Algorithm \ref{alg_pfl_regretmin} for varying model and algorithm hyper-parameters: (a) varying $d/m$, (b) varying $\sigma$, and (c) varying $S$. The source code for the experiment can be found at ShamsiRepo.
Figure 4: MSE incurred by Algorithm \ref{alg_pfl_regretmin} and IFCA for local datasets forming $k=2$ clusters (see \ref{equ_def_cluster_partition}). IFCA is provided the correct number of clusters.
Figure 5: MSE incurred by Algorithm \ref{alg_pfl_regretmin} and IFCA for local datasets forming $k=5$ clusters (see \ref{equ_def_cluster_partition}). In contrast to Figure \ref{fig:ifca}, here IFCA is run with a misspecified number of clusters ($2$ clusters).
...and 3 more figures
