Your Data, My Model: Learning Who Really Helps in Federated Learning
Shamsiiat Abdurakhmanova, Amirhossein Mohammadi, Yasmin SarcheshmehPour, Alexander Jung
TL;DR
The paper tackles personalized federated learning under privacy constraints with heterogeneous device data by introducing PersFL, a data-driven peer-selection framework. It measures peer usefulness through a privacy-preserving update mechanism: a gradient-step utility for parametric models and a proximal, regularized update for non-parametric or model-agnostic settings, allowing collaboration without sharing raw data. Key contributions include (i) a gradient-step-based parametric PersFL, (ii) a proximal-objective-based model-agnostic PersFL, (iii) online variants and non-parametric extensions, and (iv) extensive empirical validation against IFCA and Oracle baselines on synthetic tasks and decision-tree experiments. The approach requires neither a global similarity graph nor a known number of clusters, which enables scalable, privacy-preserving personalized models in highly heterogeneous FL environments.
Abstract
Many important machine learning applications involve networks of devices, such as wearables or smartphones, that generate local data and train personalized models. A key challenge is determining which peers are most beneficial for collaboration. We propose a simple and privacy-preserving method to select relevant collaborators by evaluating how much a model improves after a single gradient step using another device's data, without sharing raw data. This method naturally extends to non-parametric models by replacing the gradient step with a non-parametric generalization. Our approach enables model-agnostic, data-driven peer selection for personalized federated learning (PersFL).
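The single-gradient-step test described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a linear model with squared-error loss, a fixed step size, and synthetic data. All names (`mse_loss`, `mse_grad`, `peer_utility`, `make_device`) are hypothetical. A peer's utility is taken here as the drop in the device's own local loss after one step along the gradient that the peer computed on its own data.

```python
import random

def mse_loss(w, X, y):
    """Mean squared error of the linear model x -> <w, x> on (X, y)."""
    preds = [sum(wk * xk for wk, xk in zip(w, x)) for x in X]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)

def mse_grad(w, X, y):
    """Gradient of mse_loss with respect to w."""
    grad = [0.0] * len(w)
    for x, t in zip(X, y):
        residual = sum(wk * xk for wk, xk in zip(w, x)) - t
        for k, xk in enumerate(x):
            grad[k] += 2.0 * residual * xk / len(y)
    return grad

def peer_utility(w, own_data, peer_grad, lr=0.1):
    """Assumed utility: decrease of the device's own loss after one
    gradient step that uses the gradient computed by a peer."""
    X, y = own_data
    w_try = [wk - lr * gk for wk, gk in zip(w, peer_grad)]
    return mse_loss(w, X, y) - mse_loss(w_try, X, y)

def make_device(true_w, n, rng):
    """Synthetic local dataset for one device (illustrative only)."""
    X = [[rng.gauss(0, 1) for _ in true_w] for _ in range(n)]
    y = [sum(a * b for a, b in zip(true_w, x)) + rng.gauss(0, 0.1) for x in X]
    return X, y

rng = random.Random(0)
own = make_device([1.0, -2.0], 50, rng)
peers = {
    "same_task": make_device([1.0, -2.0], 50, rng),
    "other_task": make_device([-1.0, 2.0], 50, rng),
}

w = [0.0, 0.0]  # current local model parameters
# In a deployment each gradient would be computed on the peer's device;
# only the gradient (never the raw data) would be sent back.
utilities = {
    name: peer_utility(w, own, mse_grad(w, *data))
    for name, data in peers.items()
}
best_peer = max(utilities, key=utilities.get)
```

In this toy setup the peer whose data follow the same underlying model yields a positive utility, while the peer with the opposite model makes the local loss worse, so ranking peers by `utilities` selects the useful collaborator.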
