Client-supervised Federated Learning: Towards One-model-for-all Personalization

Peng Yan; Guodong Long

TL;DR

The paper tackles personalization in federated learning under non-IID data by proposing FedCS, a one-model-for-all approach that learns a single robust global model capable of personalized predictions without on-device fine-tuning. FedCS introduces a Representation Alignment (RA) module to embed client bias in a latent space, enabling client-specific outcomes while sharing cross-client knowledge; this is implemented via a bi-level optimization that aligns representations by solving an LDA-like objective, $J(P)=\mathrm{Tr}(\Sigma_W^{-1}\Sigma_B)$, with $\Sigma_W$ and $\Sigma_B$ representing within- and between-client scatter, respectively. The training uses a client-supervised, incremental update of $S=\Sigma_W^{-1/2}$ and $\Phi$ to approximate the eigenstructure, allowing private, distributed updates in FL. Experiments on label-shift and feature-shift scenarios show FedCS achieves competitive or superior performance to personalized FL methods that require adaptation, with strong robustness on unseen test clients and direct deployability without local fine-tuning.
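As a rough illustration of the objective above, the following minimal sketch computes the within- and between-client scatter of a batch of latent representations, evaluates $J=\mathrm{Tr}(\Sigma_W^{-1}\Sigma_B)$, and recovers projection directions from $S=\Sigma_W^{-1/2}$ and the eigenvectors $\Phi$ of $S\Sigma_B S$. It is a centralized, closed-form computation for intuition only, not the paper's client-supervised incremental procedure. The function names, the small ridge term `eps`, and the scatter normalization are assumptions made for this example.

```python
import numpy as np

def scatter_matrices(z, client_ids):
    """Within- and between-client scatter of representations z with shape (n, d)."""
    mu = z.mean(axis=0)
    d = z.shape[1]
    sigma_w = np.zeros((d, d))
    sigma_b = np.zeros((d, d))
    for c in np.unique(client_ids):
        zc = z[client_ids == c]
        mu_c = zc.mean(axis=0)
        centered = zc - mu_c
        sigma_w += centered.T @ centered          # spread around the client mean
        diff = (mu_c - mu)[:, None]
        sigma_b += len(zc) * (diff @ diff.T)      # spread of client means
    return sigma_w / len(z), sigma_b / len(z)

def inv_sqrt(sigma, eps=1e-6):
    """Symmetric S = (sigma + eps*I)^(-1/2); eps is an assumed stabilizer."""
    vals, vecs = np.linalg.eigh(sigma + eps * np.eye(sigma.shape[0]))
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

def lda_objective(sigma_w, sigma_b, eps=1e-6):
    """J = Tr(Sigma_W^{-1} Sigma_B), using the same ridge as inv_sqrt."""
    d = sigma_w.shape[0]
    return float(np.trace(np.linalg.solve(sigma_w + eps * np.eye(d), sigma_b)))

def alignment_directions(sigma_w, sigma_b, k, eps=1e-6):
    """Top-k directions P = S @ Phi, with Phi the eigenvectors of S Sigma_B S."""
    s = inv_sqrt(sigma_w, eps)
    vals, phi = np.linalg.eigh(s @ sigma_b @ s)
    order = np.argsort(vals)[::-1][:k]            # largest eigenvalues first
    return s @ phi[:, order], vals[order]
```

With all `d` directions kept, the eigenvalues returned by `alignment_directions` sum to the value of `lda_objective`, which is why whitening with $S$ and then diagonalizing $S\Sigma_B S$ is a standard way to handle an LDA-like trace objective.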

Abstract

Personalized Federated Learning (PerFL) is a new machine learning paradigm that delivers personalized models for diverse clients under federated learning settings. Most PerFL methods require an extra learning process on each client to adapt a globally shared model into a client-specific personalized model using the client's own local data. However, this model adaptation step remains an open challenge at deployment and test time. This work tackles the challenge by proposing a novel federated learning framework that learns a single robust global model whose performance on unseen/test clients in the FL system is competitive with that of personalized models. Specifically, we design Client-Supervised Federated Learning (FedCS) to unravel clients' bias in instances' latent representations so that the global model can learn both client-specific and client-agnostic knowledge. Experiments show that FedCS can learn a robust global FL model under the changing data distributions of unseen/test clients. The FedCS global model can be deployed directly to test clients while achieving performance comparable to personalized FL methods that require model adaptation.

Paper Structure (25 sections, 13 equations, 10 figures, 3 tables, 2 algorithms)


Introduction
Related Work
Problem Formulation
Methodology
Representation Alignment
Client Supervised Optimization
Experiments
Client Settings
Models and Hyperparameters
Performance
Label-shift Settings
Overall Performance
Group-wise Performance
Feature-shift Settings
Conclusion
...and 10 more sections

Figures (10)

Figure 1: Illustration of FedCS. (b) and (c) show the distributions of instances' latent representations before and after the Representation Alignment module (RA module). Marker types denote classes, and colors indicate the clients on which the instances are observed. The vanilla feature extractor is unable to recognize clients' bias. The RA module in FedCS aligns the hidden space so that latent representations indicate the biases of clients.
Figure 2: Part of the experiments on CIFAR-10. The horizontal axis denotes communication rounds, and the vertical axis denotes F1 scores. The averaged weighted F1 scores within each client group are shown in different colors. Performance across distributions (client groups) varies significantly for FedAvg+FT and FedBN. FedCS has the most robust performance, even on unseen clients (groups 8 and 9).
Figure 3: Client settings for the label-shift experiments.
Figure 4: Proportion of instances of different classes. Different classes are marked with different colors. The horizontal axis denotes client IDs, and the vertical axis denotes the proportion of each class.
Figure 5: Client settings for the feature-shift experiments.
...and 5 more figures
