DualFed: Enjoying both Generalization and Personalization in Federated Learning via Hierachical Representations

Guogang Zhu, Xuefeng Liu, Jianwei Niu, Shaojie Tang, Xinghao Wu, Jiayuan Zhang

TL;DR

This work tackles the competing demands of generalization and personalization in non-IID federated learning. It introduces DualFed, a framework that inserts a personalized projection network between the encoder and classifier to split generalized (pre-projection) and personalized (post-projection) representations, with a shared encoder, a global classifier for generalized signals, and a personalized classifier for local signals; final predictions are an ensemble of both pathways. The approach enables stage-wise local training and simple server-side aggregation, reducing mutual interference between objectives and delivering superior performance across PACS, DomainNet, and Office-Home compared with strong baselines. Empirical results, ablations, and analyses demonstrate that separating representation stages preserves generalization while enabling personalization, with favorable communication efficiency and robust behavior across diverse domain shifts. The work provides practical insights for designing FL systems that harness hierarchical representations to achieve both broad generalization and client-specific adaptation, and it releases code for reproducibility.

Abstract

In personalized federated learning (PFL), it is widely recognized that achieving both high model generalization and effective personalization poses a significant challenge due to their conflicting nature. As a result, existing PFL methods can only manage a trade-off between these two objectives. This raises an interesting question: Is it feasible to develop a model capable of achieving both objectives simultaneously? Our paper presents an affirmative answer, and the key lies in the observation that deep models inherently exhibit hierarchical architectures, which produce representations with various levels of generalization and personalization at different stages. A straightforward approach stemming from this observation is to select multiple representations from these layers and combine them to concurrently achieve generalization and personalization. However, the number of candidate representations is typically huge, which makes this method infeasible due to high computational costs. To address this problem, we propose DualFed, a new method that directly yields dual representations corresponding to generalization and personalization respectively, thereby simplifying the optimization task. Specifically, DualFed inserts a personalized projection network between the encoder and classifier. The pre-projection representations capture generalized information shareable across clients, and the post-projection representations capture task-specific information on local clients. This design minimizes the mutual interference between generalization and personalization, thereby achieving a win-win situation. Extensive experiments show that DualFed can outperform other FL methods. Code is available at https://github.com/GuogangZhu/DualFed.
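
To make the dual-representation idea concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes bias-free linear layers, ReLU activations, and an equal-weight average of the two classifiers' probabilities as the ensemble rule; the source only states that the two pathways are ensembled. All names here (`dualfed_predict`, `random_matrix`, the layer sizes) are hypothetical.

```python
import math
import random


def linear(weights, x):
    """Apply a bias-free linear map; `weights` is a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def dualfed_predict(x, encoder, projector, global_clf, personal_clf):
    # Pre-projection representation: output of the shared encoder.
    h_pre = relu(linear(encoder, x))
    # Post-projection representation: output of the client-specific projector.
    h_post = relu(linear(projector, h_pre))
    # The global classifier reads the generalized (pre-projection) features.
    p_global = softmax(linear(global_clf, h_pre))
    # The personalized classifier reads the personalized (post-projection) features.
    p_personal = softmax(linear(personal_clf, h_post))
    # Assumption: ensemble by averaging the two probability vectors equally.
    return [(a + b) / 2.0 for a, b in zip(p_global, p_personal)]


rng = random.Random(0)
encoder = random_matrix(8, 4, rng)       # shared across clients
projector = random_matrix(8, 8, rng)     # kept on the client
global_clf = random_matrix(3, 8, rng)    # shared across clients
personal_clf = random_matrix(3, 8, rng)  # kept on the client

probs = dualfed_predict([0.5, -1.0, 2.0, 0.1],
                        encoder, projector, global_clf, personal_clf)
print(probs)
```

Because each pathway outputs a valid probability vector, their average is also a valid probability vector, so the ensemble needs no renormalization under this assumption.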

Paper Structure

This paper contains 16 sections, 11 equations, 9 figures, 8 tables.

Figures (9)

  • Figure 1: Different ways of combining representations and classifiers. (a) Global encoder with personalized classifier, (b) Personalized encoder with global classifier, (c) Our proposed DualFed, which utilizes hierarchical representations.
  • Figure 2: Framework overview of DualFed. A single global round consists of 4 steps: 1) the server broadcasts the global encoder and classifier to each client; 2) each client performs local updates by iteratively updating the main branch and the global classifier; 3) each client uploads its updated global encoder and classifier to the server; 4) the server aggregates the encoders and classifiers from the clients to generate new ones.
  • Figure 3: Test accuracy during training on DomainNet.
  • Figure 4: Visualization of representations on DomainNet.
  • Figure 5: Class-wise separation during training.
  • ...and 4 more figures
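
The four steps in the Figure 2 caption can be sketched as a single communication round. This is an illustrative sketch only. It assumes sample-size-weighted averaging (FedAvg-style) as the aggregation rule, which the caption does not specify, and it uses a hypothetical `toy_local_update` as a stand-in for the client's actual stage-wise training. Parameters are flat lists of floats for simplicity.

```python
def average_parameters(client_params, client_sizes):
    """Weighted average of flat parameter lists (assumed FedAvg-style rule)."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(params[i] * n for params, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]


def run_round(global_state, clients, local_update):
    uploads = []
    for client in clients:
        # Step 1: the server broadcasts the global encoder and classifier.
        local_state = {name: list(values) for name, values in global_state.items()}
        # Step 2: the client updates its copy locally. The personalized
        # projector and classifier never leave the client, so they are
        # not part of `global_state`.
        updated = local_update(client, local_state)
        # Step 3: the client uploads the updated shared modules.
        uploads.append(updated)
    # Step 4: the server aggregates the uploads module by module.
    sizes = [client["num_samples"] for client in clients]
    return {
        name: average_parameters([u[name] for u in uploads], sizes)
        for name in global_state
    }


def toy_local_update(client, state):
    # Hypothetical placeholder for real training: shift every parameter
    # by a client-specific amount.
    return {name: [v + client["shift"] for v in values]
            for name, values in state.items()}


global_state = {"encoder": [0.0, 0.0], "global_classifier": [1.0]}
clients = [
    {"num_samples": 1, "shift": 1.0},
    {"num_samples": 3, "shift": -1.0},
]
new_state = run_round(global_state, clients, toy_local_update)
print(new_state)
```

Only the modules stored in `global_state` are exchanged, which mirrors the caption: the encoder and global classifier travel between server and clients, while the personalized parts stay local.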