A Survey on Heterogeneous Federated Learning
Dashan Gao, Xin Yao, Qiang Yang
TL;DR
The paper surveys heterogeneous federated learning (FL) across the data-space, statistical, and system dimensions, and introduces a taxonomy that distinguishes data-space homogeneous FL from data-space heterogeneous FL and the associated settings (VFL, Hetero-FTL, Homo-FTL). It consolidates transfer learning methods for addressing heterogeneity (representation learning, distillation, augmentation, collaborative filtering), analyzes privacy and security techniques (secure multi-party computation (MPC), homomorphic encryption (HE), differential privacy (DP), and trusted execution environments (TEE)), and reviews applications in recommender systems, finance, and healthcare. The authors highlight that data-space heterogeneity, especially Hetero-FTL, remains under-explored, and they propose future directions focusing on framework design, adaptability to partial alignment, efficiency, and trustworthy, privacy-preserving solutions. Overall, the work offers a roadmap for researchers and practitioners aiming to build robust, privacy-preserving heterogeneous FL systems across industries.
Abstract
Federated learning (FL) has been proposed to protect data privacy and virtually assemble isolated data silos by cooperatively training models among organizations without breaching privacy and security. However, FL faces heterogeneity in several respects, including data space, statistical, and system heterogeneity. For example, collaborating organizations without conflicts of interest often come from different areas and hold heterogeneous data from different feature spaces. Participants may also want to train heterogeneous personalized local models because of non-IID and imbalanced data distributions and varied resource-constrained devices. Heterogeneous FL is therefore proposed to address the problem of heterogeneity in FL. In this survey, we comprehensively investigate the domain of heterogeneous FL in terms of data space, statistical, system, and model heterogeneity. We first give an overview of FL, including its definition and categorization. Then, we propose a precise taxonomy of heterogeneous FL settings for each type of heterogeneity according to the problem setting and learning objective. We also investigate transfer learning methodologies for tackling heterogeneity in FL. We further present the applications of heterogeneous FL. Finally, we highlight the challenges and opportunities and envision promising future research directions toward new framework design and trustworthy approaches.
