FedSKD: Aggregation-free Model-heterogeneous Federated Learning using Multi-dimensional Similarity Knowledge Distillation
Ziqiao Weng, Weidong Cai, Bo Zhou
TL;DR
FedSKD tackles model-heterogeneous federated learning in privacy-sensitive medical contexts by removing centralized aggregation and circulating heterogeneous models among clients in a round-robin fashion. Its core is a bidirectional knowledge transfer framework built on multi-dimensional similarity knowledge distillation, which aligns batch-wise, pixel/voxel-wise, and region-wise representations to counter model drift and knowledge dilution. On autism spectrum disorder (ASD) diagnosis (ABIDE-derived FedASD) and skin lesion classification (Derm7pt-derived FedSkin) with non-IID partitions, FedSKD consistently outperforms aggregation-based and baseline P2P methods in both client-specific personalization and cross-institution generalization. The results underscore FedSKD’s potential as a scalable, robust solution for realistic medical FL deployments, while pointing to efficiency, security, and broader task applicability as areas for further work.
Abstract
Federated learning (FL) enables privacy-preserving collaborative model training without direct data sharing. Model-heterogeneous FL (MHFL) extends this paradigm by allowing clients to train personalized models with heterogeneous architectures tailored to their computational resources and application-specific needs. However, existing MHFL methods predominantly rely on centralized aggregation, which introduces scalability and efficiency bottlenecks, or impose restrictions requiring partially identical model architectures across clients. While peer-to-peer (P2P) FL removes server dependence, it suffers from model drift and knowledge dilution, limiting its effectiveness in heterogeneous settings. To address these challenges, we propose FedSKD, a novel MHFL framework that facilitates direct knowledge exchange through round-robin model circulation, eliminating the need for centralized aggregation while allowing fully heterogeneous model architectures across clients. FedSKD's key innovation lies in multi-dimensional similarity knowledge distillation, which enables bidirectional cross-client knowledge transfer at batch, pixel/voxel, and region levels for heterogeneous models in FL. This approach mitigates catastrophic forgetting and model drift through progressive reinforcement and distribution alignment while preserving model heterogeneity. Extensive evaluations on fMRI-based autism spectrum disorder diagnosis and skin lesion classification demonstrate that FedSKD outperforms state-of-the-art heterogeneous and homogeneous FL baselines, achieving superior personalization (client-specific accuracy) and generalization (cross-institutional adaptability). These findings underscore FedSKD's potential as a scalable and robust solution for real-world medical federated learning applications.
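To make the batch-level idea concrete, the following is a minimal sketch of a generic similarity-based distillation term, not the loss defined in the paper. It assumes each model emits one feature vector per sample for the same batch, and it compares the B x B matrices of pairwise sample similarities, so the two models may have different feature dimensions. The function names (`gram_matrix`, `row_normalize`, `batch_similarity_loss`) and the choice of row-wise L2 normalization with a mean squared difference are illustrative assumptions.

```python
import math


def gram_matrix(features):
    """Pairwise dot products between the samples of one batch (B x B)."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features]
            for fi in features]


def row_normalize(matrix, eps=1e-12):
    """Scale each row to unit L2 norm so feature magnitude does not dominate."""
    out = []
    for row in matrix:
        norm = math.sqrt(sum(v * v for v in row))
        out.append([v / (norm + eps) for v in row])
    return out


def batch_similarity_loss(feats_a, feats_b):
    """Mean squared difference between two normalized batch-similarity matrices.

    feats_a and feats_b hold one feature vector per sample, produced by two
    (possibly different) architectures on the same batch. Only the batch size
    must match; the feature dimensions may differ.
    """
    if len(feats_a) != len(feats_b):
        raise ValueError("both models must be evaluated on the same batch")
    sim_a = row_normalize(gram_matrix(feats_a))
    sim_b = row_normalize(gram_matrix(feats_b))
    b = len(feats_a)
    total = sum((sim_a[i][j] - sim_b[i][j]) ** 2
                for i in range(b) for j in range(b))
    return total / (b * b)


# Hypothetical features: a 2-dimensional model and a 4-dimensional model
# evaluated on the same batch of three samples.
local_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
received_feats = [[1.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
loss = batch_similarity_loss(local_feats, received_feats)
```

Because the comparison happens in the B x B similarity space rather than in feature space, no projection layer is needed to reconcile the two architectures. In a bidirectional setup, the same term could be minimized with respect to either model while the other is held fixed. The pixel/voxel-wise and region-wise terms mentioned in the abstract are not covered by this sketch.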
