FedSSA: Semantic Similarity-based Aggregation for Efficient Model-Heterogeneous Personalized Federated Learning
Liping Yi, Han Yu, Zhuan Shi, Gang Wang, Xiaoguang Liu, Lizhen Cui, Xiaoxiao Li
TL;DR
This work tackles privacy-preserving federated learning under data and system heterogeneity by introducing FedSSA, which splits client models into a heterogeneous feature extractor and a homogeneous header. It enables local-to-global knowledge transfer through per-class header aggregation based on semantic similarity and stabilizes global-to-local updates via an adaptive header fusion strategy that leverages historical and current global information. Theoretical convergence is established for the non-convex setting, and extensive experiments on CIFAR-10/100 show FedSSA achieving higher accuracy and better efficiency (communication and computation) than seven strong MHPFL baselines, without requiring public data. The approach yields robust personalization across non-IID scenarios and varying client participation, highlighting practical impact for heterogeneous FL deployments.
Abstract
Federated learning (FL) is a privacy-preserving collaborative machine learning paradigm. Traditional FL requires all data owners (a.k.a. FL clients) to train the same local model. This design is not well-suited for scenarios involving data and/or system heterogeneity. Model-Heterogeneous Personalized FL (MHPFL) has emerged to address this challenge. Existing MHPFL approaches often rely on a public dataset of the same nature as the learning task, or incur high computation and communication costs. To address these limitations, we propose the Federated Semantic Similarity Aggregation (FedSSA) approach for supervised classification tasks, which splits each client's model into a heterogeneous (structure-different) feature extractor and a homogeneous (structure-same) classification header. It performs local-to-global knowledge transfer via semantic similarity-based header parameter aggregation. In addition, global-to-local knowledge transfer is achieved via an adaptive parameter stabilization strategy, which fuses the seen-class parameters of each client's historical local header with those of the latest global header. FedSSA does not rely on public datasets, and it requires only partial header parameter transmission, which saves costs. Theoretical analysis proves the convergence of FedSSA. Extensive experiments demonstrate that FedSSA achieves up to 3.62% higher accuracy, 15.54 times higher communication efficiency, and 15.52 times higher computational efficiency compared to 7 state-of-the-art MHPFL baselines.
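The two knowledge-transfer steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each header is stored as one parameter row per class, that the server takes an unweighted mean of each class's rows over the clients that uploaded that class, and that the client mixes local and global rows with a fixed coefficient `mu` (the abstract calls the strategy adaptive, so a real coefficient would change over training). The names `aggregate_by_class`, `fuse_header`, and `mu` are hypothetical.

```python
from typing import Dict, List, Set

Vector = List[float]
Header = Dict[int, Vector]  # class id -> that class's header parameter row


def aggregate_by_class(client_headers: List[Header]) -> Header:
    """Local-to-global: average each class's row over the clients that sent it.

    Assumption: unweighted mean; clients upload rows only for seen classes.
    """
    sums: Dict[int, Vector] = {}
    counts: Dict[int, int] = {}
    for header in client_headers:
        for cls, row in header.items():
            if cls not in sums:
                sums[cls] = [0.0] * len(row)
                counts[cls] = 0
            sums[cls] = [s + r for s, r in zip(sums[cls], row)]
            counts[cls] += 1
    return {cls: [s / counts[cls] for s in total] for cls, total in sums.items()}


def fuse_header(local: Header, global_header: Header,
                seen: Set[int], mu: float) -> Header:
    """Global-to-local: mix seen-class rows of the local and global headers.

    Assumption: fixed weight mu on the historical local row; classes the
    client has not seen simply take the global row.
    """
    fused: Header = {}
    for cls, g_row in global_header.items():
        if cls in seen and cls in local:
            fused[cls] = [mu * l + (1.0 - mu) * g
                          for l, g in zip(local[cls], g_row)]
        else:
            fused[cls] = list(g_row)
    return fused


# Two clients with overlapping seen classes (toy 2-dimensional rows).
client_a: Header = {0: [1.0, 2.0], 1: [3.0, 4.0]}
client_b: Header = {1: [5.0, 6.0], 2: [7.0, 8.0]}

global_header = aggregate_by_class([client_a, client_b])
fused_a = fuse_header(client_a, global_header, seen={0, 1}, mu=0.5)
```

In this toy run only class 1 is shared, so its global row is the mean of both clients' rows, while classes 0 and 2 keep the row of the single client that saw them. Only header rows are exchanged, so the feature extractors can differ in structure across clients.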
