FedSSA: Semantic Similarity-based Aggregation for Efficient Model-Heterogeneous Personalized Federated Learning

Liping Yi; Han Yu; Zhuan Shi; Gang Wang; Xiaoguang Liu; Lizhen Cui; Xiaoxiao Li

TL;DR

This work tackles privacy-preserving federated learning under data and system heterogeneity by introducing FedSSA, which splits client models into a heterogeneous feature extractor and a homogeneous header. It enables local-to-global knowledge transfer through per-class header aggregation based on semantic similarity and stabilizes global-to-local updates via an adaptive header fusion strategy that leverages historical and current global information. Theoretical convergence is established for the non-convex setting, and extensive experiments on CIFAR-10/100 show FedSSA achieving higher accuracy and better efficiency (communication and computation) than seven strong MHPFL baselines, without requiring public data. The approach yields robust personalization across non-IID scenarios and varying client participation, highlighting practical impact for heterogeneous FL deployments.

Abstract

Federated learning (FL) is a privacy-preserving collaborative machine learning paradigm. Traditional FL requires all data owners (a.k.a. FL clients) to train the same local model. This design is not well-suited for scenarios involving data and/or system heterogeneity. Model-Heterogeneous Personalized FL (MHPFL) has emerged to address this challenge. Existing MHPFL approaches often rely on a public dataset of the same nature as the learning task, or incur high computation and communication costs. To address these limitations, we propose the Federated Semantic Similarity Aggregation (FedSSA) approach for supervised classification tasks, which splits each client's model into a heterogeneous (structure-different) feature extractor and a homogeneous (structure-same) classification header. It performs local-to-global knowledge transfer via semantic similarity-based header parameter aggregation. In addition, global-to-local knowledge transfer is achieved via an adaptive parameter stabilization strategy, which fuses the seen-class parameters of each client's historical local header with those of the latest global header. FedSSA does not rely on public datasets and transmits only partial header parameters to save costs. Theoretical analysis proves the convergence of FedSSA. Extensive experiments show that FedSSA achieves up to 3.62% higher accuracy, 15.54 times higher communication efficiency, and 15.52 times higher computational efficiency compared to 7 state-of-the-art MHPFL baselines.
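The local-to-global step described in the abstract can be illustrated with a minimal sketch. It assumes that each client uploads header parameters only for the classes it has seen, and that the server takes an unweighted mean per class; the function name `aggregate_headers_by_class`, the dictionary layout, and the unweighted mean are illustrative assumptions, not details confirmed by the paper.

```python
from collections import defaultdict


def aggregate_headers_by_class(client_uploads):
    """Average header parameters class by class (hypothetical helper).

    client_uploads: list of dicts, one per client, mapping a class id to
    that client's header parameter vector (a list of floats) for the class.
    A client only includes the classes it has seen locally.
    """
    sums = {}
    counts = defaultdict(int)
    for upload in client_uploads:
        for cls, params in upload.items():
            if cls not in sums:
                sums[cls] = list(params)
            else:
                sums[cls] = [a + b for a, b in zip(sums[cls], params)]
            counts[cls] += 1
    # Assumption: plain (unweighted) mean over the clients that saw the class.
    return {cls: [v / counts[cls] for v in vec] for cls, vec in sums.items()}


# Two clients with partially overlapping seen classes.
uploads = [
    {0: [1.0, 2.0], 1: [0.0, 4.0]},
    {1: [2.0, 0.0], 2: [3.0, 3.0]},
]
global_header = aggregate_headers_by_class(uploads)
```

In this toy run, class 1 is averaged over both clients, while classes 0 and 2 are each taken from the single client that saw them. Because only seen-class rows are uploaded, the payload is a subset of the header rather than a full model.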

Paper Structure (26 sections, 4 theorems, 25 equations, 11 figures, 5 tables, 1 algorithm)

Introduction
Related Work
Preliminaries
Notations and Objective of Typical FL
Problem Definition
The Proposed FedSSA Approach
Semantic Similarity-Based Aggregation
Adaptive Parameter Stabilization
Discussion
Analysis
Experimental Evaluation
Experiment Setup
Comparisons Results
Model-Homogeneous PFL
Model-Heterogeneous PFL
...and 11 more sections

Key Result

Lemma 1

Based on the Lipschitz and Unbiased assumptions, during iterations $\{0,1,\ldots,E\}$ of the $(t+1)$-th FL training round, the loss of an arbitrary client's local model is bounded by:

Figures (11)

Figure 1: Feature extractor and classification header.
Figure 2: The FedSSA framework.
Figure 3: Decay functions for $\mu^t$. $\cos(\cdot)$ on $[0,\pi/2]$ is a smooth decay function, which leads to a stable decrease of $\mu$.
Figure 4: Accuracy distribution for individual clients.
Figure 5: Trade-off among test accuracy, communication cost, and computational overhead. The size of each marker reflects the corresponding computation in FLOPs (1e12).
...and 6 more figures
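The cosine decay mentioned in the Figure 3 caption, together with the header fusion described in the abstract, can be sketched as follows. This is only an illustration under stated assumptions: the schedule maps the round index onto $[0,\pi/2]$ and sets $\mu$ to zero from a hypothetical round `t_stable` onward, and the fusion is written as a convex mix of local and global parameters. The names `cosine_decay`, `fuse_seen_classes`, `t_stable`, and `mu0`, and the exact fusion formula, are assumptions rather than the paper's definitions.

```python
import math


def cosine_decay(t, t_stable, mu0):
    """Decay mu from mu0 at round 0 to 0 at round t_stable (assumed schedule)."""
    if t >= t_stable:
        return 0.0
    # The cosine argument sweeps [0, pi/2), so the value falls smoothly.
    return mu0 * math.cos(math.pi * t / (2 * t_stable))


def fuse_seen_classes(local_prev, global_latest, seen_classes, mu):
    """Mix a client's previous local header into the latest global header.

    Assumption: a convex combination, applied to seen classes only.
    Unseen classes keep the global parameters unchanged.
    """
    fused = {}
    for cls, g in global_latest.items():
        if cls in seen_classes and cls in local_prev:
            l = local_prev[cls]
            fused[cls] = [mu * li + (1.0 - mu) * gi for li, gi in zip(l, g)]
        else:
            fused[cls] = list(g)
    return fused


schedule = [cosine_decay(t, 10, 0.5) for t in range(12)]
fused = fuse_seen_classes(
    local_prev={1: [2.0, 0.0]},
    global_latest={1: [0.0, 2.0], 2: [3.0, 3.0]},
    seen_classes={1},
    mu=0.5,
)
```

Under this schedule the local header dominates less and less as training proceeds, and after `t_stable` rounds the client simply adopts the global header for its seen classes.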

Theorems & Definitions (7)

Lemma 1
Lemma 2
Theorem 1
Theorem 2
proof
proof
proof
