FedTGP: Trainable Global Prototypes with Adaptive-Margin-Enhanced Contrastive Learning for Data and Model Heterogeneity in Federated Learning

Jianqing Zhang; Yang Liu; Yang Hua; Jian Cao

TL;DR

This work addresses data and model heterogeneity in Federated Learning by replacing model-parameter sharing with Trainable Global Prototypes (TGP) that are learned on the server. It introduces Adaptive-margin-enhanced Contrastive Learning (ACL) to make these prototypes semantically meaningful and well separated, and clients use them to guide local training. FedTGP outperforms state-of-the-art prototype-based and knowledge-distillation-based (KD-based) Heterogeneous Federated Learning (HtFL) methods across multiple datasets and heterogeneous settings, and it remains robust as heterogeneity increases and the client pool grows. Because clients and the server exchange only compact prototypes, FedTGP keeps client models private and communication cost low, which makes it practical for heterogeneous FL deployments.

Abstract

Recently, Heterogeneous Federated Learning (HtFL) has attracted attention due to its ability to support heterogeneous models and data. To reduce the high communication cost of transmitting model parameters, a major challenge in HtFL, prototype-based HtFL methods have been proposed that share only class representatives, a.k.a. prototypes, among heterogeneous clients while keeping clients' models private. However, these prototypes are naively aggregated into global prototypes on the server using weighted averaging, resulting in suboptimal global knowledge that negatively impacts the performance of clients. To overcome this challenge, we introduce a novel HtFL approach called FedTGP, which leverages our Adaptive-margin-enhanced Contrastive Learning (ACL) to learn Trainable Global Prototypes (TGP) on the server. By incorporating ACL, our approach enhances prototype separability while preserving semantic meaning. Extensive experiments with twelve heterogeneous models demonstrate that our FedTGP surpasses state-of-the-art methods by up to 9.08% in accuracy while maintaining the communication and privacy advantages of prototype-based HtFL. Our code is available at https://github.com/TsingZ0/FedTGP.
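
To make the baseline that the abstract criticizes concrete, the following is a minimal sketch of weighted-average prototype aggregation on the server. It is not taken from the FedTGP repository: the function name `aggregate_prototypes`, the dictionary layout, and the choice of per-class sample counts as weights are all assumptions made for illustration.

```python
def aggregate_prototypes(client_protos, client_counts):
    """Weighted-average client prototypes into one global prototype per class.

    client_protos: list (one entry per client) of {class_id: list of floats}
    client_counts: list (one entry per client) of {class_id: sample count}
    Assumption: weights are per-class sample counts, all strictly positive.
    """
    sums = {}    # class_id -> running weighted sum of prototype vectors
    totals = {}  # class_id -> running sum of weights
    for protos, counts in zip(client_protos, client_counts):
        for class_id, vec in protos.items():
            weight = counts[class_id]
            if class_id not in sums:
                sums[class_id] = [0.0] * len(vec)
                totals[class_id] = 0
            for i, value in enumerate(vec):
                sums[class_id][i] += weight * value
            totals[class_id] += weight
    # Divide each weighted sum by its total weight to obtain the average.
    return {c: [s / totals[c] for s in sums[c]] for c in sums}


if __name__ == "__main__":
    protos = [{0: [0.0, 0.0]}, {0: [4.0, 8.0], 1: [1.0, 1.0]}]
    counts = [{0: 1}, {0: 3, 1: 2}]
    print(aggregate_prototypes(protos, counts))
```

A class that appears on only one client simply inherits that client's prototype. Averaging of this kind has no term that pushes prototypes of different classes apart, which is the gap that server-side training of the global prototypes is meant to address.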

Paper Structure (31 sections, 11 equations, 11 figures, 12 tables, 1 algorithm)

Introduction
Related Work
Heterogeneous Federated Learning
Trainable Prototype Learning
Method
Problem Statement and Motivation
Trainable Global Prototypes
Adaptive-Margin-Enhanced Contrastive Learning
FedTGP Framework
Experiments
Setup
Performance
Impact of Model Heterogeneity
Partial Participation with More Clients
Impact of Number of Client Training Epochs
...and 16 more sections

Figures (11)

Figure 1: The illustration of the prototype margin change after generating global prototypes. The prototype margin is the minimum Euclidean distance between the prototype of a specific class and the prototypes of other classes, and the maximum margin is the maximum prototype margin among all clients for each class. To enhance visualization and eliminate the influence of magnitude, we normalize the margin values for each method in these figures. Different colors represent different classes. (a) The global prototype margin shrinks compared to the maximum of clients' prototype margins in FedProto. (b) The global prototype margin improves compared to the maximum of clients' prototype margins in our FedTGP.
Figure 2: The global and client prototypes in FedProto and our FedTGP. Different colors and numbers represent classes and clients, respectively. Circles represent the client prototypes and triangles represent the global prototypes. The black and yellow dotted arrows show the inter-class separation among the client and global prototypes, respectively. Triangles with dotted borders represent our TGP. The red arrows show the inter-class intervals between TGP and the client prototypes of other classes in our ACL.
Figure 3: An example of trainable vectors ($\{\acute{P}^c\}^{C}_{c=1}$) and the further processing model ($\theta_{F}$). They only exist on the server.
Figure 4: The training error curve on Flowers102 using the HtFE$_8$ model group in the default practical setting.
Figure 5: The t-SNE visualization of prototypes on the server on FMNIST in the practical setting using the HtCNN$_8$ model group. Different colors represent different classes. Circles represent the client prototypes and triangles represent the global prototypes. Triangles with dotted borders represent our TGP. Best viewed in color.
...and 6 more figures
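
The prototype margin defined in the Figure 1 caption can be computed directly from a set of prototypes. The sketch below follows that definition (minimum Euclidean distance from a class prototype to the prototypes of all other classes, and the per-class maximum of that margin over clients). The function names and the dictionary layout are hypothetical, and the normalization mentioned in the caption is omitted because the caption does not specify it.

```python
import math


def prototype_margins(protos):
    """Return {class_id: margin} for one set of prototypes.

    protos: {class_id: list of floats}. The margin of a class is the minimum
    Euclidean distance from its prototype to any other class's prototype.
    Classes are skipped when no other class is present.
    """
    margins = {}
    for class_id, proto in protos.items():
        distances = [
            math.dist(proto, other)
            for other_id, other in protos.items()
            if other_id != class_id
        ]
        if distances:
            margins[class_id] = min(distances)
    return margins


def max_client_margins(client_protos):
    """Per class, the maximum prototype margin observed among all clients."""
    best = {}
    for protos in client_protos:
        for class_id, margin in prototype_margins(protos).items():
            best[class_id] = max(best.get(class_id, margin), margin)
    return best


if __name__ == "__main__":
    clients = [
        {0: [0.0, 0.0], 1: [3.0, 4.0]},
        {0: [0.0, 0.0], 1: [0.0, 1.0]},
    ]
    print(max_client_margins(clients))
```

Comparing `prototype_margins` of the global prototypes against `max_client_margins` of the client prototypes gives the kind of per-class comparison that Figure 1 illustrates.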
