FedMT: Federated Learning with Mixed-type Labels
Qiong Zhang, Jing Peng, Xin Zhang, Aline Talhouk, Gang Niu, Xiaoxiao Li
TL;DR
This work tackles federated learning under mixed-type labels, where centers use different labeling criteria and their label spaces do not align. It introduces FedMT, a model-agnostic framework that learns a label-space correspondence ${\mathbb{M}}$ to project client outputs into the server's space, enabling end-to-end FL training without data pooling. When ${\mathbb{M}}$ is unknown, FedMT estimates it from high-confidence pseudo-labels and improves robustness with strong data augmentation. The authors also derive generalization bounds for the cross-space setting. Empirically, FedMT yields substantial accuracy improvements over baselines on CIFAR100 and a dermatology dataset, supporting its practicality for cross-center collaboration in domains such as medicine while preserving privacy and maintaining communication efficiency.
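One way to picture the pseudo-label step is the minimal sketch below. It is not the authors' estimator: the function name `estimate_correspondence`, the confidence threshold of 0.9, the co-occurrence counting rule, and the binary many-to-one form of the matrix are all assumptions made for illustration.

```python
import numpy as np

def estimate_correspondence(probs, other_labels, num_other, threshold=0.9):
    """Hypothetical estimator of a binary correspondence matrix.

    probs:        (n, K) predicted class probabilities in one label space
    other_labels: (n,) observed labels of the same samples in the other space
    num_other:    number of classes J in the other label space
    Returns an assumed (K, J) 0/1 matrix and the raw co-occurrence counts.
    """
    num_classes = probs.shape[1]
    pseudo = probs.argmax(axis=1)              # pseudo-label per sample
    keep = probs.max(axis=1) >= threshold      # keep only confident predictions

    # Count how often pseudo-label k co-occurs with observed label j.
    counts = np.zeros((num_classes, num_other))
    np.add.at(counts, (pseudo[keep], other_labels[keep]), 1)

    # Assumption: each class k maps to the single label j it co-occurs with most.
    M = np.zeros_like(counts)
    rows = np.flatnonzero(counts.sum(axis=1) > 0)
    M[rows, counts[rows].argmax(axis=1)] = 1.0
    return M, counts

# Toy data: 3 classes in one space, 2 in the other; the last sample is
# low-confidence and is therefore ignored.
probs = np.array([
    [0.95, 0.03, 0.02],
    [0.92, 0.05, 0.03],
    [0.04, 0.93, 0.03],
    [0.02, 0.02, 0.96],
    [0.40, 0.35, 0.25],
])
other_labels = np.array([0, 0, 0, 1, 1])
M, counts = estimate_correspondence(probs, other_labels, num_other=2)
print(M)
```

In this toy run, classes 0 and 1 are assigned to label 0 of the other space and class 2 to label 1. A class that never receives a confident pseudo-label keeps an all-zero row, which a real pipeline would need to handle separately.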
Abstract
In federated learning (FL), classifiers (e.g., deep networks) are trained on datasets from multiple data centers without exchanging data across them, which improves sample efficiency. However, the conventional FL setting assumes that all participating data centers use the same labeling criterion, which limits its practical utility. This limitation is particularly notable in domains such as disease diagnosis, where different clinical centers may adhere to different standards, making traditional FL methods unsuitable. This paper addresses this important yet under-explored setting, namely FL with mixed-type labels, in which differing labeling criteria give rise to label space differences across centers. To address this challenge effectively and efficiently, we introduce a model-agnostic approach called FedMT, which estimates label space correspondences and projects classification scores to construct loss functions. The proposed FedMT is versatile and integrates seamlessly with various FL methods, such as FedAvg. Experimental results on benchmark and medical datasets show that FedMT substantially improves classification accuracy in the presence of mixed-type labels.
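The idea of projecting classification scores to construct a loss can be illustrated with a small sketch. It assumes a known binary correspondence matrix `M` of shape (K, J) that sums class probabilities from a K-class space into a J-class space; the direction of the projection, the cross-entropy form, and the name `projected_cross_entropy` are illustrative assumptions rather than the paper's exact formulation.

```python
import numpy as np

def softmax(logits):
    # Numerically stable row-wise softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def projected_cross_entropy(logits, labels, M, eps=1e-12):
    """Cross-entropy after projecting K-class probabilities to J classes.

    logits: (n, K) raw scores in the model's label space
    labels: (n,) integer labels in the other (J-class) label space
    M:      (K, J) assumed 0/1 correspondence matrix, one 1 per row
    """
    probs = softmax(logits)        # (n, K)
    projected = probs @ M          # (n, J): mass of all classes mapped to each j
    picked = projected[np.arange(len(labels)), labels]
    return float(-np.log(picked + eps).mean())

# Toy example: classes 0 and 1 map to label 0; classes 2 and 3 map to label 1.
M = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
logits = np.array([[2.0, 1.0, 0.0, -1.0],
                   [0.0, 0.0, 3.0, 3.0]])
labels = np.array([0, 1])
loss = projected_cross_entropy(logits, labels, M)
print(loss)
```

Because each row of `M` contains a single 1, the projected scores remain a valid probability distribution, so the loss can be plugged into a standard local training step such as the one used by FedAvg without changing the model architecture.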
