FedMT: Federated Learning with Mixed-type Labels

Qiong Zhang, Jing Peng, Xin Zhang, Aline Talhouk, Gang Niu, Xiaoxiao Li

TL;DR

This work tackles federated learning under mixed-type labels, where centers use different labeling criteria and label spaces do not align. It introduces FedMT, a model-agnostic framework that learns a label-space correspondence ${\mathbb{M}}$ to project client outputs into the server's space, enabling end-to-end FL training without data pooling. When ${\mathbb{M}}$ is unknown, FedMT estimates it from high-confidence pseudo-labels, and enhances robustness with strong data augmentation; the authors also derive generalization bounds for the cross-space setting. Empirically, FedMT yields substantial accuracy improvements over baselines on CIFAR100 and a dermatology dataset, validating its practicality for cross-center collaboration in domains like medicine while preserving privacy and maintaining communication efficiency.

Abstract

In federated learning (FL), classifiers (e.g., deep networks) are trained on datasets from multiple data centers without exchanging data across them, which improves the sample efficiency. However, the conventional FL setting assumes the same labeling criterion in all data centers involved, thus limiting its practical utility. This limitation becomes particularly notable in domains like disease diagnosis, where different clinical centers may adhere to different standards, making traditional FL methods unsuitable. This paper addresses this important yet under-explored setting of FL, namely FL with mixed-type labels, where the allowance of different labeling criteria introduces inter-center label space differences. To address this challenge effectively and efficiently, we introduce a model-agnostic approach called FedMT, which estimates label space correspondences and projects classification scores to construct loss functions. The proposed FedMT is versatile and integrates seamlessly with various FL methods, such as FedAvg. Experimental results on benchmark and medical datasets highlight the substantial improvement in classification accuracy achieved by FedMT in the presence of mixed-type labels.
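
The score projection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the correspondence is a $J \times K$ column-stochastic matrix whose entry $(j, k)$ is the probability that a sample of class $k$ in the desired label space carries label $j$ in the other label space. The function names, the toy matrix `M`, and the logits are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities over the K desired-space classes."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def project_scores(probs_k, M):
    """Map K-class probabilities into the J-class space: q_j = sum_k M[j][k] * p_k."""
    return [sum(m_jk * p_k for m_jk, p_k in zip(row, probs_k)) for row in M]


def projected_cross_entropy(logits, other_label, M, eps=1e-12):
    """Cross-entropy between the projected scores and a label from the J-class space."""
    projected = project_scores(softmax(logits), M)
    return -math.log(projected[other_label] + eps)


# Assumed toy correspondence: K = 4 desired classes, J = 2 classes in the other space.
# Desired classes 0 and 1 map to class 0; desired classes 2 and 3 map to class 1.
M = [
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]

logits = [2.0, 1.0, 0.0, -1.0]
coarse_probs = project_scores(softmax(logits), M)
loss_match = projected_cross_entropy(logits, 0, M)
loss_mismatch = projected_cross_entropy(logits, 1, M)
print(coarse_probs, loss_match, loss_mismatch)
```

Because each column of `M` sums to one in this sketch, the projected scores remain a probability vector, so a center whose labels live in the other space can still supply a standard cross-entropy signal to a model whose outputs live in the desired space.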

Paper Structure

This paper contains 27 sections, 4 theorems, 43 equations, 3 figures, 9 tables, 1 algorithm.

Key Result

Theorem 3.1

Let ${\mathcal{D}}_{s} = \{({\bm{x}}_i^s, y_i^s): i=1,\ldots, n\}$ and ${\mathcal{D}}_{c} = \{({\bm{x}}_i^c, \widetilde{y}_i^c): i=1,\ldots, m\}$ be two datasets, and ${\bm{g}}: {\mathcal{X}} \to \Delta_{K-1}$ be any classifier based on the combined dataset ${\mathcal{D}}_{s} \cup {\mathcal{D}}_{c}$. …

Figures (3)

  • Figure 1: Illustration of the problem setting and our proposed FedMT method. (a) We consider different label spaces (i.e., the desired label space ${\mathcal{Y}}$ with $K$ classes and the other space $\widetilde{{\mathcal{Y}}}$ with $J$ classes) whose classes may overlap, such as $\widetilde{Y}^1$ and $Y^2$. Annotation within the desired label space is typically more challenging and resource-intensive, resulting in a scarcity of labeled samples. (b) We employ a fixed label space correspondence matrix ${\mathbb{M}}$ to link $\widetilde{{\mathcal{Y}}}$ and ${\mathcal{Y}}$. Our method, denoted FedMT (T), locally corrects class scores $f$ using ${\mathbb{M}}$ within the FedAvg framework. When the correspondence matrix ${\mathbb{M}}$ is unknown, we propose a pseudo-label-based method to obtain an estimate $\widehat{{\mathbb{M}}}$. FedMT (E) then incorporates $\widehat{{\mathbb{M}}}$ into the loss function to correct class scores.
  • Figure 2: Empirical analysis of FedMT under various settings. Comparison with baselines is presented under different scenarios: (a) super-class prediction accuracy with varying numbers of sub-class samples, (b) in a non-iid setting, (c) with varying numbers of clients, and (d) with diverse backbone architectures.
  • Figure 3: Estimation error of ${\mathbb{M}}$. The Frobenius norm of the difference between $\widehat{{\mathbb{M}}}$ and the true ${\mathbb{M}}$, compared across two approaches (i.e., FedMT and learnable) and different client sample sizes ($N_c$).
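
The pseudo-label-based estimate in Figure 1 and the Frobenius-norm error in Figure 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure from the paper. It assumes that confident predictions are selected with a fixed probability threshold, that $\widehat{{\mathbb{M}}}$ is formed by column-normalizing co-occurrence counts, and that a column with no confident samples falls back to a uniform distribution. The names `estimate_M` and `frobenius_distance`, the threshold values, and the toy data are all hypothetical.

```python
import math


def estimate_M(probs, other_labels, J, threshold=0.9):
    """Estimate a J x K correspondence matrix from confident pseudo-labels.

    probs[i] holds the model's K-class probabilities for sample i, and
    other_labels[i] is that sample's observed label in the J-class space.
    """
    K = len(probs[0])
    counts = [[0.0] * K for _ in range(J)]
    for p, j in zip(probs, other_labels):
        k = max(range(K), key=lambda idx: p[idx])  # pseudo-label in the K-class space
        if p[k] >= threshold:  # keep only high-confidence predictions
            counts[j][k] += 1.0
    M_hat = [[0.0] * K for _ in range(J)]
    for k in range(K):
        column_total = sum(counts[j][k] for j in range(J))
        for j in range(J):
            # Assumed fallback: uniform column when no confident sample has pseudo-label k.
            M_hat[j][k] = counts[j][k] / column_total if column_total > 0 else 1.0 / J
    return M_hat


def frobenius_distance(A, B):
    """Frobenius norm of A - B for two matrices given as nested lists."""
    return math.sqrt(sum((a - b) ** 2 for ra, rb in zip(A, B) for a, b in zip(ra, rb)))


# Toy data: K = 3, J = 2; the last sample is a low-confidence prediction.
probs = [
    [0.95, 0.03, 0.02],
    [0.02, 0.96, 0.02],
    [0.01, 0.02, 0.97],
    [0.50, 0.30, 0.20],
]
other_labels = [0, 0, 1, 1]
M_true = [
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

error_strict = frobenius_distance(estimate_M(probs, other_labels, J=2, threshold=0.9), M_true)
error_loose = frobenius_distance(estimate_M(probs, other_labels, J=2, threshold=0.4), M_true)
print(error_strict, error_loose)
```

In this toy setup the looser threshold admits the uncertain fourth sample, which pulls the first column of the estimate away from the true matrix and raises the error. That is the intuition behind restricting the estimate to high-confidence pseudo-labels.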

Theorems & Definitions (9)

  • Theorem 3.1: Informal
  • Lemma B.1: Hoeffding's inequality
  • Lemma B.2
  • Proof: Proof of Lemma B.2
  • Lemma B.3: Estimation error of $\widehat{{\mathbb{S}}}$
  • Proof: Proof of Lemma B.3
  • Proof: Proof of \ref{eq:coarse_bound}
  • Proof: Proof of \ref{eq:fine_bound_knownM}
  • Proof: Proof of \ref{eq:fine_bound_estimatedM}