Federated Learning under Partially Class-Disjoint Data via Manifold Reshaping

Ziqing Fan, Jiangchao Yao, Ruipeng Zhang, Lingjuan Lyu, Ya Zhang, Yanfeng Wang

TL;DR

This work tackles federated learning under partially class-disjoint data (PCDD), where each client holds only a subset of classes, a practical but understudied setting. It introduces FedMR, a manifold-reshaping framework that combines an intra-class loss to decorrelate feature dimensions and an inter-class loss leveraging global class prototypes to enforce margins, thereby preventing dimensional collapse and space invasion during local training. Theoretical and empirical analyses show that the joint losses align local representations with a globally consistent feature space, delivering significant accuracy gains and improved communication efficiency across multiple benchmarks and a real-world medical dataset, with tractable privacy and local-burden considerations. FedMR’s modular design and its light variants offer practical applicability for robust, scalable FL in heterogeneous, partially observed data environments.

Abstract

Statistical heterogeneity severely limits the performance of federated learning (FL), motivating several methods, e.g., FedProx, MOON and FedDyn, to alleviate this problem. Despite their effectiveness, the scenario they consider generally requires samples from almost all classes during each client's local training, although some covariate shift may exist among clients. In fact, the natural case of partially class-disjoint data (PCDD), where each client contributes samples from only a few classes (instead of all classes), is practical yet underexplored. Specifically, the collapse and invasion characteristics unique to PCDD can bias the optimization direction in local training, which hampers the efficiency of federated learning. To address this dilemma, we propose a manifold-reshaping approach called FedMR to calibrate the feature space of local training. FedMR adds two interplaying losses to vanilla federated learning: an intra-class loss that decorrelates feature dimensions to counter collapse, and an inter-class loss that guarantees a proper margin among categories as the feature space expands. We conduct extensive experiments on a range of datasets to demonstrate that FedMR achieves much higher accuracy and better communication efficiency. Source code is available at: https://github.com/MediaBrain-SJTU/FedMR.git.
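
The two losses named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the sketch assumes that the intra-class term penalizes the squared off-diagonal entries of each class's feature correlation matrix, and it swaps in a simple hinge-style margin against class prototypes for the inter-class term. The function names, the `margin` parameter, and the normalization constants are all hypothetical.

```python
import numpy as np


def intra_class_decorrelation(features, labels, eps=1e-8):
    """Mean squared off-diagonal correlation, averaged over the classes present.

    Assumed form of a decorrelation penalty; the paper's exact loss may differ.
    """
    losses = []
    for c in np.unique(labels):
        z = features[labels == c]
        if len(z) < 2:
            continue  # a correlation matrix needs at least two samples
        # Standardize every feature dimension within the class.
        z = (z - z.mean(axis=0)) / (z.std(axis=0) + eps)
        corr = z.T @ z / len(z)
        off_diag = corr - np.diag(np.diag(corr))
        d = corr.shape[0]
        losses.append((off_diag ** 2).sum() / (d * (d - 1)))
    return float(np.mean(losses)) if losses else 0.0


def inter_class_margin(features, labels, prototypes, margin=1.0):
    """Hinge penalty (a stand-in, not the paper's loss): each feature should be
    closer to its own class prototype than to any other prototype by `margin`."""
    # dists[i, c] is the Euclidean distance from sample i to prototype c.
    dists = np.linalg.norm(features[:, None, :] - prototypes[None, :, :], axis=2)
    rows = np.arange(len(labels))
    own = dists[rows, labels]
    violation = np.maximum(0.0, margin + own[:, None] - dists)
    violation[rows, labels] = 0.0  # a class does not compete with itself
    return float(violation.mean())
```

In this sketch a client would add both terms, each scaled by a tunable weight, to its usual classification loss. The prototypes may include classes the client never observes, which is what lets the margin term reserve space for missing classes.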

Paper Structure

This paper contains 44 sections, 4 theorems, 35 equations, 7 figures, 15 tables, 1 algorithm.

Key Result

Lemma 1

Assume a covariance matrix $M\in\mathbf{R}^{d\times d}$ computed from the standard-normalized features of the samples, with eigenvalues $\{\lambda_1, \lambda_2,...,\lambda_d\}$. Then the following equality holds

Figures (7)

  • Figure 1: Federated learning under partially class-disjoint data (PCDD).
  • Figure 2: An illustration of the shift in the optimization direction under PCDD. Here, we assume the client holds two classes $c_1$ and $c_2$, with one class $c_3$ missing. ${\bf w}^*$ is the optimal classifier direction for $c_2$ (perpendicular to the plane $\alpha$) when all classes are present, and ${\bf w}_1^*$ is the classifier direction learned when $c_3$ is missing, which can be inferred from the decision plane $\beta$ between $c_1$ and $c_2$. As can be seen, PCDD leads to an angle shift $\theta$ in the optimization.
  • Figure 3: The framework of FedMR. On the client side, besides the vanilla training with the classification loss, the manifold-reshaping parts, i.e., the intra-class loss and the inter-class loss, respectively decorrelate the features to avoid dimensional collapse and use the global prototypes to construct a proper margin among classes to prevent space invasion. On the server side, besides model aggregation, the global class prototypes are maintained and serve as the reference for missing classes in local training.
  • Figure 4: Average memory consumption, local-training computation time, and performance across all datasets for all baselines, FedMR, and its light versions (Lite 10 and Lite 50), which are used for acceleration.
  • Figure 5: Heat map of the data distribution on ISIC2019 dataset.
  • ...and 2 more figures
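
The Figure 3 caption says the server keeps global class prototypes that stand in for the classes a client is missing. The sketch below shows one way such prototypes could be formed. It assumes each client reports per-class feature sums and sample counts, and that the server takes the count-weighted mean. The helper names are hypothetical, and the actual FedMR protocol, including its privacy handling, may differ.

```python
import numpy as np


def local_class_stats(features, labels, num_classes):
    """Per-class feature sums and sample counts on one client (hypothetical helper)."""
    sums = np.zeros((num_classes, features.shape[1]))
    counts = np.zeros(num_classes, dtype=int)
    for c in range(num_classes):
        mask = labels == c
        counts[c] = mask.sum()
        if counts[c]:
            sums[c] = features[mask].sum(axis=0)
    return sums, counts


def aggregate_prototypes(client_stats):
    """Count-weighted mean of class features over all clients.

    Classes that no client observed are left as NaN.
    """
    total_sums = sum(s for s, _ in client_stats)
    total_counts = sum(n for _, n in client_stats)
    prototypes = np.full(total_sums.shape, np.nan)
    seen = total_counts > 0
    prototypes[seen] = total_sums[seen] / total_counts[seen, None]
    return prototypes
```

Under PCDD each client fills in only the rows for the classes it holds, so a global prototype for a given class is determined entirely by the clients that actually observe that class.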

Theorems & Definitions (6)

  • Lemma 1
  • Theorem 1
  • Theorem 2
  • proof
  • Lemma 2
  • proof