Neural Collapse-Inspired Multi-Label Federated Learning under Label-Distribution Skew
Can Peng, Yuyuan Liu, Yingyu Yang, Pramit Saha, Qianye Yang, J. Alison Noble
TL;DR
This work tackles multi-label federated learning under label skew by introducing FedNCA-ML, which enforces Neural Collapse–inspired geometry to align feature distributions across non-IID clients. It combines a Label-Aware Disentanglement Module that extracts per-class features with a fixed Equiangular Tight Frame (ETF) classifier that enforces a global simplex ETF structure, along with two regularizers that suppress noisy negatives and promote compact intra-class clustering. Across four datasets and eight non-IID settings, the approach improves class-wise performance, with gains of up to $3.92\%$ in macro-AUC and $4.93\%$ in macro-F1, indicating improved robustness and generalization in challenging multi-label FL scenarios. The framework offers a principled way to preserve global label geometry in heterogeneous FL, with practical relevance to medical imaging and other multi-label domains where label-distribution skew is prevalent.
Abstract
Federated Learning (FL) enables collaborative model training across distributed clients while preserving data privacy, yet it remains challenging because data distributions can be highly heterogeneous. These challenges are further amplified in multi-label scenarios, where data exhibit characteristics such as label co-occurrence, inter-label dependency, and discrepancies between local and global label relationships. While most existing FL studies focus on single-label classification, real-world applications, such as medical imaging, often involve multi-label data with highly skewed label distributions across clients. To address this important yet underexplored problem, we propose FedNCA-ML, a novel FL framework that aligns feature distributions across clients and learns discriminative, well-clustered representations inspired by Neural Collapse (NC) theory. NC describes an ideal latent-space geometry in which each class's features collapse to their class mean, and the class means form a maximally separated simplex. To extend this theory to multi-label settings, we introduce a feature disentanglement module that extracts class-specific representations. The clustering of these disentangled features is guided by a shared NC-inspired structure, mitigating conflicts among client models caused by heterogeneous local data. Furthermore, we design regularisation losses to encourage compact and consistent feature clustering in the latent space. Experiments on four benchmark datasets under eight FL settings demonstrate the effectiveness of the proposed method, achieving improvements of up to 3.92% in class-wise AUC and 4.93% in class-wise F1 score.
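To make the "maximally separated simplex" geometry concrete, the following is a minimal sketch of the standard simplex ETF construction from the Neural Collapse literature, not the authors' implementation. It assumes the feature dimension equals the number of classes and uses no rotation; the abstract does not specify the dimension, scaling, or rotation used in FedNCA-ML, and the names `simplex_etf` and `cosine` are hypothetical helpers introduced only for illustration.

```python
import math


def simplex_etf(num_classes):
    """Return `num_classes` unit-norm prototypes (as rows) forming a simplex ETF.

    Assumption: feature dimension == num_classes and no rotation is applied.
    Row i is sqrt(K / (K - 1)) * (e_i - (1 / K) * ones), the textbook form.
    """
    if num_classes < 2:
        raise ValueError("a simplex ETF needs at least two classes")
    k = num_classes
    scale = math.sqrt(k / (k - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        for i in range(k)
    ]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical example with 4 classes: every pair of prototypes has the same
# cosine similarity, -1 / (K - 1), which is the most negative value that all
# pairs of K unit vectors can share.
num_classes = 4
prototypes = simplex_etf(num_classes)
pairwise = [
    cosine(prototypes[i], prototypes[j])
    for i in range(num_classes)
    for j in range(i + 1, num_classes)
]
norms = [math.sqrt(sum(x * x for x in p)) for p in prototypes]
```

Because such prototypes depend only on the number of classes, every client can hold an identical, non-trainable copy, which is one plausible reading of how a fixed classifier gives all clients a shared target geometry regardless of their local label distribution.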
