Federated Learning with Partially Labeled Data: A Conditional Distillation Approach
Pochuan Wang, Chen Shen, Masahiro Oda, Chiou-Shann Fuh, Kensaku Mori, Weichung Wang, Holger R. Roth
TL;DR
This work tackles the problem of learning generalizable multi-organ and lesion segmentation models when institutions cannot share data for privacy reasons and each institution annotates only a subset of the target classes. It introduces ConDistFL, a framework that couples a supervised loss on locally labeled data with a label-aware conditional distillation (ConDist) loss, which transfers knowledge from the global model to clients with missing annotations without requiring extra teacher networks. Key contributions are the foreground/background grouping strategies, conditional probability weighting, and foreground filtering within the ConDist loss, all applicable to both 3D CT and 2D chest X-ray data, together with ablations and out-of-federation evaluation that demonstrate cross-domain generalization. The approach reports stronger segmentation performance than existing federated learning methods on in-federation and external datasets while keeping computation and communication costs practical, making it suitable for privacy-sensitive multi-center deployments.
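To make the idea of pairing a supervised loss with a conditional distillation term more concrete, the following is a minimal per-pixel sketch in plain Python. It is not the authors' implementation. All function names (`conditional_probs`, `supervised_term`, `distillation_term`, `combined_loss`) are hypothetical, KL divergence is used here as a stand-in distillation measure because the summary does not name one, and "foreground filtering" is interpreted as skipping pixels that carry a local foreground label, which is an assumption.

```python
import math

EPS = 1e-8  # numerical guard for logs and divisions


def softmax(logits):
    """Numerically stable softmax over one pixel's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def conditional_probs(probs, labeled):
    """Renormalize over classes the client does NOT label:
    p(c | pixel is not a locally labeled class)."""
    rest = [p for c, p in enumerate(probs) if c not in labeled]
    total = sum(rest) + EPS
    return [p / total for p in rest]


def supervised_term(student, label, labeled):
    """Cross-entropy in which all locally unlabeled classes are merged
    into a single 'other' group (assumed grouping strategy)."""
    if label in labeled:
        p = student[label]
    else:
        p = sum(q for c, q in enumerate(student) if c not in labeled)
    return -math.log(p + EPS)


def distillation_term(student, teacher, labeled):
    """KL(teacher || student) between conditional distributions.
    KL is an assumed stand-in for the paper's distillation measure."""
    s = conditional_probs(student, labeled)
    t = conditional_probs(teacher, labeled)
    return sum(tc * (math.log(tc + EPS) - math.log(sc + EPS))
               for sc, tc in zip(s, t))


def combined_loss(student_logits, teacher_logits, labels, labeled, weight=1.0):
    """Supervised loss on every pixel plus weighted distillation on pixels
    without a local foreground label (assumed foreground filtering)."""
    sup, dist, kept = 0.0, 0.0, 0
    for s_log, t_log, y in zip(student_logits, teacher_logits, labels):
        s, t = softmax(s_log), softmax(t_log)
        sup += supervised_term(s, y, labeled)
        if y not in labeled:
            dist += distillation_term(s, t, labeled)
            kept += 1
    sup /= len(labels)
    dist = dist / kept if kept else 0.0
    return sup + weight * dist
```

In this sketch the global model plays the teacher role and the locally trained model is the student, so a client that labels only class 1 still receives a training signal for the remaining classes through the distillation term.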
Abstract
In medical imaging, developing generalized segmentation models that can handle multiple organs and lesions is crucial. However, the scarcity of fully annotated datasets and strict privacy regulations present significant barriers to data sharing. Federated Learning (FL) allows decentralized model training, but existing FL methods often struggle with partial labeling, leading to model divergence and catastrophic forgetting. We propose ConDistFL, a novel FL framework incorporating conditional distillation to address these challenges. ConDistFL enables effective learning from partially labeled datasets, significantly improving segmentation accuracy across distributed and non-uniform datasets. In addition to its superior segmentation performance, ConDistFL maintains computational and communication efficiency, ensuring its scalability for real-world applications. Furthermore, ConDistFL demonstrates remarkable generalizability, significantly outperforming existing FL methods in out-of-federation tests, even adapting to unseen contrast phases (e.g., non-contrast CT images) in our experiments. Extensive evaluations on 3D CT and 2D chest X-ray datasets show that ConDistFL is an efficient, adaptable solution for collaborative medical image segmentation in privacy-constrained settings.
