Federated Learning with Partially Labeled Data: A Conditional Distillation Approach

Pochuan Wang, Chen Shen, Masahiro Oda, Chiou-Shann Fuh, Kensaku Mori, Weichung Wang, Holger R. Roth

TL;DR

This work tackles the problem of learning generalizable multi-organ and lesion segmentation models under privacy constraints and partial labeling across institutions. It introduces ConDistFL, a framework that couples a supervised loss on locally labeled data with a label-aware conditional distillation loss, so that knowledge transfers from the global model to clients with missing annotations without requiring extra teacher networks. Key contributions are the foreground/background grouping strategies, conditional probability weighting, and foreground filtering within the ConDist loss, all of which apply to both 3D CT and 2D chest X-ray modalities. Comprehensive ablations and out-of-federation evaluation indicate robust cross-domain generalization. The approach achieves state-of-the-art federated performance on in-federation and external datasets while keeping computation and communication costs practical, which makes it suitable for real-world, privacy-sensitive multi-center deployments.
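To make the idea of a label-aware conditional distillation loss more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names (`conditional_group_probs`, `soft_dice_loss`, `condist_loss`, `total_loss`), the soft-Dice form of the mismatch, and the class indices are all assumptions made for illustration, and foreground filtering is omitted. The sketch renormalizes the predicted probabilities over the classes a client has not annotated, grouped as the caller specifies, and compares the global model's (teacher's) conditional probabilities with the local model's (student's).

```python
def conditional_group_probs(probs, labeled, groups, eps=1e-8):
    """Probability of each unlabeled group, given the voxel is not a labeled class.

    probs   : per-class probabilities for one voxel (assumed to sum to 1)
    labeled : set of class indices annotated at this client (local foreground)
    groups  : lists of class indices that partition the remaining classes
    """
    unlabeled_mass = sum(p for c, p in enumerate(probs) if c not in labeled)
    return [sum(probs[c] for c in g) / (unlabeled_mass + eps) for g in groups]


def soft_dice_loss(teacher, student, eps=1e-8):
    """1 - soft Dice between two equal-length sequences of probabilities."""
    inter = sum(t * s for t, s in zip(teacher, student))
    denom = sum(t * t for t in teacher) + sum(s * s for s in student)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)


def condist_loss(teacher_voxels, student_voxels, labeled, groups):
    """Mean soft-Dice mismatch, over groups, of teacher vs. student conditionals."""
    t = [conditional_group_probs(v, labeled, groups) for v in teacher_voxels]
    s = [conditional_group_probs(v, labeled, groups) for v in student_voxels]
    per_group = [
        soft_dice_loss([row[g] for row in t], [row[g] for row in s])
        for g in range(len(groups))
    ]
    return sum(per_group) / len(per_group)


def total_loss(supervised, distillation, weight=1.0):
    """Supervised loss on labeled classes plus weighted distillation term.

    The weight is a hypothetical hyperparameter, not a value from the paper.
    """
    return supervised + weight * distillation


# Hypothetical class layout: 0 background, 1 kidney, 2 kidney tumor,
# 3 liver, 4 spleen. This client annotates only classes 1 and 2.
labeled = {1, 2}
groups = [[0], [3], [4]]
teacher = [[0.1, 0.5, 0.1, 0.2, 0.1], [0.7, 0.1, 0.0, 0.1, 0.1]]
student = [[0.3, 0.5, 0.1, 0.05, 0.05], [0.6, 0.1, 0.0, 0.2, 0.1]]

print(condist_loss(teacher, teacher, labeled, groups))  # identical predictions
print(condist_loss(teacher, student, labeled, groups))  # mismatched predictions
```

Because the comparison is made on probabilities conditioned on "not a locally labeled class", the distillation term in this sketch does not compete with the supervised term on the classes the client has annotated.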

Abstract

In medical imaging, developing generalized segmentation models that can handle multiple organs and lesions is crucial. However, the scarcity of fully annotated datasets and strict privacy regulations present significant barriers to data sharing. Federated Learning (FL) allows decentralized model training, but existing FL methods often struggle with partial labeling, leading to model divergence and catastrophic forgetting. We propose ConDistFL, a novel FL framework incorporating conditional distillation to address these challenges. ConDistFL enables effective learning from partially labeled datasets, significantly improving segmentation accuracy across distributed and non-uniform datasets. In addition to its superior segmentation performance, ConDistFL maintains computational and communication efficiency, ensuring its scalability for real-world applications. Furthermore, ConDistFL demonstrates remarkable generalizability, significantly outperforming existing FL methods in out-of-federation tests, even adapting to unseen contrast phases (e.g., non-contrast CT images) in our experiments. Extensive evaluations on 3D CT and 2D chest X-ray datasets show that ConDistFL is an efficient, adaptable solution for collaborative medical image segmentation in privacy-constrained settings.

Paper Structure

This paper contains 28 sections, 7 equations, 7 figures, 3 tables.

Figures (7)

  • Figure 1: An illustration of the federated learning setup for multi-organ and tumor segmentation using inconsistently labeled datasets. Each client contributes annotations for only a subset of target organs or lesions, reflecting realistic data availability in multi-institutional settings.
  • Figure 2: Illustration of the ConDistFL client training process on a partially labeled dataset, where only the kidney and kidney tumor are annotated. The model is trained to leverage both local and global knowledge for effective segmentation of labeled and unlabeled organs.
  • Figure 3: Out-of-federation Dice coefficients (AMOS22 dataset, n=300 unseen CT scans). ConDistFL is highlighted in blue and achieves the highest average Dice as well as the best class-specific scores for kidney, liver, pancreas, and spleen.
  • Figure 4: Boxplot of test Dice score distributions for global models trained using different federated learning methods on the in-federation test set from the 3D abdominal CT experiments. The Dice scores represent the segmentation performance across multiple clients, with higher scores indicating better agreement between predicted and ground-truth segmentations.
  • Figure 5: Boxplot of test Dice score distributions for global models trained using different federated learning methods on the in-federation test set from the 2D chest X-ray experiments. For the pneumothorax segmentation, the boxplot excludes non-pneumothorax images, displaying only the Dice scores for pneumothorax cases.
  • ...and 2 more figures
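
The figures listed above report Dice coefficients. As a reminder of what that number measures, the sketch below computes the Dice overlap 2|A∩B| / (|A| + |B|) for two binary masks given as flat 0/1 sequences. The function name and the handling of two empty masks are assumptions for illustration; the paper's own evaluation code may differ.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Both masks are empty, so Dice is 0/0. Returning 1.0 is one common
        # convention and is an assumption here.
        return 1.0
    return 2.0 * intersection / total


pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 0, 1, 1]
print(dice_coefficient(pred, truth))  # 2 * 2 / (3 + 3)
```

The empty-mask case matters in practice: the Figure 5 caption notes that the pneumothorax boxplot excludes non-pneumothorax images, which sidesteps that undefined case.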