Federated Learning for Medical Image Classification: A Comprehensive Benchmark

Zhekai Zhou; Guibo Luo; Mingzhi Chen; Zhenyu Weng; Yuesheng Zhu

TL;DR

This work benchmarks a broad set of federated learning algorithms on medical image classification using real multi-center datasets, revealing that no single method consistently wins across all scenarios. It introduces a concise yet effective augmentation strategy that combines conditional denoising diffusion probabilistic models with label smoothing to augment local data while preserving privacy, yielding performance close to centralized training on many tasks. The study demonstrates that diffusion-based augmentation reduces feature shift among clients, accelerates convergence, and improves final accuracy, suggesting practical paths for deploying FL in clinical imaging. The authors provide actionable guidelines for method selection based on data volume, computation, and communication constraints, and release code to support future research and benchmarking in medical FL.

Abstract

The federated learning paradigm is well suited to medical image analysis, as it enables machine learning on isolated multicenter data while protecting the privacy of participating parties. However, current research on optimization algorithms in federated learning often focuses on limited datasets and scenarios, primarily centered around natural images, with insufficient comparative experiments in medical contexts. In this work, we conduct a comprehensive evaluation of several state-of-the-art federated learning algorithms in the context of medical imaging, fairly comparing classification models trained with these algorithms across multiple medical imaging datasets. Additionally, we evaluate system performance metrics, such as communication cost and computational efficiency, while considering different federated learning architectures. Our findings show that medical imaging datasets pose substantial challenges for current federated learning optimization algorithms. No single algorithm consistently delivers optimal performance across all medical federated learning scenarios, and many optimization algorithms may underperform when applied to these datasets. Our experiments provide a benchmark and guidance for future research and application of federated learning in medical imaging contexts. Furthermore, we propose an efficient and robust method that combines generative techniques using denoising diffusion probabilistic models with label smoothing to augment datasets, broadly improving the performance of federated learning on classification tasks across various medical imaging datasets. Our code will be released on GitHub, offering a reliable and comprehensive benchmark for future federated learning studies in medical imaging.
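As a rough illustration of two ingredients the abstract refers to, the sketch below shows standard label smoothing and a sample-size-weighted parameter average in the style of FedAvg. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not name the specific algorithms benchmarked, so FedAvg appears here only as the familiar baseline, and the function names `smooth_labels` and `fedavg`, the default `epsilon = 0.1`, and the dictionary-of-lists parameter format are all hypothetical choices made for illustration. The diffusion-based generation step is not shown.

```python
from typing import Dict, List, Sequence


def smooth_labels(label: int, num_classes: int, epsilon: float = 0.1) -> List[float]:
    """Standard label smoothing: (1 - epsilon) * one_hot + epsilon / num_classes.

    The value of epsilon is an assumption; the abstract does not state one.
    """
    if not 0 <= label < num_classes:
        raise ValueError("label must be in [0, num_classes)")
    # Every class receives the same small share of probability mass.
    target = [epsilon / num_classes] * num_classes
    # The true class additionally receives the remaining (1 - epsilon).
    target[label] += 1.0 - epsilon
    return target


def fedavg(
    client_params: Sequence[Dict[str, List[float]]],
    client_sizes: Sequence[int],
) -> Dict[str, List[float]]:
    """Average client parameters, weighting each client by its sample count.

    Parameters are modeled as flat lists of floats keyed by name (a
    hypothetical format chosen to keep the sketch dependency-free).
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Dict[str, List[float]] = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        averaged[name] = [
            sum(params[name][i] * size
                for params, size in zip(client_params, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


if __name__ == "__main__":
    # Two hypothetical clients holding 1 and 3 local samples respectively.
    clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
    print(fedavg(clients, [1, 3]))
    print(smooth_labels(label=2, num_classes=4))
```

In a setup like the one the abstract describes, smoothed targets of this kind would presumably be applied when training on generated images, whose labels are less certain than those of real scans; that reading is an interpretation, not a detail stated in the abstract.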
