Federated Learning: A new frontier in the exploration of multi-institutional medical imaging data

Dominika Ciupek, Maciej Malawski, Tomasz Pieciak

TL;DR

This review frames federated learning (FL) as a practical path to harness multi-institution medical imaging data while preserving patient privacy. It surveys FL theory, aggregation and learning algorithms, data privacy techniques, system architectures, and real-world clinical deployments, emphasizing challenges that range from data and model heterogeneity to malicious actors and communication constraints. The authors synthesize aggregation and learning methods specific to medical imaging, discuss open-source frameworks, and illustrate real-world deployments and their hurdles, offering issue–method–effect mappings and guidance for future development. The work highlights the central role of tools like FedAvg, personalized and split-learning strategies, and privacy-preserving techniques in advancing clinically relevant FL systems across diverse imaging modalities. Overall, FL is positioned as a transformative framework for secure, scalable, multi-site medical imaging AI with actionable directions for standardization, validation, and deployment in clinical workflows.

Abstract

Artificial intelligence has transformed medical imaging, leading to a genuine technological revolution in modern computer-assisted healthcare systems. However, the now-ubiquitous deep learning (DL) systems require access to a considerable amount of data to extract knowledge and generalize properly. Access to such extensive resources may be hindered by the time and effort required to establish ethical agreements, set up and carry out the acquisition procedures, and manage the datasets adequately, with a particular emphasis on proper anonymization. One of the pivotal challenges in the DL field is integrating data from various sources acquired with hardware from different vendors, diverse acquisition protocols, and varying experimental setups, and subject to inter-operator variability. In this paper, we review the federated learning (FL) concept, which fosters the integration of large-scale heterogeneous datasets from multiple institutions in training DL models. In contrast to a centralized approach, the decentralized FL procedure promotes training DL models while preserving data privacy at each institution involved. We formulate the FL principle and comprehensively review general and medical-imaging-specific aggregation and learning algorithms that enable the generation of a globally generalized model. We examine in detail the challenges in constructing FL-based systems, such as data and model heterogeneity across institutions, resilience to potential attacks on data privacy, and the variability in computational and communication resources among the participating sites, which might reduce the efficiency of the entire system. Finally, we explore up-to-date open frameworks for rapid prototyping of FL-based algorithms, comprehensively present real-world implementations of FL systems, and shed light on future directions in this rapidly growing field.

Paper Structure

This paper contains 46 sections, 38 equations, 14 figures, 9 tables.

Figures (14)

  • Figure 1: Comparison between centralized and federated learning approaches: A. In a centralized architecture, the institutions (here, 1, 2, 3) transfer their local datasets to the central server. Other centers (Institution 4) extract datasets from the global server or use its computing infrastructure to train the DL models. B. Each institution's data remains locally preserved in a federated architecture while the parameters of locally trained models $\mathbf{\Theta}_c$ are transferred to the central server. The central server aggregates received parameters and sends back the parameters of a global model $\mathbf{\Theta}$ to each center.
  • Figure 2: The number of screened research articles on applications of federated learning in medical imaging per year. The results were obtained from the MEDLINE and Google Scholar databases, following the procedure presented in the search strategy section. Data is up to date as of November 2025.
  • Figure 3: The flow chart presents the screening procedure used for collecting the research papers.
  • Figure 4: A. Distribution of imaging modalities and B. deep learning tasks in the examined federated learning research papers in this review ($n=154$).
  • Figure 5: A,B Distribution of aggregation strategies and C,D learning methods, categorized by imaging modality (left diagrams) and deep learning task (right diagrams) across $n=154$ FL-based research articles considered in this study.
  • ...and 9 more figures
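
The aggregation step described in the Figure 1 caption (local parameters $\mathbf{\Theta}_c$ sent to the central server, which returns a global $\mathbf{\Theta}$) can be illustrated with a minimal FedAvg-style sketch. This is not the authors' implementation: the function name `fedavg`, the dictionary-of-lists parameter layout, and the toy values are assumptions made for illustration, and the weighting by local sample count follows the standard FedAvg formulation.

```python
from typing import Dict, List

# Hypothetical parameter layout: layer name -> flat list of weights.
Params = Dict[str, List[float]]


def fedavg(local_params: List[Params], num_samples: List[int]) -> Params:
    """Aggregate per-institution parameters into a global model.

    Each institution c contributes its parameters Theta_c, weighted by
    n_c / sum(n), where n_c is its local sample count.
    """
    if not local_params or len(local_params) != len(num_samples):
        raise ValueError("need one sample count per institution")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    global_params: Params = {}
    for name in local_params[0]:
        averaged = [0.0] * len(local_params[0][name])
        for params, n in zip(local_params, num_samples):
            weight = n / total
            for i, value in enumerate(params[name]):
                averaged[i] += weight * value
        global_params[name] = averaged
    return global_params


# Toy example: three institutions, as in Figure 1 (values are made up).
theta_1 = {"w": [1.0, 2.0], "b": [0.0]}
theta_2 = {"w": [3.0, 4.0], "b": [1.0]}
theta_3 = {"w": [5.0, 6.0], "b": [2.0]}

# Only parameters and sample counts reach the server; raw data stays local.
theta = fedavg([theta_1, theta_2, theta_3], num_samples=[100, 100, 200])
print(theta)
```

In a full FL round, the server would send `theta` back to each institution, which would continue local training from it before the next aggregation.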