Validation of Various Normalization Methods for Brain Tumor Segmentation: Can Federated Learning Overcome This Heterogeneity?

Jan Fiszer, Dominika Ciupek, Maciej Malawski

TL;DR

This study addresses privacy and data-heterogeneity challenges in brain tumor segmentation by simulating non-IID conditions with six MRI intensity variants (five normalization methods plus raw, unnormalized data). It compares centralized training against two federated learning (FL) variants, FedAvg and FedBN, and finds that FL can achieve near-parity with a centralized model, reaching a 3D Dice score of 92% on held-out data. Nyul normalization emerges as particularly problematic, while Z-score normalization provides broad cross-dataset compatibility. The results support FL as a practical, privacy-preserving approach for multi-site medical imaging tasks. The work also offers guidelines on preprocessing choices and FL deployment for collaborative brain tumor segmentation with minimal data sharing.
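
To make the difference between the two FL variants concrete, the following is a minimal sketch rather than the authors' implementation. It assumes model parameters are stored as flat lists keyed by name, and that batch-normalization parameters can be recognized by the substring "bn" in their name; both are simplifications introduced here for illustration. FedAvg averages every parameter across clients, weighted by local sample count, whereas FedBN keeps each client's batch-normalization parameters local.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of values.
State = Dict[str, List[float]]


def fedavg(client_states: List[State], client_sizes: List[int]) -> State:
    """Average every parameter across clients, weighted by sample count."""
    total = sum(client_sizes)
    averaged: State = {}
    for name in client_states[0]:
        length = len(client_states[0][name])
        averaged[name] = [
            sum(state[name][i] * n for state, n in zip(client_states, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


def fedbn_update(local_state: State, global_state: State, bn_marker: str = "bn") -> State:
    """Adopt the global values, except batch-norm entries, which stay local.

    Matching batch-norm parameters by the substring `bn_marker` is an
    assumption of this sketch, not a convention taken from the paper.
    """
    return {
        name: list(local_state[name]) if bn_marker in name else list(global_state[name])
        for name in local_state
    }


# Two toy clients holding 1 and 3 training samples respectively.
clients = [
    {"conv.weight": [1.0, 2.0], "bn.weight": [0.5]},
    {"conv.weight": [3.0, 4.0], "bn.weight": [1.5]},
]
global_state = fedavg(clients, [1, 3])
client0_next = fedbn_update(clients[0], global_state)
```

In this toy run, client 0 receives the averaged convolution weights but retains its own batch-normalization weight, which is the mechanism that lets FedBN absorb client-specific intensity statistics.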

Abstract

Deep learning (DL) has been increasingly applied in medical imaging; however, it requires large amounts of data, which raises many challenges related to data privacy, storage, and transfer. Federated learning (FL) is a training paradigm that overcomes these issues, though its effectiveness may be reduced when dealing with non-independent and identically distributed (non-IID) data. This study simulates non-IID conditions by applying different MRI intensity normalization techniques to separate data subsets, reflecting a common cause of heterogeneity. These subsets are then used for training and testing models for brain tumor segmentation. The findings provide insights into the influence of the MRI intensity normalization methods on segmentation models, during both training and inference. Notably, the FL methods demonstrated resilience to inconsistently normalized data across clients, achieving a 3D Dice score of 92%, which is comparable to that of a centralized model (trained using all data). These results indicate that FL can effectively train high-performing models without violating data privacy, a crucial concern in medical applications. The code is available at: https://github.com/SanoScience/fl-varying-normalization.
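
As an illustration of what one intensity normalization step can look like, here is a minimal sketch of Z-score normalization. It is not taken from the authors' repository. It assumes the volume has been flattened into a list of intensities, that background voxels are exactly 0, and that background is left unchanged after normalization; all three are assumptions made for this example.

```python
from statistics import mean, pstdev
from typing import List


def zscore_normalize(voxels: List[float], background: float = 0.0) -> List[float]:
    """Z-score the intensities using statistics of non-background voxels only."""
    brain = [v for v in voxels if v != background]
    if not brain:
        raise ValueError("volume contains no non-background voxels")
    mu = mean(brain)
    sigma = pstdev(brain)  # population standard deviation of brain voxels
    if sigma == 0:
        raise ValueError("brain voxels are constant; cannot normalize")
    # Background voxels are passed through untouched (assumption of this sketch).
    return [(v - mu) / sigma if v != background else background for v in voxels]


# Toy flattened "volume": two background voxels and three brain voxels.
normalized = zscore_normalize([0.0, 2.0, 4.0, 6.0, 0.0])
brain_only = [v for v in normalized if v != 0.0 or False]
```

Applying a different normalization function to each client's subset is one way to reproduce the kind of heterogeneity described above, since the same anatomy then yields differently distributed inputs across clients.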

Paper Structure

This paper contains 12 sections, 2 equations, 4 figures.

Figures (4)

  • Figure 1: Overview of the pipeline, explaining the structure of the results table in Figure 3. Visualization of the main steps and their dependencies, including: a) the UCSF-PDGM-v3 dataset split, b) subset normalization, c) model training, d) test set normalization, and e) the resulting evaluation table.
  • Figure 2: Visualization of the histograms for all modalities for each of the normalization methods. Each line corresponds to the brain (without the background) voxels' intensity distribution. The 17 subjects from the common test set volumes were used for this visualization.
  • Figure 3: 3D Dice scores for all the models (rows) and all test sets (columns) with corresponding standard deviations. The star '*' next to the Centralized model indicates that this model violates data privacy. The Average row presents the average over the $GDS$ for single-dataset trained models, and the Average column shows the average Dice among all the test sets for the given model.
  • Figure 4: Predictions of each trained model for patient 0229, showing the normalized inputs and the models’ predictions, with true positives, false negatives, and false positives distinguished.
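
The true positive, false positive, and false negative counts shown in Figure 4 are the same quantities that determine the Dice score reported in Figure 3. The sketch below computes Dice as 2TP / (2TP + FP + FN) over a flattened binary volume. It is an illustrative implementation, not the authors' evaluation code, and returning 1.0 when both masks are empty is a convention chosen here.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice coefficient between two flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    tp = sum(1 for p, t in zip(pred, target) if p and t)
    fp = sum(1 for p, t in zip(pred, target) if p and not t)
    fn = sum(1 for p, t in zip(pred, target) if not p and t)
    denominator = 2 * tp + fp + fn
    # Both masks empty: treated as perfect agreement (assumption of this sketch).
    return 1.0 if denominator == 0 else 2 * tp / denominator


# One true positive, one false positive, one false negative.
example = dice_score([1, 1, 0, 0], [1, 0, 1, 0])
```

For a 3D volume, the same function applies once the prediction and ground-truth masks are flattened in the same voxel order.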