FedMAP: Personalised Federated Learning for Real Large-Scale Healthcare Systems

Fan Zhang; Daniel Kreuter; Carlos Esteve-Yagüe; Sören Dittmer; Javier Fernandez-Marques; Samantha Ip; BloodCounts! Consortium; Norbert C. J. de Wit; Angela Wood; James HF Rudd; Nicholas Lane; Nicholas S Gleadall; Carola-Bibiane Schönlieb; Michael Roberts

TL;DR

This work tackles statistical heterogeneity in large-scale healthcare federated learning by introducing FedMAP, a personalised FL framework that combines local Maximum a Posteriori estimation with Input Convex Neural Network priors to capture inter-site relationships. FedMAP provides convergence guarantees for a bi-level optimisation where locally trained models are regularised by a learnable global prior, enabling adaptive knowledge sharing across non-IID data. The authors demonstrate improvements over FedAvg and several PFL methods across three real-world clinical datasets, and offer a three-tier deployment design (full FL, local fine-tuning, and inference-only) to accommodate diverse health systems. Importantly, FedMAP reduces regional disparities and supports equitable, privacy-preserving deployment in practice, including applications to CPRD cardiovascular risk, INTERVAL iron deficiency detection, and eICU mortality prediction. The framework thus presents a practical pathway toward scalable, equitable, and privacy-preserving healthcare FL at an institutional scale.

Abstract

Federated learning (FL) promises to enable collaborative machine learning across healthcare sites whilst preserving data privacy. Practical deployment remains limited by infrastructure constraints and by statistical heterogeneity arising from differences in patient demographics, treatments, and outcomes. We introduce FedMAP, a personalised FL (PFL) framework that addresses heterogeneity through local Maximum a Posteriori (MAP) estimation with Input Convex Neural Network priors. These priors represent global knowledge gathered from other sites, which guides each model while it adapts to local data, and we provide a formal proof of convergence. Unlike many PFL methods that rely on fixed regularisation, FedMAP's prior adaptively learns patterns that capture complex inter-site relationships. We demonstrate improved performance compared to local training, FedAvg, and several PFL methods across three large-scale clinical datasets: 10-year cardiovascular risk prediction (CPRD, 387 general practitioner practices, 258,688 patients), iron deficiency detection (INTERVAL, 4 donor centres, 31,949 blood donors), and mortality prediction (eICU, 150 hospitals, 44,842 patients). FedMAP incorporates a three-tier design that enables participation across healthcare sites with varying infrastructure and technical capabilities, from full federated training to inference-only deployment. Geographical analysis reveals substantial equity improvements, with underperforming regions achieving up to 14.3% performance gains. This framework provides the first practical pathway for large-scale healthcare FL deployment, ensuring that clinical sites at all scales can benefit, that equity is enhanced, and that privacy is retained.
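To make the idea of local MAP estimation under a convex learned prior more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a linear model with squared-error loss as a stand-in for $\mathcal{L}(\theta; Z_k)$, and a one-hidden-layer input convex network (non-negative output weights, softplus activations) plus a quadratic term as a stand-in for $\mathcal{R}(\theta; \gamma)$. The quadratic term, the layout of `gamma`, the step size, and all function names are illustrative assumptions.

```python
import math

def softplus(x):
    # Numerically stable log(1 + exp(x)); convex and non-decreasing.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def sigmoid(x):
    # Derivative of softplus, written in a numerically stable form.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def prior(theta, gamma):
    """Assumed regulariser R(theta; gamma): small ICNN plus a quadratic term."""
    W, b, a, centre, mu = gamma  # a must be non-negative to keep R convex
    hidden = sum(
        a_j * softplus(sum(w * t for w, t in zip(W_j, theta)) + b_j)
        for W_j, b_j, a_j in zip(W, b, a)
    )
    quad = 0.5 * mu * sum((t - c) ** 2 for t, c in zip(theta, centre))
    return hidden + quad

def prior_grad(theta, gamma):
    """Analytic gradient of the assumed regulariser with respect to theta."""
    W, b, a, centre, mu = gamma
    grad = [mu * (t - c) for t, c in zip(theta, centre)]
    for W_j, b_j, a_j in zip(W, b, a):
        s = a_j * sigmoid(sum(w * t for w, t in zip(W_j, theta)) + b_j)
        for i, w in enumerate(W_j):
            grad[i] += s * w
    return grad

def loss_grad(theta, data):
    """Gradient of the mean squared error of a linear model y ~ theta . x."""
    grad = [0.0] * len(theta)
    for x, y in data:
        r = sum(t * xi for t, xi in zip(theta, x)) - y
        for i, xi in enumerate(x):
            grad[i] += 2.0 * r * xi / len(data)
    return grad

def local_map(theta, data, gamma, lr=0.05, steps=500):
    """Gradient descent on loss + regulariser (the local MAP objective)."""
    for _ in range(steps):
        g = [gl + gp for gl, gp in
             zip(loss_grad(theta, data), prior_grad(theta, gamma))]
        theta = [t - lr * gi for t, gi in zip(theta, g)]
    return theta

# Hypothetical prior parameters and site data, for illustration only.
gamma = ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.1], [0.3, 0.2], [0.0, 0.0], 0.1)
site_data = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
theta_k = local_map([0.0, 0.0], site_data, gamma)
```

In this sketch the prior parameters `gamma` are held fixed. In the framework described above they would be updated on the server from the sites' locally trained models, which is the outer level of the bi-level optimisation.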

Paper Structure (13 sections, 3 theorems, 33 equations, 6 figures, 11 tables, 5 algorithms)

Initialisation
Local Optimisation
Global Aggregation
Tier Grouping
Train Test Split Strategy
Baseline Methods
Implementation Details
Proof of Theorem 1
Model Architectures
Supplementary Results
Statistical comparisons of FedMAP against other methods in FL, fine-tuning, and inference experiments
Inference-only performance
Linear regression of relationship between individual site performance and FL benefit

Key Result

Theorem 1

Let $\Theta\subset \mathbb{R}^d$ be compact and convex, and let $\Gamma\subset \mathbb{R}^p$ be convex. Assume that $\theta\mapsto \mathcal{L}(\theta; Z_k)$ is continuous and convex, and that $(\theta, \gamma)\mapsto \mathcal{R}(\theta; \gamma)$ is differentiable and strongly convex...
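
For reference, the standard definition of strong convexity for a differentiable function is given below. The symbols $f$ and $\mu$ are generic and not taken from the paper, and the variables in which the theorem requires strong convexity are not recoverable from the truncated statement above. A differentiable function $f$ is strongly convex with modulus $\mu > 0$ if

$$
f(y) \ge f(x) + \nabla f(x)^\top (y - x) + \frac{\mu}{2}\,\lVert y - x\rVert^2 \quad \text{for all } x, y .
$$

Setting $\mu = 0$ recovers ordinary convexity, so the assumption on $\mathcal{R}$ is strictly stronger than the convexity assumed of $\mathcal{L}$.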

Figures (6)

Figure 1: FedMAP architecture diagram. A, Local training (step 1) and server prior estimation (step 2) of the proposed FedMAP algorithm for FL. B, Diagram showcasing Tier 1 (FL), Tier 2 (local fine-tuning, for sites with limited IT networking resources), and Tier 3 (local inference-only use, for sites with limited IT networking resources and limited data availability) of FedMAP's multi-tier service design.
Figure 2: Model performance across deployment tiers. a, Tier 1 (T1): full federated training. b, Tier 2 (T2): local fine-tuning. c, Tier 3 (T3): inference-only. Performances are measured on local test sets. The number of sites in each group is indicated in the top left of the plots. Boxplot boxes indicate median and first-to-third quartile range of the data. Whiskers extend to 1.5 IQRs. IQR, interquartile range.
Figure 3: FL performance gains versus individual site performance. Each point represents an eICU site showing AUROC from individual training (x-axis) and relative change using FedMAP (y-axis). The solid line shows a linear regression fit with 95% confidence interval (shaded region). The dashed line indicates zero improvement.
Figure 4: 10-year CVD prediction model performance collated by NHS England regions. Colours indicate median C-index per region for the fine-tuning sites (Tier 2). n, number of sites per region.
Figure 5: Healthcare datasets. We used data from the Clinical Practice Research Datalink (CPRD) [herrett_data_2015, xu_prediction_2021], the INTERVAL trial [angelantonio_efficiency_2017], and data from the eICU Collaborative Research Database [pollard_eicu_2018]. a, Heatmaps of Wasserstein distance between feature distributions of up to ten sites of each dataset (top ten sites with largest distance). b, Radial plot displaying sample and class distribution over the sites for the three datasets.
...and 1 more figure
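
The Figure 5a caption refers to Wasserstein distances between the feature distributions of different sites. As a rough illustration of how such a pairwise comparison can be computed, the sketch below evaluates the 1-Wasserstein distance between one-dimensional empirical samples of equal size, for which the distance reduces to the mean absolute difference of the sorted values. The paper's actual procedure (which features, which variant of the distance, how sample sizes are handled) is not stated here, so the function names, the equal-size restriction, and the data are all assumptions.

```python
def wasserstein_1d(u, v):
    """W1 distance between two equal-sized 1-D samples (sorted-pair formula)."""
    if len(u) != len(v):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(u), sorted(v))) / len(u)

def pairwise_distances(site_samples):
    """Distance for every ordered pair of sites, keyed by (site, site)."""
    names = list(site_samples)
    return {
        (p, q): wasserstein_1d(site_samples[p], site_samples[q])
        for p in names
        for q in names
    }

# Hypothetical values of a single feature at three sites (illustrative only).
site_samples = {
    "site_a": [0.0, 1.0, 2.0],
    "site_b": [1.0, 2.0, 3.0],
    "site_c": [0.0, 1.0, 5.0],
}
distances = pairwise_distances(site_samples)
```

Arranging `distances` as a matrix indexed by site gives the kind of heatmap the caption describes: larger entries mark pairs of sites whose feature distributions differ more.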

Theorems & Definitions (5)

Theorem 1
Corollary 1
Lemma 2
Proof
Proof of Theorem 1