FedMAP: Personalised Federated Learning for Real Large-Scale Healthcare Systems
Fan Zhang, Daniel Kreuter, Carlos Esteve-Yagüe, Sören Dittmer, Javier Fernandez-Marques, Samantha Ip, BloodCounts! Consortium, Norbert C. J. de Wit, Angela Wood, James HF Rudd, Nicholas Lane, Nicholas S Gleadall, Carola-Bibiane Schönlieb, Michael Roberts
TL;DR
This work tackles statistical heterogeneity in large-scale healthcare federated learning (FL) by introducing FedMAP, a personalised FL (PFL) framework that combines local Maximum a Posteriori estimation with Input Convex Neural Network priors to capture inter-site relationships. FedMAP comes with convergence guarantees for a bi-level optimisation in which locally trained models are regularised by a learnable global prior, enabling adaptive knowledge sharing across non-IID data. The authors demonstrate improvements over FedAvg and several PFL methods on three real-world clinical tasks: CPRD cardiovascular risk prediction, INTERVAL iron deficiency detection, and eICU mortality prediction. A three-tier deployment design (full FL, local fine-tuning, and inference-only) accommodates health systems with differing infrastructure. FedMAP also reduces regional performance disparities, presenting a practical pathway toward scalable, equitable, and privacy-preserving healthcare FL at an institutional scale.
Abstract
Federated learning (FL) promises to enable collaborative machine learning across healthcare sites whilst preserving data privacy. Practical deployment remains limited by infrastructure constraints and by statistical heterogeneity arising from differences in patient demographics, treatments, and outcomes. We introduce FedMAP, a personalised FL (PFL) framework that addresses heterogeneity through local Maximum a Posteriori (MAP) estimation with Input Convex Neural Network priors. These priors represent global knowledge gathered from other sites, which guides each local model as it adapts to local data. We provide a formal proof of convergence. Unlike many PFL methods that rely on fixed regularisation, FedMAP's prior adaptively learns patterns that capture complex inter-site relationships. We demonstrate improved performance compared to local training, FedAvg, and several PFL methods across three large-scale clinical datasets: 10-year cardiovascular risk prediction (CPRD, 387 general practitioner practices, 258,688 patients), iron deficiency detection (INTERVAL, 4 donor centres, 31,949 blood donors), and mortality prediction (eICU, 150 hospitals, 44,842 patients). FedMAP incorporates a three-tier design that enables participation across healthcare sites with varying infrastructure and technical capabilities, from full federated training to inference-only deployment. Geographical analysis reveals substantial equity improvements, with underperforming regions achieving up to 14.3% performance gains. This framework provides the first practical pathway for large-scale healthcare FL deployment, ensuring that clinical sites at all scales can benefit, that equity is enhanced, and that privacy is retained.
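To make the idea of a MAP objective with an input convex prior concrete, the following is a minimal sketch and not the authors' implementation. It assumes the prior acts on a flat parameter vector, uses a logistic-regression model as the local learner, and introduces the hypothetical names `TinyICNN`, `logistic_nll`, and `map_objective`. Layer sizes, the softplus activation, and the `strength` weight are illustrative choices only. The network is convex in its input because the weights applied to the hidden activations are non-negative and the activation is convex and non-decreasing.

```python
import math
import random


def softplus(v):
    # Convex, non-decreasing activation, written in a numerically stable form.
    return max(v, 0.0) + math.log1p(math.exp(-abs(v)))


def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


class TinyICNN:
    """One-hidden-layer input convex network mapping theta to a scalar."""

    def __init__(self, dim, hidden, seed=0):
        rng = random.Random(seed)
        # First-layer weights and biases may take any sign.
        self.W0 = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(hidden)]
        self.b0 = [rng.gauss(0, 1) for _ in range(hidden)]
        # Weights on hidden activations are kept non-negative for convexity.
        self.wz = [abs(rng.gauss(0, 1)) for _ in range(hidden)]
        # Linear skip connection from the input may take any sign.
        self.wx = [rng.gauss(0, 1) for _ in range(dim)]

    def __call__(self, theta):
        z = [softplus(a + b) for a, b in zip(matvec(self.W0, theta), self.b0)]
        out = sum(w * zi for w, zi in zip(self.wz, z))
        out += sum(w * t for w, t in zip(self.wx, theta))
        return out


def logistic_nll(theta, data):
    # Mean negative log-likelihood of a logistic model on (features, label) pairs.
    total = 0.0
    for x, y in data:
        logit = sum(t * xi for t, xi in zip(theta, x))
        signed = logit if y == 1 else -logit
        total += softplus(-signed)  # log(1 + exp(-signed))
    return total / len(data)


def map_objective(theta, data, prior, strength=1.0):
    # Local data fit plus a penalty from the shared prior (assumed weighting).
    return logistic_nll(theta, data) + strength * prior(theta)


# Usage with made-up data for a single site.
prior = TinyICNN(dim=3, hidden=4, seed=0)
site_data = [([1.0, 0.5, -0.2], 1), ([-0.3, 0.8, 0.1], 0)]
print(map_objective([0.1, -0.2, 0.3], site_data, prior))
```

In this sketch the prior is fixed at random initialisation. In a federated setting its weights would instead be learned from what the sites share, which the sketch does not attempt to reproduce.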
