In-depth Analysis of Privacy Threats in Federated Learning for Medical Data
Badhan Chandra Das, M. Hadi Amini, Yanzhao Wu
TL;DR
The paper tackles privacy threats in federated learning for medical data by introducing MedPFL, a framework that enables systematic privacy risk analysis and mitigation using real-world medical datasets, diverse models, and multiple attack and defense mechanisms. Through extensive experiments on melanoma, COVID-19 X-ray, and brain MRI datasets, the authors demonstrate that adversaries can reconstruct private medical images from shared gradients using CPL, DLG, iDLG, and GradInv, with GradInv often yielding the highest reconstruction quality on untrained models. They further show that simple gradient perturbation with Laplacian noise does not always provide robust protection for medical images, revealing a fundamental challenge in privacy-preserving FL for healthcare. The work highlights the need for privacy-preserving FL techniques tailored to medical data and outlines a path for future research on stronger defenses and broader clinical tasks.
Abstract
Federated learning is emerging as a promising machine learning technique in the medical field for analyzing medical images, as it is considered an effective method to safeguard sensitive patient data and comply with privacy regulations. However, recent studies have revealed that the default settings of federated learning may inadvertently expose private training data to privacy attacks. Yet the intensity of such privacy risks and potential mitigation strategies in the medical domain remain unclear. In this paper, we make three original contributions to privacy risk analysis and mitigation in federated learning for medical data. First, we propose a holistic framework, MedPFL, for analyzing privacy risks in processing medical data in the federated learning environment and developing effective mitigation strategies for protecting privacy. Second, through our empirical analysis, we demonstrate the severe privacy risks that arise when federated learning is used to process medical images, where adversaries can accurately reconstruct private medical images by performing privacy attacks. Third, we illustrate that the prevalent defense mechanism of adding random noise may not always be effective in protecting medical images against privacy attacks in federated learning, which poses unique and pressing challenges related to protecting the privacy of medical data. Furthermore, the paper discusses several unique research questions related to the privacy protection of medical data in the federated learning environment. We conduct extensive experiments on several benchmark medical image datasets to analyze and mitigate the privacy risks associated with federated learning for medical data.
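To make the idea of gradient leakage concrete, the following is a minimal toy sketch and not the MedPFL framework or any of the attacks evaluated in the paper. It relies on a well-known property of a single fully connected layer with a bias term: the weight gradient equals the bias gradient times the input, so the input can be read off a shared gradient by division. The sketch then adds Laplacian noise to the gradient and measures how far the recovered input drifts. All function names, the squared-error loss, the four-value input, and the noise scale of 0.05 are illustrative assumptions.

```python
import random

def linear_layer_gradients(x, w, b, target):
    """Gradients of 0.5 * sum((W x + b - target)^2) with respect to W and b."""
    out = [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]
    delta = [o - t for o, t in zip(out, target)]      # dL/db
    grad_w = [[d * xi for xi in x] for d in delta]    # dL/dW = delta * x^T
    return grad_w, delta

def reconstruct_input(grad_w, grad_b):
    """Recover x by dividing one row of dL/dW by the matching entry of dL/db."""
    i = max(range(len(grad_b)), key=lambda k: abs(grad_b[k]))
    return [g / grad_b[i] for g in grad_w[i]]

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the scaled difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))

def perturb(grad_w, grad_b, scale, rng):
    """Add independent Laplacian noise to every gradient entry."""
    noisy_w = [[g + laplace_noise(scale, rng) for g in row] for row in grad_w]
    noisy_b = [g + laplace_noise(scale, rng) for g in grad_b]
    return noisy_w, noisy_b

def max_abs_error(a, b):
    return max(abs(p - q) for p, q in zip(a, b))

rng = random.Random(0)
x = [0.2, 0.9, 0.5, 0.1]                 # hypothetical private pixel intensities
w = [[rng.uniform(-1, 1) for _ in x] for _ in range(3)]
b = [0.0, 0.0, 0.0]
target = [1.0, 0.0, 0.0]

grad_w, grad_b = linear_layer_gradients(x, w, b, target)
clean_error = max_abs_error(reconstruct_input(grad_w, grad_b), x)

noisy_w, noisy_b = perturb(grad_w, grad_b, 0.05, rng)   # assumed noise scale
noisy_error = max_abs_error(reconstruct_input(noisy_w, noisy_b), x)

print("reconstruction error without noise:", clean_error)
print("reconstruction error with Laplacian noise:", noisy_error)
```

In this toy setting the clean gradient reveals the input up to floating-point error, while the error under noise depends on the noise scale relative to the gradient magnitude. Optimization-based attacks on deep models behave differently, so this sketch says nothing about how much noise is sufficient in practice.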
