FedDPGAN: Federated Differentially Private Generative Adversarial Networks Framework for the Detection of COVID-19 Pneumonia
Longling Zhang, Bochen Shen, Ahmed Barnawi, Shan Xi, Neeraj Kumar, Yi Wu
TL;DR
The paper addresses privacy barriers in using AI for COVID-19 chest X-ray diagnosis by combining Federated Learning with a Differentially Private GAN. The proposed FedDPGAN framework trains distributed DPGANs across hospitals, adds Gaussian noise to gradients to satisfy $(\varepsilon,\delta)$-DP, and aggregates updates via FedAvg to form a global model without sharing raw data. Empirical results on a COVID-19 CXR dataset show FedDPGAN achieves higher accuracy than centralized and non-DP federated baselines while providing privacy protection, and the DP-GAN component helps alleviate non-IID challenges through data augmentation. The work advances privacy-preserving, scalable, and effective COVID-19 diagnostics in distributed medical settings with potential impact on privacy-aware smart healthcare systems.
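The mechanism summarized above (noise added to gradients on each client, then FedAvg on the server) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `clip_and_noise` and `fedavg`, the parameters `clip_norm` and `noise_multiplier`, and the choice to clip a single gradient vector rather than per-example gradients are all assumptions made for illustration. The sketch does not compute the resulting $(\varepsilon,\delta)$ budget.

```python
import math
import random


def clip_and_noise(grad, clip_norm, noise_multiplier, rng):
    """Clip a gradient vector to L2 norm <= clip_norm, then add Gaussian noise.

    Simplified assumption: one gradient vector is clipped as a whole.
    The noise standard deviation is noise_multiplier * clip_norm.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    sigma = noise_multiplier * clip_norm
    return [g * scale + rng.gauss(0.0, sigma) for g in grad]


def fedavg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical setup: two hospitals holding 100 and 300 images.
    global_w = [0.0, 0.0]
    local_grads = [[3.0, 4.0], [0.3, -0.1]]
    sizes = [100, 300]
    lr = 0.1

    local_ws = []
    for grad in local_grads:
        noisy = clip_and_noise(grad, clip_norm=1.0, noise_multiplier=0.5, rng=rng)
        # One local gradient step starting from the current global weights.
        local_ws.append([w - lr * g for w, g in zip(global_w, noisy)])

    # The server only sees the noised local weights, never the raw images.
    global_w = fedavg(local_ws, sizes)
    print(global_w)
```

In this sketch only noised model updates leave a client, which is the property the framework relies on; a real deployment would also track the cumulative privacy loss across training rounds.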
Abstract
Existing deep learning approaches for diagnosing COVID-19 pneumonia commonly learn features from chest X-ray data generated by Generative Adversarial Networks (GANs). However, these methods face a critical challenge: data privacy. A GAN can leak semantic information about its training data, which attackers can use to reconstruct training samples, thereby compromising patient privacy. Furthermore, because training samples are limited, different hospitals jointly train models through data sharing, which also causes privacy leakage. To solve this problem, we adopt the Federated Learning (FL) framework, a new technique for protecting data privacy. Building on the FL framework and differential privacy, we propose a Federated Differentially Private Generative Adversarial Network (FedDPGAN) to detect COVID-19 pneumonia for sustainable smart cities. Specifically, we use DP-GAN to privately generate diverse patient data, introducing differential privacy to protect the semantic information of the training dataset. Furthermore, we leverage FL to allow hospitals to collaboratively train COVID-19 models without sharing the original data. We evaluate the proposed model under Independent and Identically Distributed (IID) and non-IID settings on a chest X-ray (CXR) image dataset with three classes (COVID-19, normal, and normal pneumonia). Extensive experimental results verify that our model can effectively diagnose COVID-19 without compromising privacy.
