Securing Health Data on the Blockchain: A Differential Privacy and Federated Learning Framework
Daniel Commey, Sena Hounsinou, Garth V. Crosby
TL;DR
The paper tackles privacy-preserving health data analytics in Blockchain-based Internet of Things (BIoT) systems by combining Federated Learning (FL), Differential Privacy (DP), and blockchain. It introduces dynamic personalization and adaptive noise distribution to handle data heterogeneity while preserving utility under formal privacy guarantees; Ethereum smart contracts and IPFS provide secure aggregation and immutable storage. Experiments on SVHN show that larger privacy budgets (higher ε, i.e., weaker privacy) yield higher accuracy (e.g., 64.50% at ε = 8.0 after 15 rounds), with manageable DP-induced loss and stable blockchain performance (≈6 s transaction latency, ≈22k gas, ≈196 MB on IPFS). The work demonstrates a practical, end-to-end privacy-preserving framework for healthcare analytics in BIoT environments and shows that secure, decentralized model updates with transparent data provenance are feasible. Implications include improved data privacy, data integrity, and trust in collaborative health analytics; future work targets medical datasets and scalability improvements.
Abstract
This study proposes a framework to enhance privacy in Blockchain-based Internet of Things (BIoT) systems used in the healthcare sector. The framework addresses the challenge of leveraging health data for analytics while protecting patient privacy. To achieve this, the study integrates Differential Privacy (DP) with Federated Learning (FL) to protect sensitive health data collected by IoT nodes. The proposed framework utilizes dynamic personalization and adaptive noise distribution strategies to balance privacy and data utility. Additionally, blockchain technology ensures secure and transparent aggregation and storage of model updates. Experimental results on the SVHN dataset demonstrate that the proposed framework achieves strong privacy guarantees against various attack scenarios while maintaining high accuracy in health analytics tasks. For 15 rounds of federated learning with an epsilon value of 8.0, the model obtains an accuracy of 64.50%. The blockchain integration, utilizing Ethereum, Ganache, Web3.py, and IPFS, exhibits an average transaction latency of around 6 seconds and consistent gas consumption across rounds, validating the practicality and feasibility of the proposed approach.
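To make the combination of DP and FL described above more concrete, the following is a minimal sketch of one differentially private aggregation round. It is not the authors' implementation: it uses the generic clip-then-add-Gaussian-noise pattern in place of the paper's adaptive noise distribution, and the names `clip_update`, `dp_federated_average`, `clip_norm`, and `noise_multiplier` are hypothetical. Translating a noise multiplier into a concrete ε requires a privacy accountant, which is not shown here; qualitatively, a smaller multiplier corresponds to a larger ε and less distortion of the averaged update.

```python
import math
import random


def clip_update(update, clip_norm):
    """Scale a client update so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    if norm <= clip_norm:
        return list(update)
    scale = clip_norm / norm
    return [x * scale for x in update]


def dp_federated_average(client_updates, clip_norm, noise_multiplier, rng=None):
    """Average clipped client updates after adding Gaussian noise to their sum.

    Illustrative assumption: noise with standard deviation
    noise_multiplier * clip_norm is added once to the summed update.
    """
    rng = rng or random.Random()
    n_clients = len(client_updates)
    dim = len(client_updates[0])

    # Bound each client's influence on the aggregate.
    clipped = [clip_update(u, clip_norm) for u in client_updates]

    # Sum the clipped updates coordinate by coordinate.
    summed = [sum(u[i] for u in clipped) for i in range(dim)]

    # Add Gaussian noise scaled to the clipping bound, then average.
    sigma = noise_multiplier * clip_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / n_clients for v in noisy]


if __name__ == "__main__":
    updates = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
    # More noise (a stricter privacy setting) perturbs the average more.
    for multiplier in (0.0, 0.5, 2.0):
        result = dp_federated_average(updates, 1.0, multiplier, random.Random(42))
        print(multiplier, result)
```

In a pipeline like the one the abstract describes, the aggregated update would then be stored off-chain (for example on IPFS) with a reference recorded through a smart contract; that step is omitted here because it depends on a running node and deployed contract.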
