An advanced data fabric architecture leveraging homomorphic encryption and federated learning

Sakib Anwar Rieyan, Md. Raisul Kabir News, A. B. M. Muntasir Rahman, Sadia Afrin Khan, Sultan Tasneem Jawad Zaarif, Md. Golam Rabiul Alam, Mohammad Mehedi Hassan, Michele Ianni, Giancarlo Fortino

TL;DR

This paper tackles secure medical image analysis by integrating a data fabric architecture with Federated Learning (FL) and Partially Homomorphic Encryption (PHE). It introduces an advanced data fabric design and a FedMax-based FL framework that stores encrypted model updates in a Data Lake, enabling collaborative training without exposing raw patient data while remaining compliant with HIPAA and GDPR. Using a pituitary tumor MRI classification task, the study compares multiple pre-trained CNNs and a custom CNN, finding that the custom model achieves the best overall performance (83.31% accuracy) on encrypted data, with reasonable performance for the other models and notable privacy advantages. The work demonstrates the feasibility and practical impact of privacy-preserving, distributed ML in healthcare and suggests applicability to other privacy-sensitive domains.

Abstract

Data fabric is an automated, AI-driven data fusion approach that unifies data management for solving complex data problems without moving data to a centralized location. In a federated learning architecture, the global model is trained from the learned parameters of several local models, which eliminates the need to move data to a centralized repository for machine learning. This paper introduces a secure approach for medical image analysis using federated learning and partially homomorphic encryption within a distributed data fabric architecture. With this method, multiple parties can collaborate in training a machine-learning model by sharing learned or fused features instead of raw data. The approach complies with laws and regulations such as HIPAA and GDPR, ensuring the privacy and security of the data. The study demonstrates the method's effectiveness through a case study on pituitary tumor classification, achieving a significant level of accuracy. The primary focus of the study, however, is the development and evaluation of federated learning and partially homomorphic encryption as tools for secure medical image analysis. The results highlight the potential of these techniques in other privacy-sensitive domains and contribute to the growing body of research on secure and privacy-preserving machine learning.
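To make the idea of combining encrypted model updates concrete, here is a minimal sketch, not the paper's implementation. It assumes a Paillier-style additively homomorphic scheme and plain averaging of client updates; the summary does not state which PHE scheme is used or how the FedMax-based aggregation works, so both choices are illustrative. The key is built from two small, well-known Mersenne primes and is far too weak for real use. All names (encrypt_update, aggregate, decrypt_average, SCALE) are hypothetical.

```python
import math
import random

# Toy Paillier key from two known Mersenne primes (ASSUMPTION: illustration
# only; real deployments need large random primes and a vetted library).
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p-1, q-1)
MU = pow(LAM, -1, N)   # valid because the generator is g = N + 1
SCALE = 10**6          # fixed-point scale for float weights (assumed)


def encrypt(m: int) -> int:
    """Encrypt an integer; negative values wrap around modulo N."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    # With g = N + 1, g^m mod N^2 equals 1 + m*N mod N^2.
    return (1 + m * N) % N2 * pow(r, N, N2) % N2


def decrypt(c: int) -> int:
    """Decrypt and map the result back to a signed integer."""
    m = (pow(c, LAM, N2) - 1) // N * MU % N
    return m - N if m > N // 2 else m


def encrypt_update(weights):
    """Client side: fixed-point encode and encrypt each weight."""
    return [encrypt(round(w * SCALE)) for w in weights]


def aggregate(encrypted_updates):
    """Server side: multiplying ciphertexts adds the hidden plaintexts."""
    total = encrypted_updates[0]
    for update in encrypted_updates[1:]:
        total = [a * b % N2 for a, b in zip(total, update)]
    return total


def decrypt_average(encrypted_sum, num_clients):
    """Key holder: decrypt the summed weights and average them."""
    return [decrypt(c) / SCALE / num_clients for c in encrypted_sum]


# Three hypothetical clients, each holding a tiny three-weight update.
client_updates = [
    [0.10, -0.20, 0.30],
    [0.30, 0.00, -0.10],
    [0.20, 0.50, 0.10],
]
stored = [encrypt_update(u) for u in client_updates]  # what the server sees
average = decrypt_average(aggregate(stored), len(client_updates))
print(average)  # close to [0.2, 0.1, 0.1], up to floating-point rounding
```

In this sketch the aggregating party only ever handles ciphertexts, and only the holder of the private values (LAM, MU) can read the averaged weights. That is the property the abstract relies on when it says parties collaborate without exchanging raw data.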

Paper Structure (20 sections, 21 figures, 3 tables, 1 algorithm)

This paper contains 20 sections, 21 figures, 3 tables, 1 algorithm.

Figures (21)

  • Figure 1: Vanilla Architecture of Data Fabric
  • Figure 2: Federated Learning Model [33]
  • Figure 3: An Overview of the Homomorphic Encryption
  • Figure 4: A few samples of the Brain Tumor dataset used [29]
  • Figure 5: Overview of the encrypted dataset
  • ...and 16 more figures