Integrating Homomorphic Encryption and Synthetic Data in FL for Privacy and Learning Quality

Yenan Wang, Carla Fabiana Chiasserini, Elad Michael Schiller

TL;DR

This work enhances a privacy-preserving, HE-based FL process by integrating synthetic data generation with an interleaving strategy that keeps encryption and decryption costs manageable.

Abstract

Federated learning (FL) enables collaborative training of machine learning models without sharing sensitive client data, making it a cornerstone for privacy-critical applications. However, FL faces the dual challenge of ensuring learning quality and robust privacy protection while keeping resource consumption low, particularly when using computationally expensive techniques such as homomorphic encryption (HE). In this work, we enhance an FL process that preserves privacy using HE by integrating it with synthetic data generation and an interleaving strategy. Specifically, our solution, named Alternating Federated Learning (Alt-FL), alternates between local training with authentic data (authentic rounds) and local training with synthetic data (synthetic rounds). Model parameters are transferred encrypted on authentic rounds and in plaintext on synthetic rounds. Our approach improves learning quality (e.g., model accuracy) through datasets enhanced with synthetic data, preserves client data privacy via HE, and keeps encryption and decryption costs manageable through our interleaving strategy. We evaluate our solution against data leakage attacks, such as the DLG attack, demonstrating robust privacy protection. In addition, Alt-FL provides 13.4% higher model accuracy and decreases HE-related costs by up to 48% with respect to Selective HE.
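The alternation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the strict one-to-one alternation starting with an authentic round, and the names `RoundPlan`, `plan_rounds`, and `encrypted_fraction`, are assumptions made for illustration only. The paper's actual interleaving is governed by parameters that this summary does not define.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RoundPlan:
    index: int       # round number, starting at 0
    data: str        # "authentic" or "synthetic" local training data
    encrypted: bool  # True when model parameters are sent under HE


def plan_rounds(num_rounds: int) -> List[RoundPlan]:
    """Build a hypothetical Alt-FL-style schedule.

    Assumption: rounds strictly alternate, starting with an authentic round.
    Authentic rounds send encrypted parameters; synthetic rounds send plaintext.
    """
    plans = []
    for r in range(num_rounds):
        authentic = (r % 2 == 0)
        plans.append(
            RoundPlan(
                index=r,
                data="authentic" if authentic else "synthetic",
                encrypted=authentic,
            )
        )
    return plans


def encrypted_fraction(plans: List[RoundPlan]) -> float:
    """Fraction of rounds that incur HE encryption and decryption work."""
    return sum(p.encrypted for p in plans) / len(plans)


if __name__ == "__main__":
    for plan in plan_rounds(4):
        print(plan)
```

Under this assumed schedule, only half of the rounds involve HE operations, which conveys the intuition behind the cost reduction. The 48% figure reported in the abstract comes from the authors' evaluation, not from this sketch.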

Paper Structure

This paper contains 6 sections, 2 equations, 5 figures, 3 tables, and 1 algorithm.

Figures (5)

  • Figure 1: An overview of the proposed Alt-FL.
  • Figure 2: Result of the DLG attack on a specific authentic image (a) when models are sent unencrypted (b), and encrypted with encryption ratio $\eta{=}0.1$ (c) and $0.2$ (d).
  • Figure 3: Attack launched on synthetic rounds when $\rho{=}0.5$: exemplary authentic images (a), (c), (e) and the corresponding recovered images with, respectively, the maximum UQI (b), MSSSIM (d), and VIF (f).
  • Figure 4: The LeNet5 model's accuracy under Alt-FL and its benchmarks until convergence. Star markers denote the round at which each scheme reached peak accuracy.
  • Figure 5: Encryption and decryption time with encryption ratio $\eta{=}0.2$.