A Quantization-based Technique for Privacy Preserving Distributed Learning

Maurizio Colombo; Rasool Asal; Ernesto Damiani; Lamees Mahmoud AlQassem; Al Anoud Almemari; Yousof Alhammadi

TL;DR

The paper tackles data privacy in distributed ML under regulatory regimes by introducing Hash-Comb, a quantization-based data representation that protects both training data and model parameters. It combines randomized quantization with a multi-hash encoding and secures hyperparameters via MPC/secret sharing, enabling regulation-compliant distributed training across architectures. The authors prove that the scheme achieves Rényi differential privacy bounds and show through experiments on SPAM, IoT23, and Cardiovascular datasets that it can improve accuracy and convergence while reducing communication. Compared with classic DP noise baselines, Hash-Comb provides a favorable privacy-utility trade-off with a smaller practical footprint, and is suitable for both monolithic and federated learning lifecycles.

Abstract

The massive deployment of Machine Learning (ML) models raises serious concerns about data protection. Privacy-enhancing technologies (PETs) offer a promising first step, but hard challenges persist in achieving confidentiality and differential privacy in distributed learning. In this paper, we describe a novel, regulation-compliant data protection technique for the distributed training of ML models, applicable throughout the ML life cycle regardless of the underlying ML architecture. Designed from the data owner's perspective, our method protects both training data and ML model parameters by employing a protocol based on a quantized multi-hash data representation, Hash-Comb, combined with randomization. The hyper-parameters of our scheme can be shared using standard Secure Multi-Party Computation (MPC) protocols. Our experimental results demonstrate the robustness and accuracy-preserving properties of our approach.
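To make the idea of a quantized multi-hash representation combined with randomization more concrete, the following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes equal-width bins that double in number at each level, a salted SHA-256 digest per level, and a level chosen by counting consecutive successes of a biased coin (in the spirit of Figure 1). The function names, the `salt` parameter, the `[lo, hi]` range, and the cap `max_level` are all hypothetical and do not come from the paper.

```python
import hashlib
import random


def bin_index(x, lo, hi, level):
    """Index of the bin holding x when [lo, hi] is split into 2**level equal bins."""
    n_bins = 2 ** level
    x = min(max(x, lo), hi)  # clamp out-of-range values (assumption)
    idx = int((x - lo) / (hi - lo) * n_bins)
    return min(idx, n_bins - 1)  # x == hi belongs to the last bin


def hash_comb(x, lo, hi, levels, salt=b""):
    """One salted SHA-256 digest per quantization level, coarse to fine."""
    digests = []
    for level in range(1, levels + 1):
        idx = bin_index(x, lo, hi, level)
        msg = salt + f"{level}:{idx}".encode()
        digests.append(hashlib.sha256(msg).hexdigest())
    return digests


def sample_level(p, max_level, rng):
    """Count consecutive successes of a coin with bias p, starting at 1, capped at max_level."""
    level = 1
    while level < max_level and rng.random() < p:
        level += 1
    return level


def encode(x, lo, hi, p, max_level, rng, salt=b""):
    """Release only the digest at a randomly chosen level (assumed randomization step)."""
    level = sample_level(p, max_level, rng)
    return hash_comb(x, lo, hi, max_level, salt)[level - 1]


if __name__ == "__main__":
    rng = random.Random(0)
    print(hash_comb(0.30, 0.0, 1.0, levels=3))
    print(encode(0.30, 0.0, 1.0, p=0.5, max_level=8, rng=rng))
```

In this sketch, nearby values share digests at coarse levels and diverge only at finer ones, so a recipient can compare encoded values for approximate equality without seeing the raw numbers. In a scheme of this kind, the salt, range, and coin bias would be the sort of hyper-parameters that the parties agree on privately.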

Paper Structure (21 sections, 18 equations, 6 figures, 5 tables, 1 algorithm)

Introduction
Background and Problem Statement
Differential Privacy
Related Work
Achieving Differential Privacy via Quantization
Rényi Differential Privacy
Multi-Hash Representation
The Negotiation Protocol
Secret Sharing
The MPC protocol
Brute Force attack analysis
Training Models with Hash-Combs
Datasets
ML Model
Monolithic Training
...and 6 more sections

Figures (6)

Figure 1: Hash-Comb levels as numbers of consecutive tosses of a biased coin.
Figure 2: Solution for p when $\overline{k} = 8$.
Figure 3: Some of the most relevant features in SPAM dataset
Figure 4: Training and Testing (last step) from Experiment 1
Figure 5: Model validation score at each FedAvg iteration from Experiment 2
...and 1 more figure

Theorems & Definitions (1)

Definition 1.1
