
A Differentially Private Blockchain-Based Approach for Vertical Federated Learning

Linh Tran, Sanjay Chari, Md. Saikat Islam Khan, Aaron Zachariah, Stacy Patterson, Oshani Seneviratne

TL;DR

This work tackles data privacy and trust in vertical federated learning by introducing DP-BBVFL, a serverless framework that uses a private blockchain as the aggregation layer and applies local differential privacy via the Poisson Binomial Mechanism to protect embeddings exposed on-chain. It provides provable privacy guarantees in the form of $(\alpha,\epsilon)$-RDP and achieves verifiability through smart contracts and on-chain aggregation, while reducing reliance on a central trusted party. Empirical evaluation on Breast Cancer and MIMIC-III demonstrates competitive accuracy with a clear privacy-utility trade-off, albeit with increased training time due to on-chain operations. The approach broadens the applicability of privacy-preserving distributed learning to sensitive domains like healthcare by enabling transparent, trustworthy collaboration across decentralized institutions.

Abstract

We present the Differentially Private Blockchain-Based Vertical Federated Learning (DP-BBVFL) algorithm that provides verifiability and privacy guarantees for decentralized applications. DP-BBVFL uses a smart contract to aggregate the feature representations, i.e., the embeddings, from clients transparently. We apply local differential privacy to provide privacy for embeddings stored on a blockchain, hence protecting the original data. We provide the first prototype application of differential privacy with blockchain for vertical federated learning. Our experiments with medical data show that DP-BBVFL achieves high accuracy with a tradeoff in training time due to on-chain aggregation. This innovative fusion of differential privacy and blockchain technology in DP-BBVFL could herald a new era of collaborative and trustworthy machine learning applications across several decentralized application domains.

Paper Structure

This paper contains 18 sections, 1 theorem, 5 equations, 3 figures, 1 table, 2 algorithms.

Key Result

Theorem 4.1

For any $b \in \mathbb{N}$ and $\beta \in [0, \frac{1}{4}]$, the algorithm satisfies $(\alpha, \epsilon(\alpha))$-RDP for any $\alpha > 1$, where $C_0$ is a universal constant in the bound on $\epsilon(\alpha)$.
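
To make the roles of $b$ and $\beta$ in Theorem 4.1 more concrete, here is a minimal Python sketch of a scalar Poisson Binomial Mechanism applied to client embeddings. It assumes the standard form of the mechanism: each coordinate is clipped to $[-c, c]$, mapped to a success probability $\frac{1}{2} + \beta x / c$, and replaced by a Binomial$(b, p)$ draw, so only integers would be summed by the aggregator. The clipping bound `c`, the function names, and the reading of $b$ as the number of Bernoulli trials are illustrative assumptions, not taken from the paper; the paper's own algorithms may differ in detail.

```python
import random

def pbm_encode(x, c, beta, b, rng):
    """Turn one scalar x in [-c, c] into an integer message in {0, ..., b}.

    Assumption: beta in [0, 1/4] biases the coin, b is the number of trials.
    """
    x = max(-c, min(c, x))              # clip to the assumed range
    p = 0.5 + beta * x / c              # p lies in [1/2 - beta, 1/2 + beta]
    return sum(rng.random() < p for _ in range(b))  # one Binomial(b, p) draw

def pbm_decode_sum(total, n, c, beta, b):
    """Unbiased estimate of the sum of n client values from the summed messages."""
    return c * (total - n * b / 2) / (b * beta)

def aggregate(embeddings, c, beta, b, seed=0):
    """Simulate the aggregator: sum integer messages per coordinate, then decode."""
    rng = random.Random(seed)
    n = len(embeddings)
    totals = [0] * len(embeddings[0])
    for emb in embeddings:              # one embedding vector per client
        for j, x in enumerate(emb):
            totals[j] += pbm_encode(x, c, beta, b, rng)
    return [pbm_decode_sum(t, n, c, beta, b) for t in totals]

if __name__ == "__main__":
    clients = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.6]]
    print(aggregate(clients, c=1.0, beta=0.25, b=64))
```

In this sketch a smaller `beta` or `b` makes each message less informative about the underlying value and the decoded sum noisier, which is one way to picture the privacy-utility trade-off reported in the experiments.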

Figures (3)

  • Figure 1: Example of system model with $3$ clients.
  • Figure 2: AUROC Score on Breast Cancer dataset. NPQ (No Privacy noise and No Quantization) represents the baseline case without privacy noise and quantization.
  • Figure 3: F1 Score on MIMIC-III dataset. NPQ (No Privacy noise and No Quantization) represents the baseline case without privacy noise and quantization.

Theorems & Definitions (1)

  • Theorem 4.1