Decentralized Collaborative Learning Framework with External Privacy Leakage Analysis

Tsuyoshi Idé, Dzung T. Phan, Rudy Raymond

TL;DR

The paper tackles privacy-preserving, decentralized multi-task density estimation for unsupervised anomaly detection in blockchain-like networks. It extends the CollabDict framework to deep models through a multi-task VAE, which allows more expressive anomaly detection while keeping the learning decentralized. It gives a theoretical guarantee on external privacy leakage, stated in terms of Rényi differential privacy, for the Gaussian mixture case, and it introduces a practical entropy-based monitor for internal privacy breaches. The work shows how dynamical consensus, random data chunking, and Bayesian deep learning can be combined for privacy-aware collaborative learning, with potential applications in next-generation blockchain platforms.

Abstract

This paper presents two methodological advancements in decentralized multi-task learning under privacy constraints, aiming to pave the way for future developments in next-generation Blockchain platforms. First, we expand the existing framework for collaborative dictionary learning (CollabDict), which has previously been limited to Gaussian mixture models, by incorporating deep variational autoencoders (VAEs) into the framework, with a particular focus on anomaly detection. We demonstrate that the VAE-based anomaly score function shares the same mathematical structure as the non-deep model, and provide a comprehensive qualitative comparison. Second, considering the widespread use of "pre-trained models," we provide a mathematical analysis of data privacy leakage when models trained with CollabDict are shared externally. We show that the CollabDict approach, when applied to Gaussian mixtures, adheres to a Rényi differential privacy criterion. Additionally, we propose a practical metric for monitoring internal privacy breaches during the learning process.

Paper Structure

This paper contains 21 sections, 2 theorems, 56 equations, 3 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

Assume that every pairwise $\ell_2$ distance between samples is upper-bounded by $R < \infty$, that components with $\bar{N}_k < \delta$ are discarded after the local_update procedure in Algorithm 1, and that $\lambda_0 > 0$. Then the release mechanism using the posterior distribution …
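
The theorem statement is cut off here. The abstract describes the result as a Rényi differential privacy guarantee, so for reference the standard definition (Mironov, 2017) is sketched below. The symbols $\mathcal{M}$ (release mechanism), $D, D'$ (adjacent datasets), $\alpha$, and $\epsilon$ are generic notation assumed for this sketch and may not match the paper's own.

$$
D_\alpha(P \,\|\, Q) = \frac{1}{\alpha - 1} \log \mathbb{E}_{x \sim Q}\!\left[\left(\frac{P(x)}{Q(x)}\right)^{\alpha}\right], \qquad \alpha > 1 .
$$

A randomized mechanism $\mathcal{M}$ satisfies $(\alpha, \epsilon)$-Rényi differential privacy if, for every pair of adjacent datasets $D$ and $D'$,

$$
D_\alpha\big(\mathcal{M}(D) \,\|\, \mathcal{M}(D')\big) \le \epsilon .
$$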

Figures (3)

  • Figure 1: Illustration of the problem setting of multi-task learning under the privacy and decentralization constraints. Left: Each network participant ($s=1,\ldots, S$) holds its own dataset $\mathcal{D}^s$ privately and builds its own machine learning model with the hope that other participants' data would help improve the model. Right: An example of a peer-to-peer (P2P) communication network. A 3-regular expander graph called the cycle with inverse chord is shown for $S=31$ participants.
  • Figure 2: Overall architecture of the proposed multi-task VAE, which operates under decentralized and privacy-preserving constraints.
  • Figure 3: Illustration of (a) differential privacy and (b) the entropy $\ell$-diversity. In the latter, the uniform distribution gives the maximum entropy.
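
The communication graph named in the Figure 1 caption can be sketched in a few lines. This is a minimal illustration of the textbook "cycle with inverse chord" construction on the integers modulo a prime $p$, where each node $x$ is linked to $x-1$, $x+1$, and its multiplicative inverse $x^{-1}$. The function name `cycle_with_inverse_chord` is hypothetical, and the convention that node 0 is chorded to itself is an assumption of the textbook construction, not something the page confirms for the paper.

```python
def cycle_with_inverse_chord(p):
    """Neighbor lists of the cycle with inverse chord on Z_p, for a prime p > 2.

    Each node x gets three links: x - 1, x + 1 (the cycle) and x^{-1} mod p
    (the chord). Nodes 0, 1 and p - 1 are chorded to themselves, so those
    three chords are self-loops (assumed convention).
    """
    neighbors = {}
    for x in range(p):
        # Fermat's little theorem: x^(p-2) is the inverse of x modulo a prime p.
        # For x = 0 this evaluates to 0, which gives the self-loop at node 0.
        inverse = pow(x, p - 2, p)
        neighbors[x] = [(x - 1) % p, (x + 1) % p, inverse]
    return neighbors


# S = 31 participants, as in the Figure 1 caption.
graph = cycle_with_inverse_chord(31)
print(graph[2])  # neighbors of participant 2
```

Every node ends up with exactly three links, which is what makes the graph 3-regular, and the neighbor relation is symmetric because taking the modular inverse twice returns the original node.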

Theorems & Definitions (6)

  • Definition 1: Rényi differential privacy (Mironov, 2017)
  • Theorem 1
  • proof
  • Definition 2: entropy $\ell$-diversity (Machanavajjhala et al., 2007)
  • Theorem 2
  • proof
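
To make Definition 2 concrete, here is a minimal sketch of the usual entropy $\ell$-diversity check: a discrete distribution is entropy $\ell$-diverse when its Shannon entropy is at least $\log \ell$. The function names and the tolerance parameter are hypothetical, and how the paper maps this check onto quantities produced during learning is not shown on this page, so the inputs below are illustrative only.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (natural log) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def is_entropy_l_diverse(probs, ell, tol=1e-12):
    """True if the entropy of `probs` is at least log(ell).

    `tol` (an assumption of this sketch) absorbs floating-point rounding
    when the entropy sits exactly on the threshold.
    """
    return shannon_entropy(probs) >= math.log(ell) - tol


uniform = [0.25, 0.25, 0.25, 0.25]  # maximum entropy over four outcomes
skewed = [0.97, 0.01, 0.01, 0.01]   # mass concentrated on one outcome

print(is_entropy_l_diverse(uniform, ell=3))
print(is_entropy_l_diverse(skewed, ell=2))
```

The uniform case passes because its entropy equals $\log 4$, the maximum for four outcomes, which matches the remark in the Figure 3 caption. The skewed case fails because nearly all of the mass sits on a single outcome.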