Majority Vote for Distributed Differentially Private Sign Selection

Weidong Liu, Jiyuan Tu, Xiaojun Mao, Xi Chen

TL;DR

This work addresses sign recovery in a distributed data setting under distributed group differential privacy ($(\epsilon,\delta)$-DGDP) by introducing DPVote, a general Majority Vote framework that uses a peeling step and the exponential mechanism to privately aggregate coordinate-wise sign estimates across machines. The method is applied to sparse mean estimation and sparse linear regression, achieving sign-consistency with minimax-optimal signal strength on the order of $O(\sqrt{\log p / N})$ while preserving DGDP and maintaining low communication costs. Theoretical guarantees are complemented by simulations showing competitive performance relative to non-private baselines and superiority over existing private approaches, with robust behavior across varying numbers of machines and privacy budgets. Overall, DPVote provides a principled, scalable approach to private distributed sign selection and has potential extensions to other sparse estimation problems and secure multi-party implementations.

Abstract

Privacy-preserving data analysis has become more prevalent in recent years. In this study, we propose a distributed group differentially private Majority Vote mechanism for the sign selection problem in a distributed setup. To achieve this, we apply iterative peeling to the stability function and use the exponential mechanism to recover the signs. For broader applicability, we study private sign selection for mean estimation and linear regression problems in distributed systems. Our method recovers the support and signs with the same optimal signal-to-noise ratio as in the non-private scenario, which improves on contemporary work on private variable selection. Moreover, the sign selection consistency is supported by theoretical guarantees. Simulation studies demonstrate the effectiveness of the proposed method.
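
To make the idea of privately aggregating local sign vectors concrete, here is a minimal Python sketch. It is not the paper's DPVote algorithm: the stability function and the DGDP accounting are not reproduced here. Instead it assumes a simplified scheme in which each candidate (coordinate, sign) pair is scored by the number of machines voting for it, and `s` pairs are peeled off one at a time with the exponential mechanism. The function name `private_sign_vote`, the parameter `s`, and the even split of `epsilon` across rounds are all assumptions made for illustration.

```python
import numpy as np

def private_sign_vote(local_signs, s, epsilon, rng=None):
    """Illustrative peeling + exponential mechanism over sign votes.

    local_signs: (m, p) array with entries in {-1, 0, +1}, one row per machine.
    s: number of coordinates to select (assumed known for this sketch).
    epsilon: total budget, split evenly over the s rounds (an assumption).
    Returns a length-p integer vector with entries in {-1, 0, +1}.
    """
    rng = np.random.default_rng() if rng is None else rng
    local_signs = np.asarray(local_signs)
    m, p = local_signs.shape

    # Utility of candidate (j, +1) or (j, -1): how many machines voted that way.
    # Replacing one machine's vector changes each count by at most 1.
    utilities = np.stack([
        (local_signs == 1).sum(axis=0),
        (local_signs == -1).sum(axis=0),
    ]).astype(float)                      # shape (2, p)

    eps_round = epsilon / s               # basic composition over s rounds
    available = np.ones(p, dtype=bool)
    result = np.zeros(p, dtype=int)

    for _ in range(min(s, p)):
        # Exponential mechanism with sensitivity 1: P(c) ∝ exp(eps * u(c) / 2).
        logits = eps_round * utilities / 2.0
        logits[:, ~available] = -np.inf   # already-peeled coordinates are excluded
        logits -= logits.max()            # stabilise the softmax
        probs = np.exp(logits).ravel()
        probs /= probs.sum()

        idx = rng.choice(2 * p, p=probs)
        sign_row, j = divmod(int(idx), p) # row 0 is "+1", row 1 is "-1"
        result[j] = 1 if sign_row == 0 else -1
        available[j] = False              # peel the chosen coordinate

    return result
```

With many machines agreeing on a few coordinates, the selected signs concentrate on the majority vote; with a small `epsilon` or weak agreement, the output becomes noticeably more random, which is the privacy and accuracy trade-off the abstract refers to.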

Paper Structure

This paper contains 24 sections, 17 theorems, 95 equations, 5 figures, and 4 algorithms.

Key Result

Lemma 1

For any algorithm $\mathcal{A}$ satisfying $\mathrm{GS}_{\mathcal{A}}<\infty$, $\mathcal{A}_1=\mathcal{A} + g$, where $g$ is sampled from $\mathrm{Lap}(\mathrm{GS}_{\mathcal{A}}/\epsilon)$, achieves $(\epsilon,0)$-differential privacy.
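
As an illustration of Lemma 1, the following Python sketch adds Laplace noise with scale $\mathrm{GS}_{\mathcal{A}}/\epsilon$ to the output of a query. The function name `laplace_mechanism` and its arguments are hypothetical, and the caller is assumed to supply a valid global sensitivity (measured in the $\ell_1$ norm for vector-valued outputs); the sketch does not verify that bound.

```python
import numpy as np

def laplace_mechanism(value, global_sensitivity, epsilon, rng=None):
    """Return value + g with g ~ Lap(global_sensitivity / epsilon).

    value: scalar or array output of the non-private algorithm.
    global_sensitivity: finite, non-negative bound supplied by the caller.
    epsilon: privacy parameter, must be positive.
    """
    if not np.isfinite(global_sensitivity) or global_sensitivity < 0:
        raise ValueError("global_sensitivity must be finite and non-negative")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")

    rng = np.random.default_rng() if rng is None else rng
    value = np.asarray(value, dtype=float)

    # Independent Laplace noise per coordinate, scale = GS / epsilon.
    scale = global_sensitivity / epsilon
    noise = rng.laplace(loc=0.0, scale=scale, size=value.shape)
    return value + noise

# Example: a counting query has global sensitivity 1, because changing
# one record changes the count by at most 1.
noisy_count = laplace_mechanism(42, global_sensitivity=1.0, epsilon=0.5)
```

A smaller `epsilon` gives a larger noise scale, so stronger privacy is paid for with a less accurate released value.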

Figures (5)

  • Figure 1: This figure visualizes the mechanism of the majority vote approach. Denote $S^+$, $S^-$ and $S^c$ as the sets of positives, negatives, and zeros of the true parameter $\boldsymbol{\theta}^*$, respectively. The black dots and white dots in each column represent the estimated positive and negative locations on each local machine. By aggregating these local sign vectors with the proposed method, we can recover the true signs with high probability.
  • Figure 2: The FDR and Power over the number of machines for sparse mean estimation. The number of machines varies from $500$ to $1500$, the local sample size is $500$, and the dimension $p$ is $500$.
  • Figure 3: The FDR and Power over the privacy level $\epsilon$ for sparse mean estimation. The local sample size is $500$, the number of machines is $800$, and the dimension $p$ is $500$.
  • Figure 4: The FDR and Power over the number of machines for sparse linear regression. The number of machines varies from $500$ to $1500$, the local sample size is $500$, and the dimension $p$ is $500$.
  • Figure 5: The FDR and Power over the privacy level $\epsilon$ for sparse linear regression. The local sample size is $500$, the number of machines is $800$, and the dimension $p$ is $500$.

Theorems & Definitions (30)

  • Definition 1
  • Definition 2: Global Sensitivity
  • Lemma 1: The Laplace mechanism, Theorem 3.6 of Dwork and Roth (2014)
  • Lemma 2: Composition theorem, Theorem B.1 of Dwork and Roth (2014)
  • Lemma 3: Advanced composition theorem, Corollary 3.21 of Dwork and Roth (2014)
  • Lemma 4
  • Theorem 1: Differential privacy of DPVote
  • Theorem 2: Sign consistency of DPVote
  • Theorem 3
  • Theorem 4
  • ...and 20 more