Distributed quasi-Newton robust estimation under differential privacy

Chuhan Wang, Lixing Zhu, Xuehu Zhu

TL;DR

The paper addresses robust, privacy-preserving distributed M-estimation in the presence of Byzantine nodes. It introduces a distributed quasi-Newton scheme that reduces both communication and privacy budget: each node machine transmits only five vectors to the central processor, whereas gradient descent needs more rounds of transmission and Newton iteration needs the entire Hessian matrix. It uses a Distributed Composite Quantile (DCQ) estimator to maintain high statistical efficiency under differential privacy (DP), and it replaces the boundedness requirement on gradients and Hessians with sub-Gaussian/sub-exponential tail assumptions. The proposed three-layer approach (a robust initial estimator, a one-stage update, and a second quasi-Newton refinement) achieves near-optimal convergence and asymptotic normality, with explicit DP guarantees via Gaussian mechanisms and composition. Numerical experiments on synthetic data and MNIST show resilience to Byzantine behavior under privacy constraints. The method is thus a lower-communication, privacy-preserving alternative to gradient-descent or full-Hessian methods in distributed, potentially adversarial environments.

Abstract

For distributed computing with Byzantine machines under Privacy Protection (PP) constraints, this paper develops a robust PP distributed quasi-Newton estimation, which only requires the node machines to transmit five vectors to the central processor with high asymptotic relative efficiency. Compared with the gradient descent strategy which requires more rounds of transmission and the Newton iteration strategy which requires the entire Hessian matrix to be transmitted, the novel quasi-Newton iteration has advantages in reducing privacy budgeting and transmission cost. Moreover, our PP algorithm does not depend on the boundedness of gradients and second-order derivatives. When gradients and second-order derivatives follow sub-exponential distributions, we offer a mechanism that can ensure PP with a sufficiently high probability. Furthermore, this novel estimator can achieve the optimal convergence rate and the asymptotic normality. The numerical studies on synthetic and real data sets evaluate the performance of the proposed algorithm.
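To illustrate why robust aggregation matters when some machines are Byzantine, the sketch below compares a plain average with a coordinate-wise median of per-machine estimates. This is a deliberately simpler stand-in, not the paper's DCQ estimator; the function name `coordinatewise_median`, the simulated data, and the 10% corruption pattern are all assumptions made for illustration.

```python
import numpy as np


def coordinatewise_median(local_estimates):
    """Aggregate an (m, p) array of per-machine estimates by the
    median of each coordinate (illustrative stand-in, not DCQ)."""
    local_estimates = np.asarray(local_estimates, dtype=float)
    if local_estimates.ndim != 2:
        raise ValueError("expected an (m, p) array of local estimates")
    return np.median(local_estimates, axis=0)


# Hypothetical setup: m machines each report a noisy p-dimensional estimate.
rng = np.random.default_rng(0)
m, p = 100, 3
theta = np.array([1.0, -2.0, 0.5])            # assumed true parameter
reports = theta + 0.1 * rng.standard_normal((m, p))

# 10% of the machines are Byzantine and send arbitrary values.
reports[:10] = 1e6

mean_est = reports.mean(axis=0)               # pulled far away by the bad machines
median_est = coordinatewise_median(reports)   # stays close to theta

print("mean error:  ", np.max(np.abs(mean_est - theta)))
print("median error:", np.max(np.abs(median_est - theta)))
```

The median tolerates a minority of corrupted machines but is less statistically efficient than the mean on clean data, which is the efficiency gap that the abstract's "high asymptotic relative efficiency" claim refers to.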

Paper Structure

This paper contains 20 sections, 13 theorems, 48 equations, 6 figures, 1 table, 1 algorithm.

Key Result

Lemma 2.1

(Dwork and Roth, 2014) Given a function $\mathcal{M}:\mathcal{X}^n\to\mathbb{R}^p$ with the $\ell_2$-sensitivity $\Delta$ and a dataset $\mathbf{X}\subset\mathcal{X}^n$, assume $\sigma\geq\frac{\sqrt{2\log(1.25/\delta)}\Delta}{\varepsilon}$. The following Gaussian mechanism yields $(\varepsilon,\delta)$-DP: $$\mathcal{M}(\mathbf{X})+\mathbf{b},\qquad \mathbf{b}\sim N(\mathbf{0},\sigma^2\mathbf{I}_p),$$ where $\mathbf{I}_p$ is a $p\times p$ identity matrix.
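A minimal sketch of how the noise scale in Lemma 2.1 could be applied in practice, assuming the standard form of the Gaussian mechanism (independent $N(0,\sigma^2)$ noise added to each of the $p$ coordinates). The helper names `gaussian_sigma` and `gaussian_mechanism` and the example numbers are hypothetical and not taken from the paper.

```python
import math

import numpy as np


def gaussian_sigma(sensitivity, epsilon, delta):
    """Smallest noise scale permitted by Lemma 2.1:
    sigma = sqrt(2 * log(1.25 / delta)) * Delta / epsilon."""
    if sensitivity < 0 or epsilon <= 0 or not (0 < delta < 1):
        raise ValueError("need sensitivity >= 0, epsilon > 0, 0 < delta < 1")
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon


def gaussian_mechanism(value, sensitivity, epsilon, delta, rng=None):
    """Return value + b with b ~ N(0, sigma^2 I_p), using the sigma above."""
    rng = np.random.default_rng() if rng is None else rng
    value = np.asarray(value, dtype=float)
    sigma = gaussian_sigma(sensitivity, epsilon, delta)
    return value + rng.normal(loc=0.0, scale=sigma, size=value.shape)


# Hypothetical usage: privatize a p = 10 statistic before sending it
# from a node machine to the central processor.
statistic = np.zeros(10)
noisy = gaussian_mechanism(
    statistic,
    sensitivity=0.01,   # assumed l2-sensitivity of the statistic
    epsilon=4.0,
    delta=0.05,
    rng=np.random.default_rng(0),
)
print(noisy)
```

Because $\sigma$ scales with $\Delta/\varepsilon$, every additional quantity a node transmits consumes part of the privacy budget, which is why transmitting a few vectors costs less than transmitting a full Hessian matrix.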

Figures (6)

  • Figure 1: Logistic regression: $\varepsilon$ varies from 4 to 50, $\delta=0.05$, $p=10$, $m=500$ or $1000$, $n=4000$ or $2000$, total sample size $N=2000000$, Byzantine machine proportion $\alpha=0$ or $10\%$.
  • Figure 2: Logistic regression: $\varepsilon$ varies from 4 to 50, $\delta=0.05$, $p=20$, $m=500$ or $1000$, $n=4000$ or $2000$, total sample size $N=2000000$, Byzantine machine proportion $\alpha=0$ or $10\%$.
  • Figure 3: Logistic regression: $m$ varies from 500 to 5000, $p=10$ or $20$, $n=1000$, Byzantine machine proportion $\alpha=0$ or $10\%$, $\varepsilon=30$, $\delta=0.05$.
  • Figure 4: Poisson regression: $\varepsilon$ varies from 4 to 50, $\delta=0.05$, $p=10$, $m=500$ or $1000$, $n=4000$ or $2000$, total sample size $N=2000000$, Byzantine machine proportion $\alpha=0$ or $10\%$.
  • Figure 5: Poisson regression: $\varepsilon$ varies from 4 to 50, $\delta=0.05$, $p=20$, $m=500$ or $1000$, $n=4000$ or $2000$, total sample size $N=2000000$, Byzantine machine proportion $\alpha=0$ or $10\%$.
  • ...and 1 more figure

Theorems & Definitions (21)

  • Lemma 2.1
  • Theorem 3.1
  • Remark 3.1
  • Lemma 4.1
  • Lemma 4.2
  • Theorem 4.1
  • Remark 4.1
  • Theorem 4.2
  • Remark 4.2
  • Theorem 4.3
  • ...and 11 more