Differentially Private Distributed Stochastic Optimization with Time-Varying Sample Sizes
Jimin Wang, Ji-Feng Zhang
TL;DR
This paper tackles privacy leakage in distributed stochastic optimization by designing differentially private schemes with time-varying sample sizes, built on two-time-scale stochastic approximation. It introduces output-perturbation and gradient-perturbation methods that achieve $\varepsilon$-differential privacy with a finite cumulative privacy budget over an infinite number of iterations, while preserving almost-sure and mean-square convergence. The analysis uses Lyapunov function techniques to characterize mean-square convergence rates and shows how the added privacy noise affects them. Numerical experiments on distributed MNIST CNN training and parameter estimation illustrate practical privacy-utility trade-offs and confirm convergence under realistic privacy constraints.
Abstract
Differentially private distributed stochastic optimization has attracted growing attention because of the urgent need for privacy protection in distributed settings. In this paper, two-time-scale stochastic-approximation-type algorithms for differentially private distributed stochastic optimization with time-varying sample sizes are proposed using gradient- and output-perturbation methods. For both the gradient- and output-perturbation cases, convergence of the algorithm and differential privacy with a finite cumulative privacy budget $\varepsilon$ over an infinite number of iterations are established simultaneously, which is substantially different from existing works. Using time-varying sample sizes enhances the privacy level and yields the finite cumulative privacy budget. By properly choosing a Lyapunov function, almost-sure and mean-square convergence of the algorithm are established even when the added privacy noise has an increasing variance. Furthermore, we rigorously provide the mean-square convergence rates of the algorithm and show how the added privacy noise affects them. Finally, numerical examples, including distributed training on a benchmark machine learning dataset, are presented to demonstrate the efficiency and advantages of the algorithms.
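To illustrate why a growing sample size can keep the cumulative privacy budget finite over infinitely many iterations, the following is a minimal sketch for the gradient-perturbation case. It is not the paper's algorithm: it assumes a Laplace mechanism, per-sample gradients bounded by a hypothetical constant `clip` in the 1-norm (so that replacing one sample changes the averaged gradient by at most `2 * clip / sample_size`), and hypothetical polynomial schedules with exponents `p` (sample size) and `q` (noise scale). Under these assumptions, the per-iteration budget decays like $k^{-(p+q)}$, and by sequential composition the total budget is a convergent series whenever $p + q > 1$.

```python
import math


def per_iteration_epsilon(k, clip=1.0, p=1.5, q=0.1):
    """Privacy budget spent at iteration k (k >= 1) under the Laplace mechanism.

    All parameters are illustrative assumptions, not values from the paper.
    """
    sample_size = math.ceil(k ** p)          # time-varying sample size, growing in k
    noise_scale = k ** q                     # Laplace scale, non-decreasing in k
    sensitivity = 2.0 * clip / sample_size   # assumed 1-norm sensitivity of the averaged gradient
    return sensitivity / noise_scale         # Laplace mechanism: epsilon = sensitivity / scale


def cumulative_epsilon(num_iters, **schedule):
    """Total budget after num_iters iterations (sequential composition)."""
    return sum(per_iteration_epsilon(k, **schedule) for k in range(1, num_iters + 1))


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        growing = cumulative_epsilon(n)            # p + q = 1.6 > 1: partial sums level off
        fixed = cumulative_epsilon(n, p=0, q=0)    # one sample, constant noise: grows linearly
        print(f"n={n:>7}  growing-sample budget={growing:.4f}  fixed-sample budget={fixed:.1f}")
```

With a fixed sample size and constant noise, the budget grows linearly in the number of iterations, whereas with the growing schedule the partial sums stay below the integral bound $2\,\mathrm{clip}\,\bigl(1 + \tfrac{1}{p+q-1}\bigr)$. The noise scale is allowed to increase here only to mirror the setting of increasing noise variance mentioned in the abstract; the convergence side of the trade-off is not modeled in this sketch.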
