Linear-Time User-Level DP-SCO via Robust Statistics

Badih Ghazi, Ravi Kumar, Daogao Liu, Pasin Manurangsi

TL;DR

This work tackles user-level DP-SCO by introducing a linear-time algorithm that uses robust statistics (median and trimmed mean) to bound the sensitivity of SGD iterates, thereby reducing private gradient noise. A coordinate-wise, debiased robust mean estimator is embedded into SGD, and a smoothed concentration test within a localization framework yields privacy guarantees and improved utility. The authors prove a near-optimal utility bound of $\tilde{O}\left(\frac{d}{\sqrt{nm}}+\frac{d^{3/2}}{n\sqrt{m}\varepsilon^2}\right)$ and provide a matching lower bound up to logarithmic factors and $\varepsilon$-dependence, under the assumption of diagonally dominant Hessians and $\ell_\infty$-bounded domain. This work advances privacy-preserving optimization by leveraging robust statistics to achieve stable, scalable user-level DP-SCO with improved privacy-utility trade-offs, and lays groundwork for future extensions beyond the current assumptions.

Abstract

User-level differentially private stochastic convex optimization (DP-SCO) has garnered significant attention due to the paramount importance of safeguarding user privacy in modern large-scale machine learning applications. Current methods, such as those based on differentially private stochastic gradient descent (DP-SGD), often struggle with high noise accumulation and suboptimal utility due to the need to privatize every intermediate iterate. In this work, we introduce a novel linear-time algorithm that leverages robust statistics, specifically the median and trimmed mean, to overcome these challenges. Our approach uniquely bounds the sensitivity of all intermediate iterates of SGD with gradient estimation based on robust statistics, thereby significantly reducing the gradient estimation noise for privacy purposes and enhancing the privacy-utility trade-off. By sidestepping the repeated privatization required by previous methods, our algorithm not only achieves an improved theoretical privacy-utility trade-off but also maintains computational efficiency. We complement our algorithm with an information-theoretic lower bound, showing that our upper bound is optimal up to logarithmic factors and the dependence on $\varepsilon$. This work sets the stage for more robust and efficient privacy-preserving techniques in machine learning, with implications for future research and application in the field.
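As a rough illustration of the two robust statistics named above, the following sketch computes a coordinate-wise median and a coordinate-wise trimmed mean over per-user gradient vectors. It is not the paper's estimator: it omits the debiasing step, the concentration test, and all privacy noise. The function names, the `trim` parameter, and the toy data are assumptions made for this example only.

```python
from statistics import median
from typing import List, Sequence


def coordinate_median(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Coordinate-wise median of equal-length vectors."""
    return [median(column) for column in zip(*vectors)]


def coordinate_trimmed_mean(
    vectors: Sequence[Sequence[float]], trim: int
) -> List[float]:
    """Coordinate-wise mean after dropping the `trim` smallest and
    `trim` largest values in each coordinate."""
    if 2 * trim >= len(vectors):
        raise ValueError("trim would remove every value")
    result = []
    for column in zip(*vectors):
        kept = sorted(column)[trim:len(column) - trim]
        result.append(sum(kept) / len(kept))
    return result


# Hypothetical per-user average gradients: three similar users, one outlier.
user_gradients = [
    [1.0, -2.0],
    [1.2, -1.8],
    [0.8, -2.2],
    [100.0, 50.0],
]

# The plain mean is dragged far away by the single outlying user.
plain_mean = [sum(col) / len(col) for col in zip(*user_gradients)]

# Both robust aggregates stay close to the three similar users.
robust_median = coordinate_median(user_gradients)
robust_trimmed = coordinate_trimmed_mean(user_gradients, trim=1)
```

In this toy setting a single user's data moves the plain mean a long way but barely moves either robust aggregate, which gives an intuition for why robust gradient estimation can help keep the sensitivity of the SGD iterates small.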

Paper Structure

This paper contains 41 sections, 31 theorems, 77 equations.

Key Result

Theorem 3.1

Under the Lipschitz-and-smoothness assumption and the diagonal dominance assumption (cf. Definitions 2.1 to 2.3), suppose $\beta\le\frac{G}{D}\left(\frac{\sqrt{n}\varepsilon}{\sqrt{m}\log(nmd/\delta)}+\frac{\sqrt{d\log(1/\delta)\log(nmd)}}{\sqrt{m}\varepsilon}\right)$, $\varepsilon\le O(1)$, $n\ge \log^2(nd/\delta)/\varepsilon$, and $m\le n^{O(\log\log n)}$. Se

Theorems & Definitions (51)

  • Definition 2.1: Lipschitz
  • Definition 2.2: Smooth
  • Definition 2.3: Diagonal Dominance
  • Theorem 3.1
  • Lemma 3.1
  • Remark 3.3
  • Lemma 3.3
  • Lemma 3.3
  • Lemma 3.3: Iteration Sensitivity
  • Lemma 3.3
  • ...and 41 more