Private Stochastic Convex Optimization with Heavy Tails: Near-Optimality from Simple Reductions

Hilal Asi; Daogao Liu; Kevin Tian

TL;DR

This work addresses DP-SCO under heavy-tailed gradient distributions by replacing uniform Lipschitz assumptions with $k$-th moment bounds and introducing a reduction-based, localization-driven framework. The authors develop a DP-ERM solver based on clipped DP-SGD, establish population-level localization, and compose the two to obtain private SCO rates that match known lower bounds up to polylogarithmic factors, with improved results when the Lipschitz constants are known and for smooth objectives. They also present a near-linear-time algorithm for smooth functions, using stability analyses and the sparse vector technique to ensure privacy. A specialization to smooth generalized linear models yields optimal rates with linear gradient-query complexity. Overall, the paper combines reductions, localization, and efficient private SGD-style methods to achieve near-optimal guarantees for private optimization under heavy tails.
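To make the clipped DP-SGD building block mentioned above concrete, here is a minimal sketch of one clipped, noised gradient step in plain Python. It is an illustration under stated assumptions, not the paper's algorithm: the function names, the per-coordinate Gaussian noise, and the parameters `clip_norm` and `noise_std` are all hypothetical, and `noise_std` is not calibrated here to any $(ε, δ)$ privacy budget.

```python
import math
import random


def clip(vec, clip_norm):
    """Rescale vec so that its Euclidean norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm <= clip_norm or norm == 0.0:
        return list(vec)
    scale = clip_norm / norm
    return [x * scale for x in vec]


def clipped_noisy_mean(per_sample_grads, clip_norm, noise_std, rng):
    """Average the clipped per-sample gradients, then add Gaussian noise.

    Assumption: noise_std is supplied by the caller; a private algorithm
    would calibrate it to clip_norm, the batch size, and the privacy budget.
    """
    n = len(per_sample_grads)
    d = len(per_sample_grads[0])
    clipped = [clip(g, clip_norm) for g in per_sample_grads]
    mean = [sum(g[j] for g in clipped) / n for j in range(d)]
    return [m + rng.gauss(0.0, noise_std) for m in mean]


def sgd_step(x, per_sample_grads, step_size, clip_norm, noise_std, rng):
    """One gradient step using the clipped, noised gradient estimate."""
    g = clipped_noisy_mean(per_sample_grads, clip_norm, noise_std, rng)
    return [xi - step_size * gi for xi, gi in zip(x, g)]
```

Clipping bounds the influence of any single sample even when individual gradients are heavy-tailed, which is what allows a finite noise scale to mask one sample's contribution; the price is a bias that the moment bounds are used to control.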

Abstract

We study the problem of differentially private stochastic convex optimization (DP-SCO) with heavy-tailed gradients, where we assume a $k^{\text{th}}$-moment bound on the Lipschitz constants of sample functions rather than a uniform bound. We propose a new reduction-based approach that enables us to obtain the first optimal rates (up to logarithmic factors) in the heavy-tailed setting, achieving error $G_2 \cdot \frac 1 {\sqrt n} + G_k \cdot (\frac{\sqrt d}{nε})^{1 - \frac 1 k}$ under $(ε, δ)$-approximate differential privacy, up to a mild $\textup{polylog}(\frac{1}δ)$ factor, where $G_2^2$ and $G_k^k$ are the $2^{\text{nd}}$ and $k^{\text{th}}$ moment bounds on sample Lipschitz constants, nearly-matching a lower bound of [Lowy and Razaviyayn 2023]. We further give a suite of private algorithms in the heavy-tailed setting which improve upon our basic result under additional assumptions, including an optimal algorithm under a known-Lipschitz constant assumption, a near-linear time algorithm for smooth functions, and an optimal linear time algorithm for smooth generalized linear models.
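As a quick way to read the error bound stated in the abstract, the sketch below evaluates $G_2 \cdot \frac{1}{\sqrt n} + G_k \cdot (\frac{\sqrt d}{nε})^{1 - \frac 1 k}$ for given inputs. The function name and example values are hypothetical, and the polylogarithmic factors and constants are ignored, so the output shows only how the two terms scale.

```python
import math


def heavy_tailed_rate(n, d, eps, g2, gk, k):
    """Evaluate the abstract's rate, ignoring constants and polylog factors.

    n: number of samples, d: dimension, eps: privacy parameter,
    g2, gk: 2nd- and k-th-moment parameters G_2 and G_k, k: moment order.
    """
    statistical = g2 / math.sqrt(n)
    private = gk * (math.sqrt(d) / (n * eps)) ** (1.0 - 1.0 / k)
    return statistical + private


# Example with hypothetical values: lighter tails (larger k) give a smaller
# privacy term whenever sqrt(d) / (n * eps) < 1.
rate_k2 = heavy_tailed_rate(n=10_000, d=100, eps=1.0, g2=1.0, gk=1.0, k=2)
rate_k10 = heavy_tailed_rate(n=10_000, d=100, eps=1.0, g2=1.0, gk=1.0, k=10)
```

As $k$ grows, the exponent $1 - \frac 1 k$ approaches 1 and the privacy term approaches the $\frac{\sqrt d}{nε}$ scaling familiar from the uniformly Lipschitz setting.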

Paper Structure (27 sections, 31 theorems, 126 equations)

Introduction
Our contributions
Near-optimal rates for heavy-tailed DP-SCO (Section \ref{sec:lose_log}).
Optimal rates with known Lipschitz constants (Section \ref{sec:lip_known}).
Efficient algorithms for smooth functions (Sections \ref{sec:smooth} and \ref{sec:smooth_glm}).
Prior work
Preliminaries
General notation.
Differential privacy.
Private SCO.
Heavy-Tailed Private SCO
Strongly convex DP-ERM solver
Localizing regularized population loss minimizers
Population-level localization
Strongly convex heavy-tailed private SCO via localization
...and 12 more sections

Key Result

Lemma 1

RDP has the following properties.

Theorems & Definitions (69)

Definition 1: Differential privacy
Definition 2: Rényi DP
Definition 3: CDP
Lemma 1: Mironov17
Definition 4: $k$-heavy-tailed private SCO
Lemma 2
proof
Proposition 1
proof
Lemma 3
...and 59 more
