Online Distribution Learning with Local Private Constraints

Jin Sima, Changlong Wu, Olgica Milenkovic, Wojciech Szpankowski

TL;DR

The paper analyzes online conditional distribution learning under local differential privacy with unbounded label sets, formalizing a minimax KL-risk objective. It proves a fundamental lower bound of $\Omega\left(\frac{1}{\epsilon}\sqrt{KT}\right)$ and presents near-matching upper bounds using an EXP3-inspired privatization scheme with log-likelihood perturbations, clipping, and a single-coordinate noise reduction, plus a pure-DP variant achieving similar rates. The approach bridges online learning with private probability estimation and shows how KL-risk translates to averaged TV-risk via Pinsker’s inequality, recovering batch results in the non-interactive setting. Together, these results illuminate the limits and design of privacy-preserving online distribution learning when the label alphabet is unbounded and highlight techniques that tame log-likelihood sensitivity under local privacy constraints.
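The step from KL-risk to averaged TV-risk can be sketched as follows. The notation $p_t = f(\boldsymbol{x}_t)$ for the true conditional distribution and $\hat{p}_t$ for the learner's estimate at time $t$ is assumed here for illustration, and expectations over the learner's randomness are suppressed. This is a sketch of the standard argument, not a reproduction of the paper's proof.

```latex
% Pinsker's inequality, applied at each round t:
\mathrm{TV}(p_t, \hat{p}_t) \le \sqrt{\tfrac{1}{2}\,\mathrm{KL}(p_t \,\|\, \hat{p}_t)}

% Averaging over t and using concavity of the square root (Jensen):
\frac{1}{T}\sum_{t=1}^{T} \mathrm{TV}(p_t, \hat{p}_t)
  \le \sqrt{\frac{1}{2T}\sum_{t=1}^{T} \mathrm{KL}(p_t \,\|\, \hat{p}_t)}

% Plugging in a cumulative KL-risk of order \frac{1}{\epsilon}\sqrt{KT}
% (ignoring constants and poly-logarithmic factors):
\frac{1}{T}\sum_{t=1}^{T} \mathrm{TV}(p_t, \hat{p}_t)
  \lesssim \sqrt{\frac{\sqrt{KT}}{\epsilon T}}
  = \frac{1}{\sqrt{\epsilon}}\left(\frac{K}{T}\right)^{1/4}
```

Under these assumptions the averaged TV-risk vanishes as $T$ grows for fixed $K$ and $\epsilon$.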

Abstract

We study the problem of online conditional distribution estimation with \emph{unbounded} label sets under local differential privacy. Let $\mathcal{F}$ be a distribution-valued function class with an unbounded label set. We aim to estimate an \emph{unknown} function $f\in \mathcal{F}$ in an online fashion, so that at time $t$, when the context $\boldsymbol{x}_t$ is provided, we can generate an estimate of $f(\boldsymbol{x}_t)$ under KL-divergence, knowing only a privatized version of the true labels sampled from $f(\boldsymbol{x}_t)$. The ultimate objective is to minimize the cumulative KL-risk over a finite horizon $T$. We show that under $(\epsilon,0)$-local differential privacy of the privatized labels, the KL-risk grows as $\tilde{\Theta}(\frac{1}{\epsilon}\sqrt{KT})$ up to poly-logarithmic factors, where $K=|\mathcal{F}|$. This is in stark contrast to the $\tilde{\Theta}(\sqrt{T\log K})$ bound demonstrated by Wu et al. (2023a) for bounded label sets. As a byproduct, our results recover a nearly tight upper bound for the hypothesis selection problem of Gopi et al. (2020), established only for the batch setting.
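To make "a privatized version of the true labels" concrete, here is a minimal sketch of one generic way to release log-likelihood information under $(\epsilon,0)$-local differential privacy: clip each of the $K$ log-likelihoods and add Laplace noise. This is the textbook Laplace mechanism, not the paper's algorithm; the function name `privatize_log_likelihoods` and the parameter `clip_bound` are hypothetical, and the noise scale here grows linearly in $K$, which a more refined scheme would presumably aim to reduce.

```python
import numpy as np

def privatize_log_likelihoods(log_liks, epsilon, clip_bound, rng):
    """Release a K-vector of log-likelihoods via the Laplace mechanism.

    Illustrative only: a naive baseline, not the paper's privatization scheme.
    """
    log_liks = np.asarray(log_liks, dtype=float)
    # Clip every coordinate into [-clip_bound, 0]; -inf (zero probability)
    # is mapped to -clip_bound, so the released vector is always bounded.
    clipped = np.clip(log_liks, -clip_bound, 0.0)
    # Two different labels can change each coordinate by at most clip_bound,
    # so the L1 sensitivity of the clipped vector is at most K * clip_bound.
    sensitivity = clipped.size * clip_bound
    # Laplace noise with scale sensitivity / epsilon gives epsilon-DP
    # with respect to the label held by the user.
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon, size=clipped.shape)
    return clipped + noise

# Usage: three hypothetical hypotheses assign these probabilities to the label.
rng = np.random.default_rng(0)
probs = np.array([0.5, 0.01, 0.0])
with np.errstate(divide="ignore"):
    log_liks = np.log(probs)
private = privatize_log_likelihoods(log_liks, epsilon=1.0, clip_bound=3.0, rng=rng)
```

Clipping is what keeps the sensitivity finite when the label set is unbounded and some hypotheses assign near-zero probability to the observed label.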

Paper Structure (18 sections, 14 theorems, 62 equations, 2 algorithms)

Key Result

Theorem 1

There exists a finite class $\mathcal{F}$ of size $K$ with $|\mathcal{Y}|\le K$ such that, for any $(\epsilon,0)$-locally differentially private mechanism and any learning rule, the KL-risk is lower bounded by $\Omega(\frac{1}{\epsilon}\sqrt{KT})$.
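For reference, the standard definition of $(\epsilon,0)$-local differential privacy is recalled here. The symbols $\mathcal{R}$ (the randomized mechanism) and $\tilde{\mathcal{Y}}$ (its output space) are notation assumed for this sketch; the paper's own definition may be phrased differently.

```latex
% A randomized mechanism \mathcal{R} : \mathcal{Y} \to \tilde{\mathcal{Y}} is
% (\epsilon, 0)-locally differentially private if, for all labels
% y, y' \in \mathcal{Y} and all measurable sets S \subseteq \tilde{\mathcal{Y}},
\Pr[\mathcal{R}(y) \in S] \le e^{\epsilon}\, \Pr[\mathcal{R}(y') \in S]
```

The constraint holds for every pair of inputs, so the learner sees outputs whose distribution changes by at most a factor $e^{\epsilon}$ whichever label the user holds.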

Theorems & Definitions (23)

  • Theorem 1: Lower Bound
  • Theorem 2: Upper Bound
  • Definition 1
  • Theorem 3
  • Proof: Sketch of Proof
  • Lemma 1 (Shalev-Shwartz and Ben-David, 2014)
  • Lemma 2 (Steinke, 2022)
  • Lemma 3
  • Proof
  • Lemma 4
  • ...and 13 more