The Role of Randomness in Stability

Max Hopkins, Shay Moran

TL;DR

This work characterizes how much randomness is necessary to achieve stability in learning, tying replicability and differential privacy (DP) to a task's intrinsic stability via a weak-to-strong boosting principle. It shows that the randomness complexity of replicability and DP is controlled by the best replication probability of any deterministic algorithm for the task (global stability), and that PAC learning a class of finite Littlestone dimension has bounded randomness cost, scaling at worst logarithmically in the excess error. The results relate global stability, certificate complexity, and DP complexity tightly, and they resolve an open question by establishing an error-dependent bound for agnostic PAC learning governed by the Littlestone dimension. These insights can guide the design of randomness-efficient, stable learning algorithms and clarify the limits of DP and replicability in complex tasks.

Abstract

Stability is a central property in learning and statistics, promising that the output of an algorithm $A$ does not change substantially when applied to similar datasets $S$ and $S'$. It is an elementary fact that any sufficiently stable algorithm (e.g., one returning the same result with high probability, satisfying privacy guarantees, etc.) must be randomized. This raises a natural question: can we quantify how much randomness is needed for algorithmic stability? We study the randomness complexity of two influential notions of stability in learning: replicability, which promises that $A$ usually outputs the same result when run over samples from the same distribution (and shared random coins), and differential privacy, which promises that the output distribution of $A$ remains similar under neighboring datasets. The randomness complexity of these notions was studied recently in (Dixon et al. ICML 2024) and (Cannone et al. ITCS 2024) for basic $d$-dimensional tasks (e.g., estimating the bias of $d$ coins), but little is known about the measures more generally or in complex settings like classification. Toward this end, we prove a 'weak-to-strong' boosting theorem for stability: the randomness complexity of a task $M$ (either under replicability or DP) is tightly controlled by the best replication probability of any deterministic algorithm solving the task, a weak measure called 'global stability' that is universally capped at $\frac{1}{2}$ (Chase et al. FOCS 2023). Using this, we characterize the randomness complexity of PAC Learning: a class has bounded randomness complexity iff it has finite Littlestone dimension, and moreover this complexity scales at worst logarithmically in the excess error of the learner. This resolves a question of (Chase et al. STOC 2024), who asked for such a characterization in the equivalent language of (error-dependent) 'list-replicability'.

Paper Structure

This paper contains 24 sections, 17 theorems, 45 equations, 1 figure, 1 algorithm.

Key Result

Theorem 1.1

Let $\mathcal{M}$ be any statistical task. Then: Moreover, the number of random bits required to achieve $\rho$-replicability is at most $\lceil C_{\text{Glob}}+\log(1/\rho) \rceil$.
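
As a rough illustration of the bound in the last sentence, the sketch below evaluates $\lceil C_{\text{Glob}}+\log(1/\rho) \rceil$ for given inputs. The function name `replicability_random_bits` is hypothetical, and the base-2 logarithm is an assumption (natural when counting random bits); the meaning of $\rho$ is taken as given by the paper's definition of replicability and is not interpreted here.

```python
import math


def replicability_random_bits(c_glob: float, rho: float) -> int:
    """Evaluate ceil(C_Glob + log(1/rho)) as written in the theorem statement.

    Assumption: the logarithm is base 2, since the quantity counts random bits.
    """
    if not 0.0 < rho <= 1.0:
        raise ValueError("rho must lie in (0, 1]")
    return math.ceil(c_glob + math.log2(1.0 / rho))


if __name__ == "__main__":
    # Tabulate the bound for a task with C_Glob = 2 at a few values of rho.
    for rho in (0.5, 0.25, 0.125):
        print(rho, replicability_random_bits(2, rho))
```

With $C_{\text{Glob}}=2$ and $\rho=\frac{1}{2}$ the expression evaluates to $3$, the same number that appears as the bound $C_{\text{Rep}} \leq 3$ in the caption of Figure 1.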

Figures (1)

  • Figure 1: Thresholding procedure for $C_{\text{Glob}}=2$ and $T=7$. Blue dots denote the $4$ heavy hitters, one of which, $p(y_1)$, is known to be far from every threshold. This leaves $4$ (green) thresholds out of $7$ with no nearby heavy hitters, so $\rho \approx 4/7 > \frac{1}{2}$, and $C_{\text{Rep}} \leq 3$.
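
The counting in the Figure 1 caption can be reproduced with a few lines of arithmetic. This is only a sketch under stated assumptions, not the paper's algorithm: it assumes there are $2^{C_{\text{Glob}}}$ heavy hitters (matching the $4$ in the caption), that each heavy hitter other than $p(y_1)$ is near at most one threshold, and that a threshold is picked uniformly at random using $\lceil \log_2 T \rceil$ bits. The function name `thresholding_counts` is hypothetical.

```python
import math
from fractions import Fraction


def thresholding_counts(c_glob: int, num_thresholds: int):
    """Count thresholds with no nearby heavy hitter, as in the Figure 1 caption.

    Assumptions (not taken from the source):
      * there are 2**c_glob heavy hitters;
      * one of them is far from every threshold;
      * each remaining heavy hitter is near at most one threshold.
    """
    heavy_hitters = 2 ** c_glob
    spoilers = heavy_hitters - 1           # heavy hitters that may sit near a threshold
    good = num_thresholds - spoilers       # thresholds guaranteed to be clear
    rho = Fraction(good, num_thresholds)   # fraction of clear thresholds
    bits = math.ceil(math.log2(num_thresholds))  # bits to pick one threshold uniformly
    return good, rho, bits


if __name__ == "__main__":
    good, rho, bits = thresholding_counts(c_glob=2, num_thresholds=7)
    print(good, rho, bits)
```

For $C_{\text{Glob}}=2$ and $T=7$ this gives $4$ clear thresholds, a fraction of $4/7$, and $3$ bits, in line with the numbers quoted in the caption.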

Theorems & Definitions (39)

  • Theorem 1.1: Stability vs Replicability
  • Theorem 1.2: Stability vs DP (Informal)
  • Theorem 1.3: Stability vs User-Level DP (Informal)
  • Theorem 1.4: The Certificate Complexity of Agnostic Learning
  • Definition 2.1: Statistical tasks
  • Definition 2.2: Global stability
  • Definition 2.3: Replicability
  • Definition 2.4: Certificate Complexity
  • Definition 2.5: Differential Privacy (dwork2006calibrating, dwork2006our)
  • Definition 2.6: Parametrized DP Complexity
  • ...and 29 more